Proper RLHF surely boosts "predicted next token until it couldn't" to feel more like "actually recalled".