We present a model of adaptive economic agents who look k periods ahead. Agents in our model are randomly matched to interact in finitely repeated games. They form beliefs by learning from the past behavior of others and then best respond to these beliefs, looking k periods ahead. We establish almost sure convergence of our stochastic process and characterize its absorbing sets. These can be very different from the predictions of both the fully rational model and the adaptive but myopic case. In particular, we find that non-Nash outcomes can also be sustained whenever they satisfy a “local” efficiency condition. We then characterize stochastically stable states in a class of 2 × 2 games and show that, under certain conditions, the efficient action in prisoner's dilemma games and coordination games can be singled out as uniquely stochastically stable. We show that our results are consistent with typical patterns observed in experiments on finitely repeated prisoner's dilemma games and, in particular, can explain what are commonly called the “endgame effect” and the “restart effect”. Finally, if populations are composed of some myopic and some forward-looking agents, parameter constellations exist such that either type may obtain higher average payoffs.