Abstract
A single decision maker repeatedly chooses one of the available actions, each time selecting the action with the highest weighted average of its past payoffs. In the long run, she chooses either the action with the highest expected payoff or the action with the highest minimal payoff, depending on how the weights evolve.
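
The choice rule described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the weights are formed, so the recursive update below (newest payoff weighted by `weight`, previous average by `1 - weight`), the initial averages, the tie-breaking rule, and the names `update`, `simulate`, `payoff_draws`, and `weight_fn` are all assumptions made for illustration, not the paper's specification.

```python
import random


def update(average, payoff, weight):
    """Blend the newest payoff into the running weighted average (assumed form)."""
    return (1.0 - weight) * average + weight * payoff


def simulate(payoff_draws, weight_fn, initial, rounds, seed=0):
    """Repeatedly pick the action with the highest weighted average payoff.

    payoff_draws: one function per action, mapping a random.Random to a payoff.
    weight_fn:    maps the number of times an action has been chosen to the
                  weight placed on its newest payoff (hypothetical interface).
    initial:      starting averages, one per action (an assumption).
    """
    rng = random.Random(seed)
    averages = list(initial)
    counts = [0] * len(averages)
    history = []
    for _ in range(rounds):
        # Greedy choice; ties go to the lowest-indexed action.
        action = max(range(len(averages)), key=lambda a: averages[a])
        payoff = payoff_draws[action](rng)
        counts[action] += 1
        # Only the chosen action's average is updated.
        averages[action] = update(averages[action], payoff, weight_fn(counts[action]))
        history.append(action)
    return history, averages


if __name__ == "__main__":
    # Two actions with constant payoffs 1.0 and 2.0, constant weight 0.5.
    history, averages = simulate(
        [lambda rng: 1.0, lambda rng: 2.0],
        lambda n: 0.5,
        [3.0, 3.0],
        rounds=4,
    )
    print(history)   # [0, 1, 1, 1]
    print(averages)  # [2.0, 2.125]
```

Passing `lambda n: 1.0 / n` as `weight_fn` turns each average into the plain sample mean of that action's payoffs, while a constant weight discounts older payoffs geometrically; the sketch makes no claim about which long-run outcome either scheme produces.
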
Original language | English |
---|---|
Pages (from-to) | 303-305 |
Number of pages | 3 |
Journal | Economics Letters |
Volume | 117 |
Issue number | 1 |
Publication status | Published - Oct 2012 |
Keywords
- Adaptive learning
- Constrained memory
- Bandit problems
- SIMPLE DYNAMIC-MODEL
- CHOICE