Session | ||
SA1 - AI1: Online learning
| ||
Presentations | ||
Regret minimization with dynamic benchmarks in repeated games
1 Stanford University; 2 Microsoft Research
In repeated games, strategies are often evaluated by their ability to guarantee the performance of the single best action selected in hindsight. Yet the efficacy of the single best action as a benchmark is limited, as static actions may perform poorly in common dynamic settings. We propose the notion of dynamic benchmark (DB) consistency, and we characterize the possible empirical joint distributions of play that may emerge when all players rely on DB-consistent strategies.

Learning to ask the right questions: a multi-armed bandits approach
1 Northwestern University; 2 Tata Institute of Fundamental Research; 3 Columbia University
TBD |