Session
MA1 - AI6: Bandit and experiment

Presentations
Short-lived high-volume bandits
1Cornell University; 2Glance; 3Carnegie Mellon University
TBD

Markovian interference in experiments
1MIT; 2Carnegie Mellon University
TBD

Diffusion limits of multi-armed bandit experiments under optimism-based policies
Columbia Business School, United States of America
Our work provides new results on the arm-sampling behavior of the celebrated UCB family of multi-armed bandit algorithms, leading to several important insights. Among these, it is shown that arm-sampling rates under UCB are asymptotically deterministic, regardless of the problem complexity. This discovery facilitates new sharp asymptotic characterizations revealing profound distinctions between UCB and Thompson Sampling, such as an "incomplete learning" phenomenon characteristic of the latter.
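For readers unfamiliar with the policy family the last abstract studies, the following is a minimal illustrative sketch of the classical UCB1 index rule on Bernoulli arms; the function name, arm means, and horizon are assumptions for demonstration, not the setup of the paper above.

```python
import math
import random

def ucb1(means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms with the given means; return pull counts.

    Illustrative sketch: each round, pull the arm maximizing
    empirical mean + sqrt(2 * ln(t) / pulls), the classical UCB1 index.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k      # number of pulls per arm
    sums = [0.0] * k      # cumulative reward per arm
    # Initialize by pulling each arm once.
    for a in range(k):
        counts[a] = 1
        sums[a] = float(rng.random() < means[a])
    for t in range(k, horizon):
        # Select the arm with the highest upper confidence bound.
        a = max(
            range(k),
            key=lambda i: sums[i] / counts[i]
            + math.sqrt(2.0 * math.log(t + 1) / counts[i]),
        )
        counts[a] += 1
        sums[a] += float(rng.random() < means[a])
    return counts

# Hypothetical two-armed instance: the better arm accumulates most pulls.
counts = ucb1([0.2, 0.8], horizon=10_000)
```

Over a long horizon, the fraction of pulls each arm receives concentrates, which is the kind of arm-sampling behavior the abstract characterizes asymptotically.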