Classical UCB algorithms

UCB-PI-untuned

\begin{align}
\text{UCB1}_{kt} &= \bar{\pi}_{kt} + \sqrt{\frac{2\log t}{n_{kt}}} \\
\text{UCB-tuned}_{kt} &= \bar{\pi}_{kt} + \sqrt{\frac{\log t}{n_{kt}} \min\left(\frac{1}{4}, V_{kt}\right)}
\end{align}
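
As a rough illustration, the two classical indices can be computed as in the Python sketch below. The function and argument names are hypothetical and not from the slides. The slide does not define V_kt; the helper `variance_bound` assumes the usual choice from Auer et al. (2002), the empirical variance of arm k plus an exploration term, and rewards are assumed to lie in [0, 1].

```python
import math


def ucb1_index(mean, n_pulls, t):
    """UCB1 index: empirical mean plus sqrt(2 log t / n)."""
    return mean + math.sqrt(2.0 * math.log(t) / n_pulls)


def variance_bound(sum_sq, mean, n_pulls, t):
    """Assumed form of V_kt (not given on the slide): empirical variance
    of the arm's rewards plus the exploration term sqrt(2 log t / n)."""
    empirical_var = sum_sq / n_pulls - mean ** 2
    return empirical_var + math.sqrt(2.0 * math.log(t) / n_pulls)


def ucb_tuned_index(mean, n_pulls, t, v):
    """UCB-tuned index: the bonus is scaled by min(1/4, V_kt)."""
    return mean + math.sqrt(math.log(t) / n_pulls * min(0.25, v))


if __name__ == "__main__":
    # Hypothetical arm: 10 pulls, mean reward 0.5, sum of squared rewards 3.0
    mean, n_pulls, t, sum_sq = 0.5, 10, 100, 3.0
    v = variance_bound(sum_sq, mean, n_pulls, t)
    print("UCB1     :", ucb1_index(mean, n_pulls, t))
    print("UCB-tuned:", ucb_tuned_index(mean, n_pulls, t, v))
```

Because min(1/4, V_kt) never exceeds 1/4, the UCB-tuned bonus is always smaller than the UCB1 bonus for the same t and n_kt, so the tuned index explores less aggressively.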

Next steps:

  • Prove that the regret of UCB-PI is lower than that of UCB1.
  • Define a tuned version of the UCB-PI algorithm, analogous to Audibert et al. (2009).
Xuhang Fan, Duke University
Dynamic Online Pricing Using MAB Experiments
2023/01/01