Multi-Armed Bandit Analysis for Price Optimization


I recently read a blog post titled "Bandits Know the Best Product Price" (http://pkghosh.wordpress.com/2013/08/25/bandits-know-the-best-product-price/), which outlines how to use multi-armed bandit analysis for price optimization.

There is also a lot of discussion on whether multi-armed bandit analysis is better than A/B testing (e.g. "20 lines of code that will beat A/B testing every time": http://stevehanov.ca/blog/index.php?id=132?utm_medium=referral versus "Why multi-armed bandit algorithm is not 'better' than A/B testing": http://visualwebsiteoptimizer.com/split-testing-blog/multi-armed-bandit-algorithm/).

I am aware that there is an R package called "bandit", which can be used for such an analysis.

Does anyone have a toy example, comparable to the one in the blog post, that shows how to apply this method in R (within the context of price optimization)?

Thanks for your help.


There are 3 best solutions below

1
On

My cautious explorations of this topic might be of use to you: http://codeandmath.wordpress.com/2014/04/05/type-i-error-in-bandits/

0
On

I have recently been working on a project about bandit algorithms. The performance of a bandit algorithm depends heavily on the data set, and bandits are well suited to continuous testing with churning data. So what you need to do is test and tune your model on test data.
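As a rough illustration of what "test and tune" can look like, here is a minimal Python sketch that compares a few exploration rates for an epsilon-greedy bandit on simulated visitors. Everything in it is an assumption made for illustration: the candidate prices, the purchase probabilities, the function names `total_revenue` and `tune_epsilon`, and the choice of epsilon-greedy itself. With real data you would replace the simulated visitor with your own logged or test data.

```python
import random


def total_revenue(epsilon, prices, buy_prob, rounds, seed):
    """Run one epsilon-greedy simulation and return the revenue collected."""
    rng = random.Random(seed)
    counts = [0] * len(prices)
    means = [0.0] * len(prices)  # running mean revenue per price
    revenue = 0.0
    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(len(prices))  # explore: random price
        else:
            arm = max(range(len(prices)), key=means.__getitem__)  # exploit
        # Simulated visitor (assumption): buys with a fixed probability.
        reward = prices[arm] if rng.random() < buy_prob[arm] else 0.0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
        revenue += reward
    return revenue


def tune_epsilon(candidates, prices, buy_prob, rounds=5000, n_runs=20):
    """Average revenue over several seeded runs for each candidate epsilon."""
    results = {}
    for eps in candidates:
        runs = [total_revenue(eps, prices, buy_prob, rounds, seed)
                for seed in range(n_runs)]
        results[eps] = sum(runs) / n_runs
    return results


# Hypothetical prices and purchase probabilities, chosen only for the demo.
PRICES = [10.0, 15.0, 20.0]
BUY_PROB = [0.50, 0.45, 0.20]

results = tune_epsilon([0.01, 0.1, 0.3], PRICES, BUY_PROB)
for eps, avg in sorted(results.items()):
    print(f"epsilon={eps:.2f}  average revenue={avg:.0f}")
```

Which epsilon comes out ahead depends on the simulated data, which is the point: the ranking can change with a different data set, so the tuning has to be repeated on data that resembles yours.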

To understand bandits better, you can read the book Bandit Algorithms for Website Optimization: http://shop.oreilly.com/product/0636920027393.do. It explains the basic bandit algorithms quite well and implements them in Python. You can find its code on GitHub: https://github.com/johnmyleswhite/BanditsBook. However, the book doesn't cover contextual bandits.

For R, I am not so sure. A quick online search turned up an implementation of bandits in R; here is the code: https://github.com/lotze/bandit

Hope it can help you.

0
On

I understand you asked for code in R, but the implementations are often very simple. I think this could be relevant. The algorithm still works if you replace the binary data with continuous data, because the reward estimate is just the mean. So feel free to use the same data as the price, replacing the ones with some random number.
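To make the point about continuous rewards concrete, here is a minimal sketch in Python rather than R, assuming an epsilon-greedy strategy (the answer does not say which algorithm it has in mind). Each arm is a candidate price, and the reward is the revenue from one visitor: the price if they buy, zero otherwise. The prices, the purchase probabilities, and the function name `epsilon_greedy_prices` are all made up for illustration.

```python
import random


def epsilon_greedy_prices(prices, buy_prob, rounds=20000, epsilon=0.1, seed=0):
    """Epsilon-greedy over candidate prices with revenue as a continuous reward."""
    rng = random.Random(seed)
    counts = [0] * len(prices)   # how often each price was shown
    means = [0.0] * len(prices)  # running mean revenue per price
    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(len(prices))  # explore: random price
        else:
            arm = max(range(len(prices)), key=lambda i: means[i])  # exploit
        # Simulated visitor (assumption): buys with a fixed probability.
        reward = prices[arm] if rng.random() < buy_prob[arm] else 0.0
        counts[arm] += 1
        # Incremental mean update; works the same for 0/1 or continuous rewards.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


# Hypothetical demand: expected revenue per visitor is 5.00, 6.75 and 4.00.
PRICES = [10.0, 15.0, 20.0]
BUY_PROB = [0.50, 0.45, 0.20]

counts, means = epsilon_greedy_prices(PRICES, BUY_PROB)
best = max(range(len(PRICES)), key=lambda i: means[i])
print("estimated best price:", PRICES[best])
print("times each price was shown:", counts)
```

The only change from the binary case is the line that computes `reward`; the running-mean update is untouched, which is why the same algorithm carries over.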