I am currently building a contextual bandit. Since I need to get the top 4 actions, I am using the conditional contextual bandit (CCB) in the Vowpal Wabbit framework.
I would like to warm start my model, but as far as I understand the framework, this is only possible with the classic contextual bandit. Is there a way to warm start the ccb_explore_adf model for online learning? What are the necessary commands, and what does the data format for training look like?
As I understand it, this should also be (somehow) possible for the ccb model, since it is based on the cb model.
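
To make the data-format part of the question concrete, here is a minimal sketch of how logged decisions could be turned into labeled CCB examples in VW's text format, as far as I understand it. The helper `format_ccb_example`, the namespaces (`User`, `Action`, `Slot`) and all feature names are hypothetical and only for illustration. The slot label layout `chosen_action:cost:probability` is my assumption about the format and should be checked against the VW documentation.

```python
# Sketch only: builds one multi-line CCB example as plain text.
# Assumption: a labeled slot line has the form
#   ccb slot <chosen_action_index>:<cost>:<probability> |<namespace> <features>
# and an unlabeled slot line omits the label.

def format_ccb_example(shared, actions, slots):
    """Return one CCB example as a newline-joined string.

    shared:  feature string for the shared context
    actions: list of feature strings, one per candidate action
    slots:   list of (slot_features, label) pairs, where label is either
             None (unlabeled) or (action_index, cost, probability)
    """
    lines = [f"ccb shared |User {shared}"]
    lines += [f"ccb action |Action {a}" for a in actions]
    for slot_features, label in slots:
        if label is None:
            lines.append(f"ccb slot |Slot {slot_features}")
        else:
            idx, cost, prob = label
            lines.append(f"ccb slot {idx}:{cost}:{prob} |Slot {slot_features}")
    return "\n".join(lines)


# Hypothetical logged decision: 5 candidate actions, 4 slots (top 4).
example = format_ccb_example(
    shared="weekday",
    actions=["a", "b", "c", "d", "e"],
    slots=[
        ("pos1", (2, 0.0, 0.4)),   # action 2 shown in slot 1, cost 0.0
        ("pos2", (0, 1.0, 0.25)),  # action 0 shown in slot 2, cost 1.0
        ("pos3", None),            # no logged outcome for this slot
        ("pos4", None),
    ],
)
print(example)
```

My idea would be to feed examples like this from historical data into a model created with `--ccb_explore_adf` before going online, but I do not know whether that is the intended way to warm start it.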