How to split train/val/test based on date on Databricks AutoML?


I'm trying to run Databricks AutoML experiments with a train/val/test split that uses the oldest data for training, the next oldest for validation, and the newest for testing. I know Databricks AutoML takes a keyword argument called "time_col" to do this automatically, but the issue is that it also uses that column as a feature when training the model (even when I include it in the list passed to the keyword argument "exclude_cols").
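To make the split I'm after concrete, here is a minimal pandas sketch of it. The function name `add_time_split`, the column names "event_date" and "split", and the 60/20/20 fractions are all made up for illustration; nothing here calls the AutoML API.

```python
import numpy as np
import pandas as pd


def add_time_split(df, date_col, train_frac=0.6, val_frac=0.2):
    """Return a copy of df with a 'split' column assigned chronologically.

    Hypothetical helper: oldest rows -> 'train', next oldest -> 'validate',
    newest -> 'test'. The fractions are illustrative defaults.
    """
    n = len(df)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)

    # Positions of the rows ordered from oldest to newest (stable for ties).
    order = np.argsort(df[date_col].to_numpy(), kind="stable")

    labels = np.empty(n, dtype=object)
    labels[order[:n_train]] = "train"
    labels[order[n_train:n_train + n_val]] = "validate"
    labels[order[n_train + n_val:]] = "test"

    out = df.copy()
    out["split"] = labels
    return out


# Tiny made-up example: five rows with shuffled dates.
demo = pd.DataFrame(
    {
        "event_date": pd.to_datetime(
            ["2023-03-01", "2023-01-01", "2023-05-01", "2023-02-01", "2023-04-01"]
        ),
        "y": [3, 1, 5, 2, 4],
    }
)
labeled = add_time_split(demo, "event_date")
```

With this, the date column only decides which set a row lands in, which is the behavior I want from AutoML. I believe newer Databricks ML runtimes document a "split_col" argument that accepts a label column like this, but I haven't confirmed that, so treat it as an assumption to check against the docs.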

I know I can manually change the train/val/test sets after the AutoML experiment is complete, but that doesn't really help because the hyperparameter tuning was already done on the original splits.

I'm just curious whether there is a way around this, or whether anyone can recommend another AutoML tool that works well with Databricks.

Thank you!
