I am trying to build a "big data" approach to time series classification, using dynamic time warping (DTW) as the distance measure for a k-nearest-neighbours (kNN) classifier.
https://archive.ics.uci.edu/ml/datasets/Heterogeneity+Activity+Recognition
I am using this dataset and have done the pre-processing (z-score normalization and feature extraction), resulting in the attached dataframe.
The data has been split by user and gt (ground truth) label, so that each user-gt group holds its own time series.
I am now at the stage of applying DTW to the data, but I am unsure how to do so and I think I am missing some understanding. I don't see how computing DTW over all pairs of series helps with training, since I think the labels will get lost. I am using Databricks with Apache Spark.
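To make the question concrete, here is a minimal single-machine sketch of what I understand by DTW plus nearest-neighbour classification. It is plain Python, not Spark, and the names `dtw_distance` and `knn_predict`, the toy series, and the toy labels are all made up for illustration; they are not from my dataframe or from any library. In this sketch each training series is stored together with its label, and distances are only computed between a query and the labelled training series. What I can't see is how that carries over to an all-pairs computation on Spark.

```python
from collections import Counter


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW with absolute difference as local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # prev[j] holds the best cumulative cost for aligning a[:i-1] with b[:j]
    prev = [inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: step in a only, step in b only, step in both
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def knn_predict(train, query, k=1):
    """train is a list of (series, label) pairs; returns the majority label
    among the k training series closest to query under DTW."""
    nearest = sorted(train, key=lambda pair: dtw_distance(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): each series keeps its label next to it.
train = [
    ([0.0, 1.0, 2.0, 1.0, 0.0], "walk"),
    ([0.0, 0.0, 0.0, 0.0], "sit"),
]
query = [0.0, 1.0, 1.0, 2.0, 1.0, 0.0]
prediction = knn_predict(train, query, k=1)
```

Here the query is a time-stretched copy of the first training series, so DTW aligns them at zero cost even though their lengths differ.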