Optimal approach for anomaly detection using One-Class SVM with multiple location IDs


I'm currently developing an unsupervised anomaly detection system using One-Class SVM, where my dataset includes the following features: date, location ID, daily numbers, day of the week, and a binary indicator for holidays. I've experimented with two approaches:

  1. Model 1: I train separate One-Class SVM models for each location ID.
  2. Model 2: I combine all location IDs into one model.

However, I've observed that Model 2 fails to detect certain obvious anomalies that Model 1 captures effectively. I'm seeking advice on the optimal approach and additional steps to analyse why Model 2 fails to detect these anomalies.

These are the features I used to train Model 2, which combines all location IDs in one model. I also tried excluding locationID, but the same issue occurred.

numerical_features = ['daily_numbers']
categorical_features = ['locationID','dayofweek','nathol']
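For reference, a minimal sketch of how a combined model over these feature lists might be assembled with scikit-learn. The synthetic DataFrame, the `nu` and `gamma` values, and the names `preprocess`, `pooled_model`, and `labels` are assumptions for illustration only, not my actual data or settings.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import OneClassSVM

numerical_features = ['daily_numbers']
categorical_features = ['locationID', 'dayofweek', 'nathol']

# Synthetic stand-in data (hypothetical): two locations with very different volumes.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    'locationID': rng.choice([8, 11], size=n),
    'dayofweek': rng.integers(0, 7, size=n),
    'nathol': rng.integers(0, 2, size=n),
})
df['daily_numbers'] = np.where(
    df['locationID'] == 8,
    rng.normal(50, 5, size=n),    # low-volume location
    rng.normal(500, 40, size=n),  # high-volume location
)

# Scale the numeric column and one-hot encode the categorical columns.
preprocess = ColumnTransformer([
    ('num', StandardScaler(), numerical_features),
    ('cat', OneHotEncoder(handle_unknown='ignore'), categorical_features),
])

# One pooled One-Class SVM over all locations (assumed hyperparameters).
pooled_model = Pipeline([
    ('preprocess', preprocess),
    ('svm', OneClassSVM(kernel='rbf', gamma='scale', nu=0.05)),
])
pooled_model.fit(df)

# predict() returns +1 for inliers and -1 for flagged anomalies.
labels = pooled_model.predict(df)
```

Note that in this setup a single `StandardScaler` is fitted across all locations, so `daily_numbers` is scaled relative to the global mean and spread rather than each location's own.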

The model doesn't detect the obvious anomaly on Saturday for location ID 8

The model doesn't detect the obvious anomaly on Wednesday for location ID 11

For context, I also noticed that the daily numbers for location ID 8 are low compared to other locations. Could this be why the model fails to detect the obvious anomaly for this location?
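One way to check whether the volume difference matters is to compare a global z-score of `daily_numbers` with a z-score computed within each location. The sketch below uses synthetic data with an injected spike; the volumes, the spike value of 120, and the column name `local_z` are hypothetical, and it assumes the combined model scales `daily_numbers` across all locations at once.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (hypothetical): a low-volume and a high-volume location.
rng = np.random.default_rng(0)
low = pd.DataFrame({'locationID': 8, 'daily_numbers': rng.normal(50, 5, size=100)})
high = pd.DataFrame({'locationID': 11, 'daily_numbers': rng.normal(500, 40, size=100)})
df = pd.concat([low, high], ignore_index=True)

# Inject a spike into location 8: large for that location, small globally.
df.loc[0, 'daily_numbers'] = 120.0

# Global z-score: what a single scaler fitted on all locations would produce.
global_z = (df['daily_numbers'] - df['daily_numbers'].mean()) / df['daily_numbers'].std()

# Per-location z-score: each value relative to its own location's mean and spread.
grouped = df.groupby('locationID')['daily_numbers']
df['local_z'] = (df['daily_numbers'] - grouped.transform('mean')) / grouped.transform('std')

print('global z of spike:', round(global_z.iloc[0], 2))
print('local z of spike: ', round(df.loc[0, 'local_z'], 2))
```

In this toy setup the spike sits well inside the global range of values but far outside location 8's own range, so feeding the per-location z-score to a combined model (instead of the raw or globally scaled value) may be worth trying.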

Average daily numbers of each location ID

Should I continue with separate models for each location ID, or is there a better way to combine them? Furthermore, what techniques can I use to understand the reasons behind the model's anomaly detection performance? Any insights or suggestions would be greatly appreciated. Thank you!
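To make the comparison concrete, here is a rough sketch of inspecting `decision_function` scores from a combined model and from per-location models side by side, rather than only the +1/-1 labels. The data is synthetic, the helper `fit_scores` and the columns `pooled_score` and `local_score` are hypothetical names, and only `daily_numbers` is used to keep the example short.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

# Synthetic stand-in data (hypothetical) with one injected spike at row 0.
rng = np.random.default_rng(0)
frames = [
    pd.DataFrame({'locationID': loc, 'daily_numbers': rng.normal(mean, sd, size=150)})
    for loc, mean, sd in [(8, 50, 5), (11, 500, 40)]
]
df = pd.concat(frames, ignore_index=True)
df.loc[0, 'daily_numbers'] = 120.0


def fit_scores(values):
    """Fit a One-Class SVM on one numeric column and return its decision scores.

    Negative scores fall outside the learned boundary (flagged as anomalies).
    """
    X = StandardScaler().fit_transform(values.to_numpy().reshape(-1, 1))
    model = OneClassSVM(kernel='rbf', gamma='scale', nu=0.05).fit(X)
    return model.decision_function(X)


# Combined model: one fit over every location.
df['pooled_score'] = fit_scores(df['daily_numbers'])

# Separate models: one fit per location ID.
df['local_score'] = np.nan
for loc, grp in df.groupby('locationID'):
    df.loc[grp.index, 'local_score'] = fit_scores(grp['daily_numbers'])

# Compare the two scores for the injected spike and for the lowest-scoring rows.
print(df.loc[[0], ['locationID', 'daily_numbers', 'pooled_score', 'local_score']])
print(df.nsmallest(5, 'local_score')[['locationID', 'daily_numbers', 'local_score']])
```

Sorting rows by each score and looking at where the known anomalies rank would show whether the combined model scores them as borderline or as clearly normal, which may hint at whether scaling, `nu`, or `gamma` is the limiting factor.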