I'm following this example from tsfresh: Multiclass. It is a classification example that uses feature extraction and a decision tree classifier.
import matplotlib.pylab as plt
import numpy as np
import pandas as pd
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from tsfresh import extract_features, extract_relevant_features, select_features
from tsfresh.examples.har_dataset import download_har_dataset, load_har_dataset, load_har_classes
from tsfresh.utilities.dataframe_functions import impute

# Download the HAR dataset and load the time series and their class labels
download_har_dataset()
df = load_har_dataset()
y = load_har_classes()

# Reshape from wide to long format: one row per (id, time) pair
df["id"] = df.index
df = df.melt(id_vars="id", var_name="time").sort_values(["id", "time"]).reset_index(drop=True)

# Extract features for the first 500 series, then split into train and test sets
X = extract_features(df[df["id"] < 500], column_id="id", column_sort="time", impute_function=impute)
X_train, X_test, y_train, y_test = train_test_split(X, y[:500], test_size=.2)

# Fit a decision tree on the extracted features
classifier_full = DecisionTreeClassifier()
classifier_full.fit(X_train, y_train)
Now I am trying to visualize the classification report using:
from sklearn.model_selection import TimeSeriesSplit
from sklearn.naive_bayes import GaussianNB
from yellowbrick.datasets import load_occupancy
# Note: this shadows sklearn.metrics.classification_report imported earlier
from yellowbrick.classifier import classification_report

# Build string class names from the unique label values
classes = np.unique(y)
classes = classes.tolist()
classes = list(map(str, classes))

visualizer = classification_report(GaussianNB(), X_train, y_train, X_test, y_test, classes=classes, support=True)
However, when I run the script, it raises:
ModelError: could not decode [1 2 3 4 5 6] y values to [1 2 3 4 5 6] labels
Does anyone know why this happens?
I compared my code with this example: Scikit classification example, which is where I found classification_report. I think the problem is in the data structure, but I can't find any difference. Any help is appreciated. Thank you!
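To narrow down the data-structure question, here is a minimal check that uses only numpy. It rests on an assumption I have not confirmed: that the list passed as classes is looked up by integer label value. The label values are copied from the error message, and decode_by_position is a hypothetical helper written for this check, not a yellowbrick function.

```python
import numpy as np

# Label values copied from the error message; the real y_train is not reproduced here.
y_values = np.array([1, 2, 3, 4, 5, 6])

# Class names built the same way as in the script above.
classes = list(map(str, np.unique(y_values).tolist()))


def decode_by_position(labels, class_names):
    """Hypothetical helper: treat each integer label as a position in class_names."""
    return np.asarray(class_names)[np.asarray(labels)]


# Labels that start at 1 overrun a six-element list: valid positions are 0..5.
try:
    decode_by_position(y_values, classes)
    one_based_ok = True
except IndexError:
    one_based_ok = False

# Shifting the labels so they start at 0 makes the same lookup succeed.
zero_based = decode_by_position(y_values - 1, classes)

print(one_based_ok)
print(zero_based.tolist())
```

If that assumption holds, labels that start at 1 cannot be used directly as positions in a six-element class list, so zero-based labels or an explicit label-to-name mapping might avoid the error. I have not verified this against the library itself.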