When I try to fit a CatBoostClassifier on GPU using a pandas DataFrame that contains an embedding column, I get this error:

CatBoostError: Attempt to call single feature writer on packed feature writer

The embedding column has dtype object. The CatBoost version is 1.2.2.

When I remove the embedding column from training, everything works fine, so the problem is definitely related to the embeddings. I tried storing them as a list, as an ndarray, and as a tensor; none of these makes any difference.
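To rule out malformed input as the cause, a quick sanity check along these lines could confirm that every embedding in a column is numeric and has the same length. This is only a minimal sketch: embedding_dims is a hypothetical helper, not part of CatBoost, and it assumes each row can be indexed by column position.

```python
from numbers import Real


def embedding_dims(rows, embedding_columns):
    """Return {column index: set of embedding lengths seen in that column}."""
    dims = {col: set() for col in embedding_columns}
    for row in rows:
        for col in embedding_columns:
            vector = list(row[col])  # accepts a list, tuple, or 1-D ndarray
            if not all(isinstance(x, Real) for x in vector):
                raise TypeError(f"non-numeric value in embedding column {col}")
            dims[col].add(len(vector))
    return dims


# Two rows in the same layout as the documentation example:
# columns 0 and 1 hold embeddings, column 2 is numeric, column 3 is categorical.
sample = [
    [[0.1, 0.12, 0.33], [1.0, 0.7], 2, "male"],
    [[0.0, 0.8, 0.2], [1.1, 0.2], 1, "female"],
]
dims = embedding_dims(sample, embedding_columns=[0, 1])
print(dims)
```

If any column reports more than one length, the embeddings are ragged, which would be a separate problem from the GPU error described here.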

I do pass the embedding_features parameter.
Class imbalance and the presence of an eval_set don't affect the result either.

I found that task_type='GPU' is what triggers the error. Even the simple usage example from the documentation fails once task_type='GPU' is added:

from catboost import CatBoostClassifier

cat_features = [3]
embedding_features=[0, 1]
train_data = [
    [[0.1, 0.12, 0.33], [1.0, 0.7], 2, "male"],
    [[0.0, 0.8, 0.2], [1.1, 0.2], 1, "female"],
    [[0.2, 0.31, 0.1], [0.3, 0.11], 2, "female"],
    [[0.01, 0.2, 0.9], [0.62, 0.12], 1, "male"]
]
train_labels = [1, 0, 0, 1]
eval_data = [
    [[0.2, 0.1, 0.3], [1.2, 0.3], 1, "female"],
    [[0.33, 0.22, 0.4], [0.98, 0.5], 2, "female"],
    [[0.78, 0.29, 0.67], [0.76, 0.34], 2, "male"],
]

model = CatBoostClassifier(iterations=2,
                           learning_rate=1,
                           depth=2,
                           task_type='GPU')

model.fit(train_data, train_labels, cat_features=cat_features, embedding_features=embedding_features)

preds_class = model.predict(eval_data)

Using the GPU is essential in this case.
