I am working on a multi-label text classification task, where I want to predict the Big Five personality traits of essay writers based on their sentence embeddings. The sentence embeddings are generated by a sentence transformer model from sentence_transformers. The personality traits are cEXT, cNEU, cAGR, cCON, and cOPN, and they are binary labels (1 or 0). I have a dataset of 2467 essays with their corresponding labels.

I am using an MLP classifier from sklearn.neural_network to build and evaluate my model. I split the data into training and testing sets with train_test_split, and I use various metrics such as accuracy, precision, recall, F1-score, and AUC to measure the performance of the classifier.

I want to try different parameter combinations for the MLP classifier, such as hidden_layer_sizes, activation, solver, alpha, etc., and see how they affect the performance of the model. I use a for loop to iterate over all possible combinations of these parameters, and print the evaluation metrics for each combination.

However, when I run my code, the accuracy is very low and out of line with the other metrics. For example, the accuracy ranges from 0.05 to 0.06, while the other metrics range from 0.58 to 0.60. I also get this warning: "Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet."

I don't understand why my model has such low accuracy. Is there something wrong with my code or my data? How can I improve my model's accuracy? What are the best parameter values for my task? Any help or suggestions would be appreciated.
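One thing that may be relevant to these numbers: when the targets are a multi-label indicator matrix, sklearn's accuracy_score reports subset accuracy, meaning a sample counts as correct only if all of its labels match. The sketch below uses plain Python and made-up random labels, not my data. The helper names subset_accuracy and per_label_accuracy and the 0.6 per-label hit rate are illustrative assumptions. It shows how roughly 60% per-label accuracy over five independent labels gives an exact-match rate near 0.6^5, which is about 0.08.

```python
import random


def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose entire label vector is predicted exactly."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return matches / len(y_true)


def per_label_accuracy(y_true, y_pred):
    """Accuracy computed separately for each label column."""
    n_labels = len(y_true[0])
    return [
        sum(1 for t, p in zip(y_true, y_pred) if t[j] == p[j]) / len(y_true)
        for j in range(n_labels)
    ]


# Synthetic labels (assumption: not the essay data).
rng = random.Random(0)
n_samples, n_labels, p_correct = 5000, 5, 0.6

y_true = [[rng.randint(0, 1) for _ in range(n_labels)] for _ in range(n_samples)]
# Each predicted label is independently correct with probability p_correct.
y_pred = [
    [v if rng.random() < p_correct else 1 - v for v in row]
    for row in y_true
]

exact = subset_accuracy(y_true, y_pred)
per_label = per_label_accuracy(y_true, y_pred)

print("Per-label accuracy:", [round(a, 3) for a in per_label])
print("Subset (exact-match) accuracy:", round(exact, 3))
```

If this is what is happening, a per-label accuracy or the Hamming loss might be a fairer comparison with the macro-averaged precision, recall, and F1-score.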

My data looks like this:

           #AUTHID                                               TEXT cEXT  \
0  1997_504851.txt  Well, right now I just woke up from a mid-day ...    n   
1  1997_605191.txt  Well, here we go with the stream of consciousn...    n   
2  1997_687252.txt  An open keyboard and buttons to push. The thin...    n   
3  1997_568848.txt  I can't believe it!  It's really happening!  M...    y   
4  1997_688160.txt  Well, here I go with the good old stream of co...    y   

  cNEU cAGR cCON cOPN  
0    y    y    n    y  
1    n    y    n    n  
2    y    n    y    y  
3    n    y    y    n  
4    n    y    n    y  

Here is the code I'm using:

# Import necessary libraries
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
import numpy as np
import itertools

# Define the target matrix
Y = df.iloc[:, 2:].to_numpy()

# Convert 'y' labels to 1 and 'n' labels to 0
Y = np.where(Y == 'y', 1, 0)

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(embeddings, Y, test_size=0.2, random_state=42)

# Define the parameter values to iterate over
hidden_layer_sizes_values = [(100,), (200,), (300,), (100, 100)]
activation_values = ['relu', 'tanh', 'logistic']
solver_values = ['adam', 'sgd']
alpha_values = [0.0001, 0.001, 0.01, 0.1]

# Iterate over parameter combinations
for hidden_layer_size, activation, solver, alpha in itertools.product(hidden_layer_sizes_values, activation_values, solver_values, alpha_values):
    print(f"Running with hidden_layer_sizes={hidden_layer_size}, activation={activation}, solver={solver}, alpha={alpha}")

    # Create an instance of MLPClassifier with the current parameter combination
    mlp = MLPClassifier(
        hidden_layer_sizes=hidden_layer_size,
        activation=activation,
        solver=solver,
        alpha=alpha,
    )

    # Fit the classifier on the training data
    mlp.fit(X_train, y_train)

    # Predict the labels of the testing data
    y_pred = mlp.predict(X_test)

    # Evaluate the performance of the classifier using metrics
    accuracy = accuracy_score(y_test, y_pred)
    precision = precision_score(y_test, y_pred, average='macro')
    recall = recall_score(y_test, y_pred, average='macro')
    f1 = f1_score(y_test, y_pred, average='macro')

    # Print the evaluation metrics for the current parameter combination
    print('Evaluation Metrics:')
    print('Accuracy:', accuracy)
    print('Precision:', precision)
    print('Recall:', recall)
    print('F1-score:', f1)
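Since the loop above does not compute AUC or any per-label accuracy, here is a minimal, self-contained sketch of how those could be computed for a multi-label MLPClassifier. It runs on synthetic data rather than the essay embeddings. The data-generating code, max_iter=1000, and the variable names are assumptions for illustration only.

```python
import numpy as np
from sklearn.metrics import accuracy_score, hamming_loss, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for (embeddings, Y): 600 samples, 20 features, 5 binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 20))
W = rng.normal(size=(20, 5))
noise = rng.normal(scale=2.0, size=(600, 5))
Y = ((X @ W + noise) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.2, random_state=42
)

# max_iter is raised above the default of 200 (assumption: this may or may
# not be enough to silence the convergence warning on other data).
mlp = MLPClassifier(hidden_layer_sizes=(100,), max_iter=1000, random_state=42)
mlp.fit(X_train, y_train)

y_pred = mlp.predict(X_test)         # 0/1 predictions, shape (n_samples, 5)
y_score = mlp.predict_proba(X_test)  # per-label probabilities, same shape

# Subset accuracy: a sample is correct only if all 5 labels match.
subset_acc = accuracy_score(y_test, y_pred)
# Accuracy of each label column on its own.
label_acc = (y_test == y_pred).mean(axis=0)
# Hamming loss is the fraction of wrong labels, so 1 - loss is the mean label accuracy.
mean_label_acc = 1.0 - hamming_loss(y_test, y_pred)
# Macro-averaged ROC AUC over the 5 labels, computed from probabilities.
macro_auc = roc_auc_score(y_test, y_score, average="macro")

print("Subset accuracy:", subset_acc)
print("Per-label accuracy:", label_acc)
print("Mean label accuracy:", mean_label_acc)
print("Macro AUC:", macro_auc)
```

Raising max_iter is one way the convergence warning might be addressed. Whether it helps on the real embeddings is untested here.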


