I'm using scikit-learn's MLPClassifier to analyze the past 666 results of drawing a single ball from each of 36 different urns (each urn holds 10 balls, numbered 0 to 9). When I ask the model for the expected probability of each ball in the next drawing, it returns extreme values. Here is the code:
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import OneHotEncoder
from sklearn.preprocessing import StandardScaler
# Assuming your data is stored in the 'results' list
# Each sublist represents the past results for an urn
results = [
    [3, 2, 7, ..., 6, 9],  # Urn 1
    [5, 2, ..., 0],  # Urn 2
    # ... (other urns)
    [4, 2, 7, ..., 1, 1]  # Urn 36
]
# Encode the past results as binary vectors
# (scikit-learn 1.2 renamed the `sparse` argument to `sparse_output`)
encoder = OneHotEncoder(categories=[range(10)] * 666, sparse=False)
X = np.vstack([encoder.fit_transform([urn]) for urn in results])
# Use the last ball drawn in each urn as the label
labels = np.array([urn[-1] for urn in results])
# Normalize the features
scaler = StandardScaler()
X_normalized = scaler.fit_transform(X)
# Initialize the MLP classifier with increased complexity
mlp = MLPClassifier(hidden_layer_sizes=(100, 100, 100), max_iter=1000, random_state=42)
# Train the model
mlp.fit(X_normalized, labels)
# Predict probabilities for each ball in the next drawing for all urns
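for urn_idx, urn_results in enumerate(results):
    # predict_proba returns an array of shape (1, n_classes); take the single row
    next_drawing_probs = mlp.predict_proba(scaler.transform(encoder.transform([urn_results])))[0]
    print(f"Urn {urn_idx + 1} probabilities:")
    # Columns follow mlp.classes_, i.e. only the balls that occur in labels
    for ball, prob in zip(mlp.classes_, next_drawing_probs):
        print(f" Ball {ball}: {prob:.4f}")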
And here is the output for the last 2 urns (the output for the other 34 urns looks similar):
Urn 35 probabilities:
Ball 0: 0.0000
Ball 1: 0.0000
Ball 2: 0.0000
Ball 3: 0.0000
Ball 4: 0.0001
Ball 5: 0.9999
Ball 6: 0.0000
Ball 7: 0.0000
Ball 8: 0.0000
Ball 9: 0.0000
Urn 36 probabilities:
Ball 0: 0.0000
Ball 1: 0.0000
Ball 2: 0.0000
Ball 3: 0.0000
Ball 4: 0.0000
Ball 5: 0.0000
Ball 6: 0.0000
Ball 7: 0.0000
Ball 8: 0.0000
Ball 9: 0.9999
Process finished with exit code 0
I have tried increasing the number of layers (up to 1000), but the results are the same. I expected the probabilities for each ball to be closer to 0.1 (perhaps as low as 0.075 and as high as 0.125).
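In case it helps to reproduce this, here is a minimal, self-contained sketch of the same pipeline with synthetic data standing in for my real results. The uniform random draws, the seed, and the names `rng`, `n_urns`, `n_draws` and `probs` are assumptions for illustration only, not my actual data. It also calls `.toarray()` on the encoder output instead of passing `sparse=False`, so it should not depend on the scikit-learn version.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the real data (assumption): uniform random draws.
rng = np.random.default_rng(0)
n_urns, n_draws = 36, 666
results = rng.integers(0, 10, size=(n_urns, n_draws))

# One-hot encode every draw: 666 draws x 10 balls = 6660 features per urn.
encoder = OneHotEncoder(categories=[np.arange(10)] * n_draws)
X = encoder.fit_transform(results).toarray()

# Same labelling as in the code above: the last draw of each urn,
# which is also part of the features in X.
labels = results[:, -1]

scaler = StandardScaler()
X_normalized = scaler.fit_transform(X)

mlp = MLPClassifier(hidden_layer_sizes=(100, 100, 100), max_iter=1000, random_state=42)
mlp.fit(X_normalized, labels)

# Columns of predict_proba follow mlp.classes_, which may be fewer than
# 10 balls if some ball never appears among the 36 labels.
probs = mlp.predict_proba(X_normalized)
for ball, prob in zip(mlp.classes_, probs[-1]):
    print(f" Ball {ball}: {prob:.4f}")
```

Note that this setup has only 36 training rows (one per urn) against 6660 input features, and the model is evaluated on the same rows it was trained on.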