Using the gplearn library to generate new lines of data from a given dataset


I am using the gplearn library (genetic programming) to generate new rules from a given dataset. Currently I have 11 rows of data with 24 columns (features), which I pass to SymbolicRegressor to get new rules. However, I am getting only one rule. With crossover, shouldn't I get 11 new rules if I give 11 rows of data as input? If I am doing it wrong, what is the right way to do it?

import numpy as np
import pandas as pd
from sklearn import preprocessing
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import ExtraTreesRegressor
from gplearn.genetic import SymbolicRegressor

data = pd.read_csv("D:/Subjects/Thesis/snort_rules/ransomware_dataset.csv")

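# Features: the first 23 columns.
x_train = data.iloc[:, 0:23]
# Target: assumed to be the last column. SymbolicRegressor expects a 1-D target,
# whereas data.iloc[:, :-1] would select every column except the last.
y_train = data.iloc[:, -1]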

gp = SymbolicRegressor(population_size=11,
                       generations=2, stopping_criteria=0.01,
                       p_crossover=0.8, p_subtree_mutation=0.1,
                       p_hoist_mutation=0.05, p_point_mutation=0.05,
                       max_samples=0.9, verbose=1,
                       parsimony_coefficient=0.01, random_state=0)

gp.fit(x_train, y_train)
print(gp._program)

The output is:

X7/(X15*(-X16*X20 - X19 + X2))
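For context on why a single expression is printed, here is a minimal toy sketch in plain Python. It is not gplearn's implementation: the helpers `make_candidate`, `evaluate`, `mean_abs_error` and `toy_search` are hypothetical names invented for illustration, and it runs a single generation with no crossover or mutation. It only illustrates the general idea that the rows are used to score each candidate expression, while the number of candidates is set by the population size rather than by the number of rows, and that the search hands back the single best-scoring candidate.

```python
import random


def make_candidate(rng, n_features):
    """Build one random candidate 'rule': an operator applied to two feature indices."""
    op = rng.choice(["add", "sub", "mul"])
    return (op, rng.randrange(n_features), rng.randrange(n_features))


def evaluate(candidate, row):
    """Evaluate a candidate on a single row of feature values."""
    op, i, j = candidate
    if op == "add":
        return row[i] + row[j]
    if op == "sub":
        return row[i] - row[j]
    return row[i] * row[j]


def mean_abs_error(candidate, X, y):
    """Score a candidate against ALL rows; lower is better."""
    return sum(abs(evaluate(candidate, r) - t) for r, t in zip(X, y)) / len(y)


def toy_search(X, y, population_size, seed=0):
    """Create population_size candidates, rank them, and return (best, ranked)."""
    rng = random.Random(seed)
    population = [make_candidate(rng, len(X[0])) for _ in range(population_size)]
    ranked = sorted(population, key=lambda c: mean_abs_error(c, X, y))
    return ranked[0], ranked


# Synthetic stand-in data (not the ransomware dataset): 11 rows, 3 features.
data_rng = random.Random(1)
X = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(11)]
y = [row[0] + row[2] for row in X]

best, ranked = toy_search(X, y, population_size=50)
print(len(ranked), "candidates scored on", len(X), "rows; best:", best)
```

In this sketch the 11 rows never turn into 11 rules; they are only the data every candidate is scored on. In gplearn itself, `gp._program` is the best individual of the final generation, and the evolved population appears to be kept in the private `gp._programs` list (one entry per generation), so `gp._programs[-1]` may be a place to look for the other candidates. That attribute is an internal detail and could change between versions.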
