How to ensure the output of _best_programs of SymbolicTransformer of gplearn is different?

I am using gplearn's SymbolicTransformer to generate automated features. When I inspect the feature expressions in _best_programs after fitting, I find that most of the features have the same expression. Is there a way to make SymbolicTransformer output distinct features after fitting?

There is 1 best solution below

I don't know of a way to enforce this explicitly, but you can try to encourage a more diverse population in each generation, in the hope that this leads to a more diverse collection of _best_programs. In my opinion, a few parameters worth looking into are:

  • p_crossover
  • p_subtree_mutation
  • p_hoist_mutation
  • p_point_mutation
  • p_point_replace

If you increase the chance of crossover or mutation, you increase the expected diversity, but you must not overdo it. There is a balance between a diverse population and an accurate one: the higher the crossover or mutation rate, the more likely you are to take a strong candidate and change it into something meaningless.
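As a rough sketch of what adjusting these parameters could look like: the snippet below shifts some probability from crossover to the mutation operators and then counts how many distinct expressions end up in _best_programs. The specific values (0.7, 0.1, 0.05, 0.1) are only illustrative guesses, not recommendations, and the helper names diversity_params, count_distinct and fit_and_inspect are made up for this example. Note that gplearn requires p_crossover + p_subtree_mutation + p_hoist_mutation + p_point_mutation to be at most 1; p_point_replace is a per-node probability and is not part of that sum.

```python
def diversity_params(p_crossover=0.7, p_subtree_mutation=0.1,
                     p_hoist_mutation=0.05, p_point_mutation=0.1,
                     p_point_replace=0.05):
    """Build keyword arguments for SymbolicTransformer (illustrative values)."""
    # gplearn rejects settings where the four method probabilities exceed 1.
    total = (p_crossover + p_subtree_mutation
             + p_hoist_mutation + p_point_mutation)
    if total > 1.0:
        raise ValueError(f"method probabilities sum to {total}, must be <= 1")
    return {
        "p_crossover": p_crossover,
        "p_subtree_mutation": p_subtree_mutation,
        "p_hoist_mutation": p_hoist_mutation,
        "p_point_mutation": p_point_mutation,
        "p_point_replace": p_point_replace,
    }


def count_distinct(programs):
    """Count distinct expressions by comparing their string forms."""
    return len({str(p) for p in programs})


def fit_and_inspect(X, y, random_state=0):
    """Fit a transformer and report the expressions it kept.

    Requires gplearn to be installed; imported here so the helpers
    above can be used without it.
    """
    from gplearn.genetic import SymbolicTransformer

    transformer = SymbolicTransformer(
        generations=10,
        population_size=500,
        hall_of_fame=100,
        n_components=10,
        random_state=random_state,
        **diversity_params(),
    )
    transformer.fit(X, y)
    programs = transformer._best_programs
    return [str(p) for p in programs], count_distinct(programs)
```

Calling fit_and_inspect(X, y) with your own data returns the list of expressions and the number of distinct ones, so you can compare a few parameter settings against the defaults and see whether the diversity actually improves without hurting fitness too much.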