Could someone help me extract the best-performing model's parameters from my grid search? It's a blank dictionary for some reason.
from pyspark.ml.tuning import ParamGridBuilder, TrainValidationSplit, CrossValidator
from pyspark.ml.evaluation import BinaryClassificationEvaluator
train, test = df.randomSplit([0.66, 0.34], seed=12345)
paramGrid = (ParamGridBuilder()
             .addGrid(lr.regParam, [0.01, 0.1])
             .addGrid(lr.elasticNetParam, [1.0])
             .addGrid(lr.maxIter, [3])
             .build())
evaluator = BinaryClassificationEvaluator(rawPredictionCol="rawPrediction",labelCol="buy")
evaluator.setMetricName('areaUnderROC')
cv = CrossValidator(estimator=pipeline,
                    estimatorParamMaps=paramGrid,
                    evaluator=evaluator,
                    numFolds=2)
cvModel = cv.fit(train)
> print(cvModel.bestModel)  # it looks like I have a valid bestModel
PipelineModel_406e9483e92ebda90524
> cvModel.bestModel.extractParamMap()  # fails
{}
> cvModel.bestModel.getRegParam()  # also fails
>
> AttributeError                            Traceback (most recent call last)
> <ipython-input-9-747196173391> in <module>()
> ----> 1 cvModel.bestModel.getRegParam()
>
> AttributeError: 'PipelineModel' object has no attribute 'getRegParam'
There are two different problems here:

- Parameters are set on `Estimators` or `Transformers`, not on the `PipelineModel`. All fitted models can be accessed using the `stages` property.
- Python models currently do not expose `Params` at all (SPARK-10931).

So unless you use the development branch, you have to find the model of interest among the pipeline stages, access its `_java_obj`, and get the parameters of interest. For example:
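One way this lookup could be sketched is shown here. It assumes the stage of interest is a `LogisticRegressionModel` and that the private `_java_obj` attribute still exposes the JVM getters such as `getRegParam`. The helper name `java_params_of` is hypothetical and not part of PySpark.

```python
def java_params_of(pipeline_model, stage_cls, getters):
    """Call the named getters on the JVM object behind each matching stage.

    pipeline_model: a fitted PipelineModel (e.g. cvModel.bestModel)
    stage_cls:      class of the stage to look for
    getters:        names of zero-argument getter methods on the JVM model
    """
    found = []
    for stage in pipeline_model.stages:
        if isinstance(stage, stage_cls):
            java_obj = stage._java_obj  # private handle to the JVM-side model
            found.append({name: getattr(java_obj, name)() for name in getters})
    return found


# Hypothetical usage with the cvModel from the question:
#   from pyspark.ml.classification import LogisticRegressionModel
#   java_params_of(cvModel.bestModel, LogisticRegressionModel,
#                  ["getRegParam", "getElasticNetParam"])
```

Because `_java_obj` is an internal detail, this may break between Spark versions; treat it as a workaround rather than a stable API.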