I can't save ALS model

from pyspark.ml.recommendation import ALS, ALSModel
from pyspark.ml.tuning import ParamGridBuilder, CrossValidator
from pyspark.mllib.evaluation import RegressionMetrics, RankingMetrics
from pyspark.ml.evaluation import RegressionEvaluator

als = ALS(maxIter=15,
          regParam=0.08,
          userCol="ID User",
          itemCol="ID Film",
          ratingCol="Rating",
          rank=20,
          numItemBlocks=30,
          numUserBlocks=30,
          alpha=0.95,
          nonnegative=True,
          coldStartStrategy="drop",
          implicitPrefs=False)
model = als.fit(training_dataset)

model.save('model')

Every time I call the save method, the Jupyter notebook gives me an error like this:

An error occurred while calling o477.save.
: org.apache.spark.SparkException: Job aborted.
    at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:106)

I'm aware of the previous SO question and answer, and have tried the following:

model.save('model')

.

model.write().save("saved_model")

.

als.write().save("saved_model")

.

als.save('model')

.

import pickle
s = pickle.dumps(als)

.

als_path = "from_C:Folder_to_my_project_root" + "/als"
als.save(als_path)

My question is: how do I save the ALS model so that I can load it without retraining every time I run the program?

There are 2 best solutions below

Basically, o477 (and oXXX in general) is just the Py4J ID of the JVM object being called; the error itself means the Spark job failed while it was running. Since it seems you're doing movie recommendation, I assume you're using the MovieLens or Netflix dataset. The failure can mean one of these:

  1. The file is too big and can't be pickled
  2. The model is too complex and you run out of memory
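
Once the job itself stops failing, saving and reloading might look roughly like the sketch below. The helper names `save_als_model` and `load_als_model` are made up for illustration; the calls they wrap (`write().overwrite().save()` and `ALSModel.load()`) are the standard PySpark ML persistence methods. Note that `als` in the question is the unfitted estimator, while the trained factors live in `model`, so it is `model` that needs saving.

```python
def save_als_model(model, path):
    """Persist a fitted ALSModel, replacing anything already at `path`."""
    # write() returns an MLWriter; overwrite() avoids a "path already exists"
    # failure when the notebook cell is run more than once.
    model.write().overwrite().save(path)


def load_als_model(path):
    """Reload a previously saved ALSModel without refitting."""
    # Imported lazily so the helper can be defined before Spark is set up.
    from pyspark.ml.recommendation import ALSModel
    return ALSModel.load(path)
```

With the names from the question this would be `save_als_model(model, "model")` after fitting, and `model = load_als_model("model")` in a later session. It does not fix an out-of-memory failure by itself; it only rules out the "path already exists" case and the estimator/model mix-up.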

I ran into this problem when running recommendations on the Netflix Prize dataset, which has about 100 million records. What I did was start with 50% of the data, then gradually increase the percentage to see where it breaks. In my case I was able to work my way up to 100% of the data. Closing unnecessary Chrome tabs also helps.
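
The "start at 50% and work upwards" idea can be sketched as a small loop. This is only an illustration: `first_failing_fraction` and `fit_and_save` are hypothetical names, the list of fractions is arbitrary, and `fit_and_save` assumes the `als` and `training_dataset` variables from the question are already defined in the notebook.

```python
def first_failing_fraction(try_fraction, fractions=(0.5, 0.6, 0.7, 0.8, 0.9, 1.0)):
    """Call try_fraction(f) for each f in order.

    Returns the first fraction that raises, or None if all of them succeed.
    """
    for fraction in fractions:
        try:
            try_fraction(fraction)
        except Exception as exc:
            print(f"failed at {fraction:.0%}: {exc}")
            return fraction
        print(f"ok at {fraction:.0%}")
    return None


def fit_and_save(fraction):
    # Assumes `als` and `training_dataset` from the question are in scope.
    # DataFrame.sample draws an approximate fraction of the rows.
    subset = training_dataset.sample(fraction=fraction, seed=42)
    fitted = als.fit(subset)
    fitted.write().overwrite().save("model")
```

Running `first_failing_fraction(fit_and_save)` in the notebook then reports the first data fraction at which fitting or saving fails, which gives a rough idea of how much data the machine can handle.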