I am using the ydata-profiling library to generate profile reports of my pandas DataFrame. I would like to save the entire ProfileReport object, so I can load it later without having to regenerate the report.
Questions:
Is there a way to serialize the entire ProfileReport object for later use? If not, is there a recommended way to save the generated profile (maybe as JSON or another format) and later load it back as a ProfileReport object, so I can manipulate or view it?
I tried serializing the object using pickle, but encountered issues as it appears some internal components of the report are not serializable.
I also considered saving the report as JSON and loading it back, but there doesn't seem to be a direct method to convert a saved JSON report back into a ProfileReport object.
Here's a simple setup for context:
import pickle

from ydata_profiling import ProfileReport

profile = ProfileReport(df)

# Saving the report as JSON
profile.to_file("report.json")

# Attempt to load the report from JSON
# (This will throw an error because from_json is not a recognized method)
# This line will cause an error
profile_json = ProfileReport.from_json("report.json")
profile_json.to_file("profile_json.html")

# Serializing the profile report using pickle
output_path = "profile.pkl"
with open(output_path, 'wb') as ppfile:
    pickle.dump(profile, ppfile)

# Deserializing the profile report from pickled file:
with open(output_path, 'rb') as ppfile:
    profile_pp = pickle.load(ppfile)
profile_pp.to_file("profile_pp.html")
Actually, it is possible to save and load the report with the existing dump and load methods. However, a report loaded this way can't be used for compare, because the comparison fails with:
if not all(is_df_available):
--> 187     raise ValueError("Reports where not initialized with a DataFrame.")
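For reference, here is a minimal sketch of the dump and load approach mentioned above. The helper names `save_report` and `load_report` are made up for illustration and are not part of ydata-profiling. It assumes that `dump` writes a file with a `.pp` suffix and that `load` is called on a `ProfileReport` instance. Passing the original DataFrame when loading is only a guess at how to avoid the "not initialized with a DataFrame" error, and I have not checked it against every version.

```python
from pathlib import Path


def save_report(profile, path="profile.pp"):
    """Serialize a report through its own dump() method (hypothetical helper)."""
    # Assumption: dump() stores the report under a .pp suffix, so normalize
    # the path here to make the returned location match the file on disk.
    target = Path(path).with_suffix(".pp")
    profile.dump(target)
    return target


def load_report(path, df=None):
    """Restore a dumped report (hypothetical helper).

    Assumption: supplying the same DataFrame the report was built from keeps
    it attached to the loaded report, which compare appears to require.
    """
    # Imported lazily so save_report() can be used without importing the library.
    from ydata_profiling import ProfileReport

    return ProfileReport(df).load(Path(path))
```

With the `profile` and `df` from the question, `save_report(profile)` followed by `load_report("profile.pp", df=df)` should give back a report that can be written with `to_file`. If the DataFrame passed on load is not the one the report was built from, I would expect the load to be rejected rather than silently accepted.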