How do I save TFDV stats in the correct format so they can be loaded back in?

756 Views Asked by Stefan Krawczyk At 06 June 2025 at 22:10

It is puzzling to me that there is a tfdv.load_statistics() function, but no corresponding tfdv.write_statistics() function. How do I go about saving the statistics, and then loading them again?

e.g.

import tensorflow_data_validation as tfdv
stats = tfdv.generate_statistics_from_dataframe(df)

# how do I save?


# load back for later use
saved_stats = tfdv.load_statistics('saved_stats.stats')

I can save the string representation to a file, but this is not the format that load_statistics expects.

with open('saved_stats.stats', 'w') as o:
    o.write(str(stats))

Pointers anyone?

**Amine_h** · Answer 1

Amine_h On 08 October 2020 at 13:52

Have you tried tfdv.utils.stats_util.write_stats_text?

**Pritam Dodeja** · Answer 2

Pritam Dodeja On 23 January 2023 at 08:27

There's a function called tfdv.load_stats_binary that you can use to solve this problem.
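For tfdv.load_stats_binary to read a file back, the file needs to hold the serialized DatasetFeatureStatisticsList proto. Here is a minimal sketch of one way to pair it with a save step, assuming the stats are written as raw bytes via the proto's SerializeToString(). The helper names save_stats_binary and load_saved_stats are made up for illustration and are not part of TFDV.

```python
def save_stats_binary(stats, path):
    """Write a stats proto to `path` as raw serialized protobuf bytes.

    `stats` is assumed to be the DatasetFeatureStatisticsList returned by
    tfdv.generate_statistics_from_dataframe (hypothetical helper, not TFDV API).
    """
    with open(path, "wb") as f:
        f.write(stats.SerializeToString())


def load_saved_stats(path):
    """Read the file written by save_stats_binary (hypothetical helper)."""
    # Imported inside the function because TFDV pulls in TensorFlow
    # and is slow to import.
    import tensorflow_data_validation as tfdv

    return tfdv.load_stats_binary(path)
```

Usage would look like save_stats_binary(stats, 'saved_stats.stats') followed later by loaded = load_saved_stats('saved_stats.stats'). The file is binary, so it has to be opened with 'wb' rather than 'w'.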

**Stefan Krawczyk** · Answer 3

Okay, I figured out this hacky way to do it.

df = ... # create pandas df
from tensorflow_metadata.proto.v0 import statistics_pb2
import tensorflow_data_validation as tfdv
stats = tfdv.generate_statistics_from_dataframe(df)

# save it
with open('saved_stats.stats', 'wb') as o:
    o.write(stats.SerializeToString())

# load back for later use
with open('saved_stats.stats', 'rb') as i:
    loaded_stats = statistics_pb2.DatasetFeatureStatisticsList.FromString(i.read())

**pvasek** · Answer 4

In the current tfdv version, 1.3.0, the following methods can be used: tfdv.write_stats_text and tfdv.load_stats_text.

Example:

import tensorflow_data_validation as tfdv

stats = tfdv.generate_statistics_from_dataframe(df)
stats_path = "my-stats-file.stats"

# saving
tfdv.write_stats_text(stats, stats_path)


# loading
stats = tfdv.load_stats_text(stats_path)
