I am trying to load 100 billion multi-dimensional time series data points (thousands of columns, millions of rows) into InfluxDB from a CSV file.
I am currently doing it through line protocol as follows (my codebase is in Python):
import influxdb
from datetime import datetime
from tqdm import tqdm

# args, rows, columns, client, get_datetime, get_size and initial_time
# are defined elsewhere in the script
f = open(args.file, "r")
l = []
bucket_size = 100
if rows > 10000:
    bucket_size = 10

for x in tqdm(range(rows)):
    s = f.readline()[:-1].split(" ")
    v = {}
    for y in range(columns):
        v["dim" + str(y)] = float(s[y + 1])
    # convert the timestamp to epoch nanoseconds
    time = (get_datetime(s[0])[0] - datetime(1970, 1, 1)).total_seconds() * 1000000000
    time = int(time)
    body = {"measurement": "puncte", "time": time, "fields": v}
    l.append(body)
    if len(l) == bucket_size:
        # retry the batch until the server accepts it
        while True:
            try:
                client.write_points(l)
            except influxdb.exceptions.InfluxDBServerError:
                continue
            break
        l = []

# flush whatever is left in the last, partially filled batch
client.write_points(l)

final_time = datetime.now()
final_size = get_size()
seconds = (final_time - initial_time).total_seconds()
As the code above shows, my script reads the dataset CSV file, builds small batches of rows (bucket_size at a time), and sends each batch with client.write_points(l).
However, this method is not very efficient. I am trying to load 100 billion data points, and it is taking far longer than expected: loading just 3 million rows of 100 columns each has already been running for 29 hours, with an estimated 991 hours still to go!
I am certain there is a better way to load the dataset into InfluxDB. Any suggestions for faster data loading?
Try loading the data from multiple concurrent threads. Each write is mostly waiting on the network, so concurrent writers keep both the client and the (typically multi-CPU) server busy and should give a noticeable speedup.
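A minimal sketch of that approach, assuming the influxdb-python client, the same space-separated layout as in the question, and timestamps already in epoch nanoseconds (the host, database name, chunk size and worker count are placeholders):

from concurrent.futures import ThreadPoolExecutor
from influxdb import InfluxDBClient

client = InfluxDBClient(host="localhost", port=8086, database="mydb")

def write_chunk(lines):
    # build one batch of points from a chunk of raw CSV lines and send it
    points = []
    for line in lines:
        parts = line.rstrip("\n").split(" ")
        fields = {"dim" + str(i): float(v) for i, v in enumerate(parts[1:])}
        points.append({
            "measurement": "puncte",
            # assumes parts[0] is already an epoch-nanosecond integer;
            # otherwise reuse the get_datetime() conversion from the question
            "time": int(parts[0]),
            "fields": fields,
        })
    client.write_points(points, time_precision="n")

CHUNK = 10000   # rows per write; tune for your row width
WORKERS = 8     # number of concurrent writers

with open("data.csv") as f, ThreadPoolExecutor(max_workers=WORKERS) as pool:
    chunk = []
    for line in f:
        chunk.append(line)
        if len(chunk) == CHUNK:
            pool.submit(write_chunk, chunk)
            chunk = []
    if chunk:
        pool.submit(write_chunk, chunk)

Each write_points() call is an HTTP request, so the threads mostly overlap network round trips. If sharing one client across threads causes problems, give each worker its own InfluxDBClient, and for a very large file you may also want to bound how many chunks are queued at once.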
Another option is to feed the CSV file directly to the time series database without additional transformations. See this example.
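If you stay with the Python client, a related shortcut is to build the line protocol strings yourself and pass them with protocol="line", which skips the per-point dictionaries the client would otherwise have to convert. A minimal sketch under the same assumptions as above (epoch-nanosecond timestamps in the first column, placeholder host and database):

from influxdb import InfluxDBClient

client = InfluxDBClient(host="localhost", port=8086, database="mydb")

BATCH = 10000
batch = []
with open("data.csv") as f:
    for raw in f:
        parts = raw.rstrip("\n").split(" ")
        fields = ",".join("dim%d=%s" % (i, v) for i, v in enumerate(parts[1:]))
        # line protocol: measurement[,tags] fields timestamp
        batch.append("puncte %s %s" % (fields, parts[0]))
        if len(batch) == BATCH:
            client.write_points(batch, protocol="line", time_precision="n")
            batch = []
if batch:
    client.write_points(batch, protocol="line", time_precision="n")

This avoids building a dictionary per row and lets you send much larger batches per request, which cuts the number of HTTP round trips.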