Finding the mean of 5000 different CSVs to summarize in one Python list


I have a folder with about 5000 CSV files, each with 60,000 rows and 10 columns. I want to compute the mean of each CSV and append it to a list. My current code is below:

import pandas as pd

mean_list = []

# train_frags holds the paths of all the CSV files
for item in train_frags:
    segment = pd.read_csv(item, dtype='Int16')
    mean_list.append(segment.mean())

This code has been running for more than 10 minutes now! Please suggest a more efficient version.


There is 1 best solution below

On BEST ANSWER

The only real thing I can think of is that you are calling append in a loop. A list comprehension is a little faster, so try converting to one if possible, although with only about 5000 appends most of the time is probably spent in read_csv parsing the files. The only other thing I can think of is numba (https://numba.pydata.org/), which can speed up numeric Python code.
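For what it's worth, here is a minimal sketch of the list-comprehension version, assuming train_frags is a list of CSV paths as in the question. The helper name csv_means is made up for illustration, and dtype="int16" (the plain NumPy dtype instead of the nullable 'Int16') is an assumption that only holds if the files have no missing values. I have not timed it on your data.

```python
import pandas as pd


def csv_means(paths, dtype="int16"):
    """Return one Series of column means per CSV file in `paths`.

    `csv_means` is a hypothetical helper name. `dtype="int16"` assumes
    the files contain no missing values, since NumPy int16 cannot hold NaN.
    """
    # The list comprehension replaces the explicit append loop.
    return [pd.read_csv(path, dtype=dtype).mean() for path in paths]
```

Usage would be mean_list = csv_means(train_frags). Each element is a pandas Series holding the per-column means of one file.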