I have a folder with about 5000 CSV files, each with 60,000 rows and 10 columns. I want to compute the mean of each CSV and append it to a list. My current code is below:
import pandas as pd

mean_list = []
for item in train_frags:  # train_frags holds the paths of all the CSV files
    segment = pd.read_csv(item, dtype='Int16')
    mean_list.append(segment.mean())  # column-wise means for this file
This code has been running for more than 10 minutes now! Please suggest a more efficient version.
The only real thing I can think of is that you are calling append inside the loop, which adds some overhead on every iteration, so try converting the loop to a list comprehension if possible. The only other thing I can think of is numba (https://numba.pydata.org/), which can speed up numerical Python code, although it will not speed up the CSV reading itself.
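A minimal sketch of the list comprehension version, assuming the same `pd.read_csv` call and `dtype` as in the question. The function name `file_means` is made up for illustration. Reading and parsing each file is probably the dominant cost here, so the gain from dropping `append` may be small.

```python
import pandas as pd


def file_means(paths):
    """Return one Series of column means per CSV file in `paths`.

    `file_means` is a hypothetical helper name, not part of pandas.
    """
    # Same work as the original loop, written as a list comprehension:
    # read each file, take its column-wise mean, and collect the results.
    return [pd.read_csv(path, dtype="Int16").mean() for path in paths]
```

With the variable from the question, this would be called as `mean_list = file_means(train_frags)`.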