Network Flow Dataframe - Merging Memory Error - Unable to allocate array with shape and data type


I have 3 large CSV files, all with the same 76 columns. The row counts differ: 17809, 124262, and 108779 rows. I am trying to merge these 3 data frames, but I get a memory error. Can I solve this issue, or is it impossible on my hardware (16 GB RAM, 11th-gen i5)?

I found this solution for merging them, but it raises an error. I want them all in one dataframe.

    from functools import reduce
    import pandas as pd

    # a, b, c are the three dataframes loaded from the CSV files
    data_frames = [a, b, c]
    df_merged = reduce(lambda left, right: pd.merge(left, right, on=['Intrusion'], how='outer'), data_frames)
    df_merged

MemoryError: Unable to allocate 101. GiB for an array with shape (13517346950,) and data type int64


There is 1 solution below


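The answer turned out to be a Linux one-liner, and I loved it:

    awk 'FNR > 1' file1.csv file2.csv > output.csv

That is all. FNR is the line number within the current file, so this prints every line except the first (header) line of each file and writes the result to output.csv. Note that this also drops the header of the first file; to keep a single header row, use awk 'NR == 1 || FNR > 1' instead.

Source: https://predictivehacks.com/?all-tips=how-to-concatenate-multiple-csv-files-in-linux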
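
For anyone who wants to stay in pandas: since all three files share the same columns, the goal is to stack rows, not to join on a key. pd.merge on 'Intrusion' pairs every row with every other row that has the same label, so the row count multiplies instead of adding up, which is most likely why it asked for 101 GiB. Stacking the three frames should give only 17809 + 124262 + 108779 = 250850 rows. Below is a minimal sketch using pd.concat. The tiny frames and the "Flow Duration" column are made-up stand-ins for the real a, b, c from the question.

```python
import pandas as pd

# Hypothetical stand-ins for the frames a, b, c from the question.
# The real files have 76 identical columns; "Flow Duration" is an
# illustrative column name, not taken from the original data.
a = pd.DataFrame({"Flow Duration": [10, 20], "Intrusion": ["benign", "dos"]})
b = pd.DataFrame({"Flow Duration": [30], "Intrusion": ["benign"]})
c = pd.DataFrame({"Flow Duration": [40, 50, 60], "Intrusion": ["dos", "dos", "benign"]})

# Stack the rows of all frames on top of each other.
# ignore_index=True renumbers the resulting index from 0 to n-1.
df_all = pd.concat([a, b, c], ignore_index=True)

# The row count is the sum of the input row counts, not a product.
print(df_all.shape)  # (6, 2)
```

With the real data, a, b, and c would come from pd.read_csv on each file, and the result should fit comfortably in 16 GB of RAM.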