Python export to .csv without overwriting columns in for loop

180 Views Asked by David At 29 July 2025 at 11:54

I am trying to write data from several documents (processed in a for loop) to a CSV file in Python 3. However, the column gets overwritten every time. How can I write the data from each document to the CSV in the rows below the previous ones, without overwriting?

import glob
import pandas as pd
from pdfminer.high_level import extract_text
for selectedfile in glob.glob(r'C:\Users\...\*.pdf'):
    text = extract_text(selectedfile)

Y = set(text)
Z = []
Znew = []
for val in Y:
    occurrences = wordlist2.count(val)
    if occurrences > 50:  # define min. no. of occurrences
        # print(val, ':', occurrences)
        Z.append(val)
        Znew.append(occurrences)

dict = {'Stem': Z, 'Count': Znew}
df = pd.DataFrame(dict)
df.to_csv('Exported list.csv', header=True, index=True, encoding='utf-8')

There is 1 best solution below

tdelaney On 20 November 2022 at 16:55

The problem is in that first for loop. You keep replacing text with newly extracted text and only process the final extraction. You could move the processing into the for loop to work on each extraction. In this example, I've opened the file beforehand and written the header once. Then it's a question of making sure the index is correct for each write.

import glob
from pdfminer.high_level import extract_text
import pandas as pd
import numpy as np

# newline='' is what pandas recommends for file objects passed to to_csv
with open('Exported list.csv', 'w', newline='', encoding='utf-8') as outfile:
    outfile.write(",Stem,Count\n") # header
    base = 0
    for selectedfile in glob.glob(r'C:\Users\...\*.pdf'):
        text = extract_text(selectedfile)

        Y = set(text)
        Z = []
        Znew = []
        for val in Y:
            occurrences = wordlist2.count(val)
            if occurrences > 50:  # define min. no. of occurrences
                # print(val, ':', occurrences)
                Z.append(val)
                Znew.append(occurrences)

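        data = {'Stem': Z, 'Count': Znew}  # avoid shadowing the built-in dict
        df = pd.DataFrame(data, index=np.arange(base, base+len(Z)))
        # header=False: the header line was already written once above
        df.to_csv(outfile, index=True, header=False)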
        base += len(Z)
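
If the running index is the only reason for the bookkeeping above, the same idea (open the file once, write the header once, then add rows per document) can also be sketched with the standard csv module. This is only an illustration under assumptions: frequent_items, export_counts, the documents mapping and the extra Source column are hypothetical names that are not part of the original code, and each document is assumed to be already split into a list of words.

```python
import csv
from collections import Counter


def frequent_items(words, min_count):
    """Return (item, count) pairs that occur more than min_count times."""
    counts = Counter(words)
    return [(item, n) for item, n in counts.items() if n > min_count]


def export_counts(documents, csv_path, min_count=50):
    """Write one block of rows per document into a single CSV file.

    `documents` is a hypothetical mapping of a label (such as a file name)
    to that document's list of words. Returns the number of rows written.
    """
    row_number = 0
    # newline='' lets the csv module control line endings itself.
    with open(csv_path, "w", newline="", encoding="utf-8") as outfile:
        writer = csv.writer(outfile)
        writer.writerow(["", "Source", "Stem", "Count"])  # header, written once
        for label, words in documents.items():
            for item, n in frequent_items(words, min_count):
                # The index keeps counting across documents, so rows from
                # later documents land below the earlier ones.
                writer.writerow([row_number, label, item, n])
                row_number += 1
    return row_number
```

In the loop from the question, each entry of documents would presumably be built from the extract_text output of one PDF after splitting and stemming it. That step is left out here because the original code does not show how wordlist2 is produced.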
