How to export python list to excel file?

39 Views Asked by At

I am trying to export a list of Reddit comments I web scrapped to an Excel file. The file is created but gives me this error in Excel:

"Excel cannot open the file 'test1.xlsx' because the file format or file extension is not valid. Verify that the file has not been corrupted and that the file extension mates the format of the file."

My code so far:

url = "https://www.reddit.com/r/bayarea/comments/17v80j5/do_you_see_yourself_using_the_ca_high_speed_rail/"

submission = reddit.submission(url=url)

posts = []

for top_level_comment in submission.comments:
    if isinstance(top_level_comment, MoreComments):
        continue
    posts.append(top_level_comment.body)

posts = pd.DataFrame(posts,columns=["body"])

indexNames = posts[(posts.body == '[removed]') | (posts.body == '[deleted]')].index

posts.drop(indexNames, inplace=True)
print(posts)


writer = pd.ExcelWriter('test1.xlsx', engine = 'openpyxl')
posts.to_excel(writer, sheet_name='data', index = False)

Should i use a different library to export to excel?

2

There are 2 best solutions below

0
TuoMin On

because i didn't use reddit scrapped before, so i just try the dataframe save code. here is my try:

import pandas as pd
data = {'1': [1, 2, 3, 4, 5],
        '2': ['A', 'B', 'C', 'D', 'E'],
        '3': [0.1, 0.2, 0.3, 0.4, 0.5]}
posts = pd.DataFrame(data)
print(posts)
writer = pd.ExcelWriter('test1.xlsx', engine = 'openpyxl')
# posts.to_excel('test1.xlsx', sheet_name='data', index = False)
posts.to_excel(writer, sheet_name='data', index = False)
writer.save()

or you can just use posts.to_excel('test1.xlsx', sheet_name='data', index = False) to inplace

writer = pd.ExcelWriter('test1.xlsx', engine = 'openpyxl')
posts.to_excel(writer, sheet_name='data', index = False)
writer.save()

by reading your code, I don't think there is a problem caused by appending Excel data, as you only use to_excel once at the end

that's all. wish to help u.

0
SpaciousCoder78 On

Try the library named XlsxWriter.