Efficient way of reading word count from a file and calculating average per sentence


I need to write Python code that reads the contents of a text file (file.txt) and calculates the average number of words per sentence. (Assume the file contains only sentences, one per line.)

I wrote the code below and would like to know whether it can be made more efficient. Many thanks in advance. Here is mine:

# This program reads the contents of a .txt file and calculates
# the average number of words per sentence.

line_count=0
# open the file.txt for reading
content_file=open('file.txt','r')

# calculate the word count of the file
content=content_file.read()

words= content.split()

word_count=len(words)

# calculate the line count
for line in open('file.txt'):

    line_count+=1

content_file.close()

# calculate the average words per line

average_words=word_count/line_count

# Display the result

print('The average word count per sentence is', int(average_words))

There are 3 best solutions below

Best answer

No need to iterate over the file twice. Just update the counts as you go through the lines:

lc, wc = 0, 0
with open('file.txt','r') as f:
    for line in f:
        lc += 1
        wc += len(line.strip().split())

avg = wc / lc
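
As a rough sketch, the same single pass could be wrapped in a function that also copes with an empty file. The function name `average_words_per_line` and the decision to skip blank lines are assumptions made for illustration, not part of the code above.

```python
def average_words_per_line(path):
    """Return the mean number of words per non-blank line, or 0.0 if there are none."""
    line_count = 0
    word_count = 0
    with open(path, 'r') as f:
        for line in f:
            words = line.split()
            if not words:
                continue  # assumption: a blank line is not a sentence
            line_count += 1
            word_count += len(words)
    # avoid ZeroDivisionError when the file has no non-blank lines
    return word_count / line_count if line_count else 0.0
```

Calling `average_words_per_line('file.txt')` would then return a float, which can be rounded or truncated for display as needed.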

Another answer

My suggestion is, instead of using a for loop, to split the content into lines and take the length of the resulting list. Using splitlines() rather than split('\n') avoids counting an extra empty line when the file ends with a newline.

# open file.txt for reading
content_file = open('file.txt', 'r')

# read the whole file once
content = content_file.read()

content_file.close()

# calculate the word count and the line count
word_count = len(content.split())
line_count = len(content.splitlines())

# calculate the average words per line
average_words = word_count / line_count

# Display the result
print('The average word count per sentence is', int(average_words))
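
One detail worth checking when counting lines by splitting a string: `str.split('\n')` and `str.splitlines()` give different counts when the text ends with a newline. A small illustration, using a made-up sample string rather than the real file:

```python
text = "first sentence here\nsecond one\n"  # made-up sample ending in a newline

by_split = text.split('\n')        # a trailing empty string is included
by_splitlines = text.splitlines()  # no trailing empty string

print(len(by_split))       # 3
print(len(by_splitlines))  # 2
```

With the extra empty element, the line count is one too high and the average comes out too low.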

Another answer

The following code is efficient because it reads the file contents only once.

with open(r'C:\Users\lg49242\Desktop\file.txt','r') as content:
    lineCount = 0
    Tot_wordCount = 0
    lines = content.readlines()
    for line in lines:
        lineCount = lineCount + 1       
        wordCount = len(line.split())
        Tot_wordCount += wordCount

avg = Tot_wordCount/lineCount

print(avg)