re.findall does not find some dots


I have a file with preprocessed German text. All 39 lines end with a dot.

To get rid of some nulls in the text, I use this code:

import re

text_with_nulls = open('lemmatizeAFD', 'r')
text_without_nulls = open('lemmatizeAFD_without_nulls', 'w')
for line in text_with_nulls:
    # keep only runs of ASCII letters, digits, German umlauts, ß and dots
    res = re.findall(r'[a-zA-Z0-9äöüÄÖÜß\.]+', line)
    for token in res:
        text_without_nulls.write(token)
        text_without_nulls.write(' ')

The output file, however, has only 33 dots, although all lemmatized words are in place.


Why did some dots disappear? I need them to split the sentences into separate lines later.

I am not very experienced with re, so I assume that something is wrong with the pattern in the re.findall call, r'[a-zA-Z0-9äöüÄÖÜß\.]+'.
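One way to test that assumption is to run the same pattern on a few lines in memory and compare the number of dots before and after. The following is only a minimal sketch: the sample lines are made up for illustration and are not taken from the real file, and `strip_nulls` is a hypothetical helper name.

```python
import re

# Same character class as in the question. Inside [...] the backslash
# before the dot is optional, so it is harmless here.
TOKEN_PATTERN = re.compile(r'[a-zA-Z0-9äöüÄÖÜß\.]+')


def strip_nulls(line):
    """Keep only the runs matched by TOKEN_PATTERN, joined by single spaces."""
    return ' '.join(TOKEN_PATTERN.findall(line))


# Hypothetical sample lines (assumption: not the real data).
sample_lines = [
    'der Hund laufen schnell .\n',
    'null die Katze\x00 schlafen.\n',
    'Straße groß .\n',
]

cleaned = [strip_nulls(line) for line in sample_lines]

# Count ASCII dots before and after the cleanup.
dots_in = sum(line.count('.') for line in sample_lines)
dots_out = sum(line.count('.') for line in cleaned)
print(dots_in, dots_out)
```

If the two counts also match on the real input, the pattern itself is probably not the cause. In that case it may be worth checking whether the output file is closed (or opened in a `with` block) before it is inspected, since buffered writes may not have reached the disk yet, and whether the missing sentence ends are really ASCII dots rather than a similar-looking character.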
