urdu strings looking same but in comparison found unequal python3

389 Views Asked by Naila Akbar At 01 July 2025 at 22:32

In my application, I've list of (Urdu) words in text file, (currently single word like this)

and I've another text file having string of urdu (currently single word like this and exactly same)

Now I need to find if string file's string has any word that exists in word's file. For this, I'm reading both file into lists like this;

// reading text file of strings...

fileToRead = codecs.open('string.txt', mode, encoding=encoding)
fileData = fileToRead.read()
lstFileData = fileData.split('\n')


wordListToRead = codecs.open('words.txt', mode, encoding=encoding)
wordData = wordListToRead.read()
lstWords = wordData.split('\n')

I'm simply traversing list like this;

for string in lstFileData:
    if string in lstWords:
        // do further work

and its not working And I don't know Why? Although string is 'فلسفے' and lstWords has this string in it. Do I need to add some encoding? Any kind of help will be appreciated.

Original Q&A

There are 2 best solutions below

Naila Akbar On 07 October 2018 at 12:52 BEST ANSWER

May be it helped out someone like me

Although it sounds like fun but Issue was in file encoding type. I opened up file in simple notepad to make some changes and saved it. It changed my file from utf-8 to utf-8 BOM. And my code wasn't working on it. Once I created new file in notepad++ in utf-8, Same code started working fine. (Because issue was not in code, it was in file encoding)

golddove On 06 October 2018 at 14:40

Just tried it out in python3 and it seems to work for me:

lstWords = ['a', 'فلسفے', 'b']
string = 'فلسفے'
if string in lstWords:
    print("yes")

Edit: Again, just tested your updated code with file IO and it works fine (I did not specify an encoding). Here is a link of it working: https://trinket.io/python3/3890d8b261

urdu strings looking same but in comparison found unequal python3

There are 2 best solutions below

Related Questions in PYTHON

Related Questions in PYTHON-3.X

Related Questions in UNICODE

Related Questions in UTF-8

Related Questions in URDU

Trending Questions

Popular # Hahtags

Popular Questions