How to convert ISO 639-1 language codes to language names in Python?

I have the following Pandas series:

>>> df.original_language.value_counts()
en    32269
fr     2438
it     1529
ja     1350
de     1080
      ...  
la        1
jv        1
sm        1
gl        1
mt        1
Name: original_language, Length: 92, dtype: int64

I want to convert these language codes into their full language names, for example:

en >> English

ar >> Arabic

I looked up this question but it didn't help. If any packages are required, please explain how to install them with pip if possible.

Best answer

Use the iso-639 package (imported as iso639):

# pip install iso-639
from iso639 import languages
# languages.get(alpha2=...) looks up a language by its two-letter ISO 639-1 code
df['lang'] = df['lang'].apply(lambda x: languages.get(alpha2=x).name)

Output:

       lang  count
0   English  32269
1    French   2438
2   Italian   1529
3  Japanese   1350
4    German   1080
5     Latin      1
6  Javanese      1
7    Samoan      1
8  Galician      1
9   Maltese      1

To convert the codes in your original DataFrame instead, use:

from iso639 import languages
df['original_language'] = df['original_language'].apply(lambda x: languages.get(alpha2=x).name)
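With 92 distinct codes in the column, some values may be missing or not valid ISO 639-1 codes, and a failed lookup inside apply would abort the whole conversion. Below is a minimal sketch of a tolerant wrapper, assuming the lookup raises KeyError for an unknown code. The names code_to_name and SAMPLE_NAMES are hypothetical; the small dict only stands in for a real lookup so the snippet runs without any third-party package.

```python
# Hypothetical stand-in table so this sketch runs on its own.
# With iso-639 installed, pass a real lookup instead (see the note below).
SAMPLE_NAMES = {"en": "English", "fr": "French", "it": "Italian"}


def code_to_name(code, lookup=SAMPLE_NAMES.__getitem__, fallback=None):
    """Return the language name for `code`, or a fallback if the lookup fails.

    `lookup` is any callable mapping a code to a name.
    If `fallback` is None, the original value is returned unchanged.
    """
    try:
        return lookup(code)
    except (KeyError, AttributeError, TypeError):
        # Assumption: unknown or malformed codes surface as one of these errors.
        return code if fallback is None else fallback


print(code_to_name("en"))                      # known code
print(code_to_name("xx"))                      # unknown code is kept as-is
print(code_to_name("xx", fallback="Unknown"))  # or replaced by a label
```

To use it with the package from the answer, pass lookup=lambda c: languages.get(alpha2=c).name and apply the wrapper to the column, for example df['original_language'].apply(lambda c: code_to_name(c, lookup=...)). Keeping the original code as the default fallback makes unmatched rows easy to spot afterwards.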