How do I decode this string? \xc3\x99\xc3\xa9\xc2\x87-B[x\xc2

12.3k Views Asked by Niche At 07 June 2016 at 04:27

This is what I need to decode

\xc3\x99\xc3\x99\xc3\xa9\xc2\x87-B[x\xc2\x99\xc2\xbe\xc3\xa6\x14Ez\xc2\xab

It is generated by String.fromCharCode(arrayPw[i]), but I don't understand how to decode it :(

Please help

There are 2 best solutions below

gotnull On 07 June 2016 at 04:35

Python:

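# Python 3: the escaped sequence must be a bytes literal (note the b prefix).
data = b"\xc3\x99\xc3\x99\xc3\xa9\xc2\x87-B[x\xc2\x99\xc2\xbe\xc3\xa6\x14Ez\xc2\xab"
udata = data.decode("utf-8")  # bytes -> str
asciidata = udata.encode("ascii", "ignore")  # optional: drops every non-ASCII character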

JavaScript:

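// escape() turns characters in the range U+0080 to U+00FF into %XX escapes;
// decodeURIComponent() then reads those escapes as UTF-8 bytes.
function decode_utf8(s) {
  return decodeURIComponent(escape(s));
}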

Otherwise, read up on decoding UTF-8:

https://gist.github.com/chrisveness/bcb00eb717e6382c5608

There's also an online UTF-8 decoder/encoder:

https://mothereff.in/utf-8

HINT: ÙÙé-B[x¾æEz«
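
The hint above can be reproduced with a minimal Python 3 sketch. It assumes the text arrived as a string holding one character per byte value (which is what String.fromCharCode would produce if arrayPw contains byte values; that is an assumption, not something the question confirms). The variable names are illustrative only.

```python
# Assumption: each character of the string stands for one raw byte (0-255).
mojibake = "\xc3\x99\xc3\x99\xc3\xa9\xc2\x87-B[x\xc2\x99\xc2\xbe\xc3\xa6\x14Ez\xc2\xab"

# Latin-1 maps code points 0-255 to the identical byte values,
# so encoding with it recovers the raw bytes unchanged.
raw = mojibake.encode("latin-1")

# Interpret those bytes as UTF-8.
decoded = raw.decode("utf-8")

# U+0087, U+0099 and U+0014 are control characters; drop them for display.
visible = "".join(ch for ch in decoded if ch.isprintable())
print(visible)
```

The result is still not readable text, which suggests the bytes are not plain UTF-8 prose to begin with; decoding only undoes the transport encoding.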

Nicoolasens On 22 January 2022 at 17:12

Duplicate of this: https://stackoverflow.com/a/70815136/5902698

You load a dataset and find some strange characters in it. Example:

'æˆ´æ£®ç¾Žå�‘é€\xa0åž‹å™¨å®Œæ•´ç‰ˆå¥—è£…Dyson Airwrap HS01ï¼ˆé“œé‡‘è‰²ç¤¼ç›’ç‰ˆï¼‰'

In my case, I know the strange characters are meant to be Chinese. So I inferred that the text is UTF-8 that was, at some point, read as if it were 'ISO-8859-1'.

So, as a first step, I encode the string with ISO-8859-1 and then decode it as UTF-8. My lines are:

_encoding = 'ISO-8859-1'
_my_str.encode(_encoding, 'ignore').decode("utf-8", 'ignore')

My output is then:

"'森Dyson Airwrap HS01礼'"

This works for me, but I don't really understand what happens under the hood, so feel free to tell me if you have further information.
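
On the "under the hood" part, here is a minimal sketch of what is probably going on. The sample starts with 'æˆ´æ£®', and 'ˆ' does not exist in ISO-8859-1 but does exist in Windows-1252 (cp1252), so the data looks like UTF-8 bytes that were read as cp1252. That is an inference from the sample, not a confirmed fact. With 'ignore', every character that ISO-8859-1 cannot encode is silently dropped, which would explain why most of the Chinese text disappears from the output.

```python
# Illustrative two-character string; its UTF-8 bytes are e6 88 b4 e6 a3 ae.
original = "\u6234\u68ee"  # the two Chinese characters for "Dyson"

# Simulate the damage: UTF-8 bytes wrongly read as Windows-1252.
garbled = original.encode("utf-8").decode("cp1252")

# The approach from the answer: ISO-8859-1 cannot encode U+02C6 ('ˆ'),
# so 'ignore' drops it and the first character can no longer be rebuilt.
lossy = garbled.encode("ISO-8859-1", "ignore").decode("utf-8", "ignore")

# Reversing with cp1252 keeps every byte, so both characters come back.
repaired = garbled.encode("cp1252").decode("utf-8")

print(repr(garbled), repr(lossy), repr(repaired))
```

Some entries may still be unrecoverable: a few byte values have no cp1252 character at all, and any '�' already in the data marks information that was lost earlier.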

Bonus: I'll try to detect when the string is in the first, garbled format, because some of my entries are in Chinese and others are in English.

EDIT: The bonus is unnecessary. I just use a lambda on my column to encode and decode without caring about the format, so the encoding is changed after the DataFrame is loaded:

_encoding = 'ISO-8859-1'
_decoding = "utf-8"
df[col] = df[col].apply(lambda x: x.encode(_encoding, 'ignore').decode(_decoding, 'ignore'))
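
If some entries are already correct, a stricter variant may be safer than 'ignore'. The sketch below uses a hypothetical helper, fix_mojibake, that attempts the round trip and returns the input unchanged when it fails. The default of cp1252 is an assumption about how the data was damaged.

```python
def fix_mojibake(text, wrong="cp1252", right="utf-8"):
    """Hypothetical helper: undo a wrong decode, or return text untouched."""
    try:
        # Strict error handling: any character or byte that does not fit
        # means this entry was probably not garbled in the assumed way.
        return text.encode(wrong).decode(right)
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


samples = ["Dyson Airwrap HS01", "\u00e6\u00a3\u00ae", "\u68ee", "caf\u00e9"]
cleaned = [fix_mojibake(s) for s in samples]
print(cleaned)
```

With pandas, the same function could be passed directly, for example df[col].apply(fix_mojibake), so plain English entries and already-correct Chinese entries pass through unchanged.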
