How do I convert text like \u041b\u044e\u0431\u0438 to normal text during data download?

When I bulk download my GAE data, which is written in Russian, I get text like this:

u'\u041b\u044e\u0431\u0438\u043c\u0430\u044f \u0430\u043a\u0446\u0438\u044f \u0432\u0435\u0440\u043d\u0443\u043b\u0430\u0441\u044c! \u0412 \u0440\u0435\u0441\u0442\u043e\u0440\u0430\u043d\u0430\u0445 \u0415\u0432\u0440\u0430\u0437\u0438\u044f ""3 \u0440\u043e\u043b\u043b\u0430 \u043f\u043e \u0446\u0435\u043d\u0435 1""! \u0421 9 \u043f\u043e 12 \u0441\u0435\u043d\u0442\u044f\u0431\u0440\u044f! \u0422\u043e\u043b\u044c\u043a\u043e \u044d\u0442\u0438 4 \u0434\u043d\u044f! \u041f\u043e\u0434\u0440\u043e\u0431\u043d\u043e\u0441\u0442\u0438 \u043d\u0430 evrasia.spb.ru, 88005050145 \u0438 008'

The following bulkloader is used:

transformers:
- kind: MyKind
  connector: csv
  connector_options:
  property_map:
    - property: texts
      external_name: texts

What should I do to get the text already decoded?

Update: I've tried the following:

python_preamble:
- import: codecs
...
    - property: texts
      external_name: texts
      export_transform: codecs.decode('unicode_escape')

but I get this error:

Unable to assign value 'codecs.decode('unicode_escape')' to attribute 'export_transform':
Code for export_transform did not return a callable.  Code: "codecs.decode('unicode_escape')".
  in "bulkloader.yaml", line 22, column 25

Somehow the bulkloader documentation was removed from the Google site, so I don't know where to read about export_transform usage.

There is 1 answer below

I don't know GAE or how it works, but here are some thoughts that may or may not help you move forward:

  • If printing the string (e.g. print the_string) shows it exactly as written in your question, you are looking at the repr of a unicode string and could use eval (e.g. print eval(the_string)). If you just want to turn it into a unicode object, use: the_string = eval(the_string). Keep in mind that eval executes arbitrary code, so only use it on data you trust.
  • Judging by your error message "... export_transform did not return a callable ..." and the name "export_transform" itself, I would guess that export_transform needs to be a callable transformation function. Try defining one externally or using a lambda function.
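
Both ideas can be sketched in plain Python. This is only an illustration, written for Python 3 rather than the Python 2 that GAE's bulkloader used, and the names decode_repr and decode_escapes are hypothetical helpers, not part of any GAE API. It uses ast.literal_eval as a safer stand-in for eval, and it assumes the escaped input contains only ASCII characters.

```python
import ast


def decode_repr(text):
    """Parse the repr of a string literal, e.g. "u'\\u041b'", into the string.

    ast.literal_eval only accepts literals, so unlike eval it will not
    run arbitrary code. (Hypothetical helper for illustration.)
    """
    return ast.literal_eval(text)


# A lambda is a callable, so it can be handed around as a transform.
# It decodes bare \uXXXX escapes (no quotes, no u prefix).
# Assumption: the input is pure ASCII, otherwise .encode("ascii") raises.
decode_escapes = lambda s: s.encode("ascii").decode("unicode_escape")

sample_repr = "u'\\u041b\\u044e\\u0431\\u0438'"  # text as shown in the question
sample_bare = "\\u041b\\u044e\\u0431\\u0438"     # same escapes without quotes

print(decode_repr(sample_repr) == "Люби")     # True
print(decode_escapes(sample_bare) == "Люби")  # True
print(callable(decode_escapes))               # True
```

The point for the bulkloader config is the last line: codecs.decode('unicode_escape') is a call that is evaluated on the spot, not a function, whereas a lambda or a function name is something that can be called later on each value. Whether the bulkloader accepts exactly this form is a guess based on the error message, not something checked against its documentation.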

Hope this helps you...