Generate a string of only valid UTF-8 characters in Python 2


I am trying to generate a string of valid UTF-8 characters using Python 2.

import random

def validate_mychar(cd_rng):
    return unichr(cd_rng)

def get_utf8_char():
    while True:
        cd_rng = random.randint(0x100, 0xFFFF)
        if validate_mychar(cd_rng):
            return unichr(cd_rng)

def utf8_gen(length):
    return u''.join(get_utf8_char() for i in xrange(length))


print(utf8_gen(10000))

But I get strange 'invalid character' errors when I use these characters, and I don't understand why. Can anybody help with working code?
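A likely cause, offered as a guess rather than a confirmed diagnosis: `validate_mychar` returns `unichr(cd_rng)`, which is always a non-empty (truthy) string, so nothing is ever rejected. The range 0x100 to 0xFFFF also contains the surrogate block (U+D800 to U+DFFF), noncharacters such as U+FFFE and U+FFFF, and many unassigned code points, none of which make a well-formed or displayable UTF-8 character. Below is a minimal sketch that filters by Unicode general category using the standard `unicodedata` module. The helper name `is_usable_code_point`, the `EXCLUDED_CATEGORIES` tuple, and the `unichr`/`chr` shim are my own assumptions for illustration, not part of the original code.

```python
# -*- coding: utf-8 -*-
import random
import unicodedata

try:
    unichr  # exists on Python 2
except NameError:
    unichr = chr  # assumption: fallback so the sketch also runs on Python 3

# General categories to skip (an assumed choice, adjust as needed):
# Cs = surrogates, Cn = unassigned and noncharacters,
# Co = private use, Cc = control characters.
EXCLUDED_CATEGORIES = ('Cs', 'Cn', 'Co', 'Cc')

def is_usable_code_point(cp):
    """Return True if cp falls outside the excluded categories."""
    return unicodedata.category(unichr(cp)) not in EXCLUDED_CATEGORIES

def get_utf8_char():
    # Retry until a code point passes the category filter.
    while True:
        cp = random.randint(0x100, 0xFFFF)
        if is_usable_code_point(cp):
            return unichr(cp)

def utf8_gen(length):
    return u''.join(get_utf8_char() for _ in range(length))
```

Calling `utf8_gen(10000).encode('utf-8')` should then give a byte string that decodes back to the same text. Encoding explicitly before printing also avoids depending on the terminal's default encoding in Python 2. Note that which code points count as assigned depends on the Unicode database version bundled with the interpreter, so a font may still lack glyphs for some of the results.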
