I am trying to generate a random string of valid UTF-8 characters using Python 2.

    import random

    def validate_mychar(cd_rng):
        return unichr(cd_rng)

    def get_utf8_char():
        while True:
            cd_rng = random.randint(0x100, 0xFFFF)
            if validate_mychar(cd_rng):
                return unichr(cd_rng)

    def utf8_gen(length):
        return u''.join(get_utf8_char() for i in xrange(length))

    print(utf8_gen(10000))

However, I get strange 'invalid character' errors when I use the generated characters. Why does this happen, and can anybody help with working code?
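
One likely cause, offered as a guess since the exact error is not shown: the range 0x100 to 0xFFFF contains the surrogate code points U+D800 to U+DFFF, which cannot appear in valid UTF-8, as well as unassigned code points and noncharacters such as U+FFFE and U+FFFF. The validation step also never rejects anything, because `unichr()` always returns a non-empty (truthy) string. A minimal sketch of a stricter filter follows. It assumes "valid" means "assigned and not a control, format, surrogate, or private-use character", and `is_usable_char` is a hypothetical helper name, not something from the original code. The `try`/`except` shim is only there so the same sketch also runs on Python 3.

```python
import random
import unicodedata

try:
    unichr              # Python 2 built-in
except NameError:       # Python 3: chr() and range() play the same roles
    unichr = chr
    xrange = range


def is_usable_char(code_point):
    """Hypothetical helper: True for assigned, non-control code points."""
    # General categories starting with 'C' are control (Cc), format (Cf),
    # surrogate (Cs), private use (Co) and unassigned (Cn).
    category = unicodedata.category(unichr(code_point))
    return not category.startswith('C')


def get_utf8_char():
    # Keep drawing until the filter accepts a code point.
    while True:
        code_point = random.randint(0x100, 0xFFFF)
        if is_usable_char(code_point):
            return unichr(code_point)


def utf8_gen(length):
    return u''.join(get_utf8_char() for _ in xrange(length))


# Encode explicitly to get UTF-8 bytes instead of relying on the
# terminal's default encoding.
encoded = utf8_gen(10000).encode('utf-8')
```

Writing `encoded` to a file or to stdout avoids the `UnicodeEncodeError` that `print` can raise in Python 2 when the terminal encoding is not UTF-8. Which code points count as assigned depends on the Unicode version bundled with the interpreter, so the accepted set may differ slightly between Python versions.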