How to convert JER formatted JSON file to UPER using asn1tools?

I am writing a Python script that takes a JSON file encoded in JER and converts it to UPER, but I couldn't find a direct way to do this using asn1tools.

ASN File: schema.asn

Schema DEFINITIONS ::= BEGIN

    User ::= SEQUENCE {
        firstName IA5String,
        lastName  IA5String,
        id        ID
    }

    ID ::= CHOICE {
        userName  IA5String,
        userEmail IA5String
    }

END

JSON File: user.json

{
  "firstName": "John",
  "lastName": "Doe",
  "id": ["userName", "johndoe"]
}

Python File: script.py

import json
import asn1tools

schema = asn1tools.compile_files('schema.asn', codec='uper')

with open('user.json') as jer:
    schema.encode('User', json.load(jer))

I am getting the following error:

Traceback (most recent call last):
  File "/home/bijesh/playground/asn1_decoder/temp/script.py", line 7, in <module>
    schema.encode('User', json.load(jer))
  File "/home/bijesh/anaconda3/lib/python3.9/site-packages/asn1tools/compiler.py", line 137, in encode
    type_.check_types(data)
  File "/home/bijesh/anaconda3/lib/python3.9/site-packages/asn1tools/codecs/compiler.py", line 102, in check_types
    return self.type_checker.encode(data)
  File "/home/bijesh/anaconda3/lib/python3.9/site-packages/asn1tools/codecs/type_checker.py", line 311, in encode
    raise e
  File "/home/bijesh/anaconda3/lib/python3.9/site-packages/asn1tools/codecs/type_checker.py", line 307, in encode
    self._type.encode(data)
  File "/home/bijesh/anaconda3/lib/python3.9/site-packages/asn1tools/codecs/type_checker.py", line 142, in encode
    self.encode_members(data)
  File "/home/bijesh/anaconda3/lib/python3.9/site-packages/asn1tools/codecs/type_checker.py", line 154, in encode_members
    raise e
  File "/home/bijesh/anaconda3/lib/python3.9/site-packages/asn1tools/codecs/type_checker.py", line 150, in encode_members
    member.encode(data[name])
  File "/home/bijesh/anaconda3/lib/python3.9/site-packages/asn1tools/codecs/type_checker.py", line 224, in encode
    raise EncodeError(
asn1tools.codecs.EncodeError: User.id: Expected data of type tuple(str, object), but got ['userName', 'johndoe'].

There are 1 best solutions below

bazza On

This script first

  • Generates some JER encoded data to work with (more later)
  • Decodes the JER back to a Python object
  • Re-encodes that Python object as uPER

The important thing is to ensure that the object you're trying to serialise in uPER is of the type expected by the compiled schema. You can't just poke any old object into the encoder and expect it to work. What this script shows is the object type that the encoder is expecting: for a CHOICE, a tuple of (alternative name, value). json.load(jer) does not produce an object of that type (it returns a list for id), which is why you're getting an error.

That's why I'm focusing on whether your user.json file really is from an ASN.1 JER encoder, because it doesn't look like it is. It may be perfectly valid JSON in itself, but it's not the style of JSON that JER defines. One option (see below, a fair way down) is to have a script convert it to a JSON representation that the compiled schema can understand.
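As a rough illustration of the type mismatch, the sketch below turns the list that json.load produces for id into the tuple named in the error message. The helper name choice_fields_to_tuples and its choice_fields parameter are made up for this example and are not part of asn1tools, and the sketch assumes the CHOICE members sit at the top level of the dictionary. It only fixes the Python type; it does none of the schema checking that a JER decode would do.

```python
import json


def choice_fields_to_tuples(data, choice_fields):
    """Return a copy of data with the named CHOICE fields converted
    from [name, value] lists to (name, value) tuples.

    Hypothetical helper: choice_fields has to be listed by hand,
    because plain JSON carries no ASN.1 type information.
    """
    fixed = dict(data)
    for field in choice_fields:
        # Unpacking raises ValueError unless there are exactly two items.
        name, value = fixed[field]
        fixed[field] = (name, value)
    return fixed


raw = json.loads(
    '{"firstName": "John", "lastName": "Doe", "id": ["userName", "johndoe"]}'
)
user = choice_fields_to_tuples(raw, ["id"])
print(user)
```

The resulting dictionary has the same shape as the userObject printed by the script further down, which is the shape the uPER encoder accepted there.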

BTW, decoding from valid JER JSON and re-encoding as uPER is the recommended way, in case your real schema (is this an example?) has constraints in it. If you use the compiled schema to decode the JER JSON, the decoder should check and enforce the data against the constraints in the schema. Using json.load won't do that, meaning invalid data might slip through.

Script:

import json
import asn1tools

schemauper = asn1tools.compile_files('schema.asn', codec='uper')
schemajer = asn1tools.compile_files('schema.asn', codec='jer')

#Write some JER encoded data - I couldn't make it read your JER file
JERWireData = schemajer.encode("User", { 'firstName': "John", 'lastName': "Doe", 'id': ("userName", "johndoe") } )
print(JERWireData)

#Decode the JER data
userObject = schemajer.decode("User", JERWireData)

#userObject is an object
print(userObject)

#re-encode the object as uPER
uPERWireData = schemauper.encode("User", userObject)
#write the uPER bytes in one call; the with block closes the file
with open("output.uper","wb") as file:
    file.write(uPERWireData)

Gives me the following output:

b'{"firstName":"John","lastName":"Doe","id":{"userName":"johndoe"}}'
{'firstName': 'John', 'lastName': 'Doe', 'id': ('userName', 'johndoe')}

and output.uper contains

04 95 bf 46 e0 38 9b f2 81 f5 6f d1 bb 26 fc a0

Why is asn1tools' JSON Different to Your JSON?

Tidying up the first line of the output to compare the JSON generated by asn1tools with your json file:

{
    "firstName":"John",
    "lastName":"Doe",
    "id":
    {
        "userName":"johndoe"
    }
}

The JER encoding for id is very different in your JSON file:

"id": ["userName", "johndoe"]

There are two possibilities for how JER can encode a CHOICE, depending on whether the schema has encoding instructions in it. From the standards doc Section 31, the difference is down to whether UNWRAPPED encoding is in force. Without UNWRAPPED, clause 31.3 applies:

31.3 Wrapped encoding

31.3.1

The encoding of a value of a choice type not having a final UNWRAPPED encoding instruction shall be a JSON object having exactly one member. Each member of a JSON object has a name, which is a JSON string (see ECMA-404, clause 6).

31.3.2

The only member of the JSON object shall be as follows:

a) if the type of the chosen alternative has a final NAME encoding instruction, the Unicode character string denoted by the name of the member of the JSON object shall be the name produced by the instruction; otherwise, the Unicode character string denoted by the name of the member shall be the identifier of the chosen alternative;

b) the value of the member shall be the JER encoding of the value of the chosen alternative.

NOTE – The use of quotation marks around the identifier is required.

Make of that what you can! Clause 31.2 is illuminating, because it talks about being able to omit { and }, but does not refer to the use of [ and ] instead. The JER encoding rules have evolved since first considered; it's possible that, if your JER JSON data originates from elsewhere, they're using a previous version of the JER standard. I've not checked whether previous versions use [ and ].
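To make the wrapped form concrete, here is a small sketch of the mapping between a single-member JSON object and the (name, value) tuple that asn1tools printed above. The function names wrap_choice and unwrap_choice are invented for illustration; asn1tools does this inside its JER codec, and this is not its implementation. The sketch assumes the chosen value is already JSON-compatible (a string, as in this schema).

```python
import json


def wrap_choice(choice):
    """(name, value) tuple -> wrapped JSON object with one member."""
    name, value = choice
    return {name: value}


def unwrap_choice(obj):
    """Wrapped JSON object -> (name, value) tuple.

    Rejects anything that is not a JSON object with exactly one
    member, which is what clause 31.3.1 requires.
    """
    if not isinstance(obj, dict) or len(obj) != 1:
        raise ValueError("expected a JSON object with exactly one member")
    (name, value), = obj.items()
    return (name, value)


print(json.dumps(wrap_choice(("userName", "johndoe"))))
print(unwrap_choice(json.loads('{"userName": "johndoe"}')))

# The array form from user.json is rejected by the same check.
try:
    unwrap_choice(json.loads('["userName", "johndoe"]'))
except ValueError as error:
    print("rejected:", error)
```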

Nonetheless, trying to decode your JSON file with the JER codec instead:

userJsonFile = open("user.json","rb")
userJson = userJsonFile.read()
print(userJson)
userObject = schemajer.decode("User", userJson)

results in

Traceback (most recent call last):
  File "script.py", line 28, in <module>
    userObject = schemajer.decode("User", userJson)
  File "/<snip>asn1tools/compiler.py", line 167, in decode
    decoded = type_.decode(data)
  File "/<snip>asn1tools/codecs/jer.py", line 523, in decode
    return self._type.decode(json.loads(data.decode('utf-8')))
  File "/<snip>asn1tools/codecs/jer.py", line 86, in decode
    value = member.decode(data[name])
  File "/<snip>asn1tools/codecs/jer.py", line 361, in decode
    name, value = list(data.items())[0]
AttributeError: 'list' object has no attribute 'items'

Regardless, given that the only difference is whether the value of id is an array in square brackets or an object in curly braces, the simplest solution is likely to be to read the data from the file, replace [ with { and ] with }, change the , after "userName" to a :, and then JER decode it and uPER encode it:

import json
import asn1tools
schemajer = asn1tools.compile_files('schema.asn', codec='jer')
schemauper = asn1tools.compile_files('schema.asn', codec='uper')

#Read the raw JSON text
with open("user.json","rb") as userJsonFile:
    userJsonStr = userJsonFile.read().decode("utf8")

#Alter the JSON to fit the JER encoding standard
#(blanket replacement: only safe while id is the only array in the file
# and no string value contains [ or ])
userJsonStr = userJsonStr.replace("[","{")
userJsonStr = userJsonStr.replace("]","}")
userJsonStr = userJsonStr.replace("userName\",","userName\":")
userJson = userJsonStr.encode("utf8")

#Decode the JSON
userObject = schemajer.decode("User", userJson)

#Encode the object as uPER and write the bytes to file
uPERWireData = schemauper.encode("User", userObject)
with open("output.uper","wb") as file:
    file.write(uPERWireData)
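A less brittle variant of the text replacement is to parse the file as JSON, rewrite only the id member, and serialise it again. This is a sketch: the function name array_choice_to_wrapped is hypothetical, and it assumes id is the only CHOICE and always arrives as a two-element array. The resulting bytes could then be passed to schemajer.decode("User", ...) in place of userJson above.

```python
import json


def array_choice_to_wrapped(json_bytes, choice_field="id"):
    """Rewrite {"id": [name, value]} as {"id": {name: value}}.

    Hypothetical helper; only the named top-level member is touched,
    so brackets inside other string values are left alone.
    """
    doc = json.loads(json_bytes.decode("utf8"))
    choice = doc[choice_field]
    if isinstance(choice, list):
        if len(choice) != 2:
            raise ValueError("expected [name, value] for " + choice_field)
        name, value = choice
        doc[choice_field] = {name: value}
    # Compact separators give the same layout asn1tools printed earlier.
    return json.dumps(doc, separators=(",", ":")).encode("utf8")


user_json = b'{"firstName": "John", "lastName": "Doe", "id": ["userName", "johndoe"]}'
fixed_json = array_choice_to_wrapped(user_json)
print(fixed_json)
```

For the user.json shown in the question this yields the same bytes as the JERWireData printed by the first script.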

ASN.1 Playground

Using the ASN.1 Playground at asn.io, giving it the schema:

Schema DEFINITIONS AUTOMATIC TAGS ::= BEGIN

    User ::= SEQUENCE {
        firstName IA5String,
        lastName  IA5String,
        id        ID
    }

    ID ::= CHOICE {
        userName  IA5String,
        userEmail IA5String
    }

END

and the ASN.1 value (in ASN.1 value notation)

value User ::= {
  firstName "John",
  lastName "Doe",
  id userName:"johndoe"
}

produces the same uPER output:

04 95 bf 46 e0 38 9b f2 81 f5 6f d1 bb 26 fc a0

So, my script and the ASN.1 playground are producing the same uPER output.
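As a further cross-check, the bytes can be reproduced by hand. The sketch below is neither asn1tools nor a general uPER encoder: it hard-codes this particular schema (no constraints, no extension markers) and assumes every string is shorter than 128 characters, so that each IA5String is an 8-bit length followed by 7 bits per character and the two-alternative CHOICE is a single index bit. The function names are made up for the example.

```python
ID_ALTERNATIVES = ["userName", "userEmail"]  # order as declared in the schema


def ia5string_bits(text):
    """Bit string for an unconstrained IA5String: 8-bit length, 7 bits per char."""
    if len(text) >= 128:
        raise ValueError("sketch only handles strings shorter than 128 characters")
    if any(ord(ch) > 127 for ch in text):
        raise ValueError("not an IA5String")
    length = format(len(text), "08b")
    chars = "".join(format(ord(ch), "07b") for ch in text)
    return length + chars


def encode_user(user):
    """Hand-rolled uPER encoding of a User value for this schema only."""
    name, value = user["id"]
    bits = ia5string_bits(user["firstName"])
    bits += ia5string_bits(user["lastName"])
    bits += format(ID_ALTERNATIVES.index(name), "b")  # 1-bit CHOICE index
    bits += ia5string_bits(value)
    bits += "0" * (-len(bits) % 8)  # pad the final byte with zero bits
    return int(bits, 2).to_bytes(len(bits) // 8, byteorder="big")


encoded = encode_user(
    {"firstName": "John", "lastName": "Doe", "id": ("userName", "johndoe")}
)
print(encoded.hex())
```

Under those assumptions the sketch yields the same 16 bytes as the script and the playground.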