Parsing JSON failed


I am trying to parse this data (specifically, from the Viper malware analysis framework API). I am having a hard time figuring out the best way to do this. Ideally, I would just do:

jsonObject.get("SSdeep")

... and I would get the value.

Unfortunately, I don't think this is valid JSON. Without editing the source of the project, how can I turn it into proper JSON or otherwise get at these values easily?

[{
'data': {
    'header': ['Key', 'Value'],
    'rows': [
        ['Name', u 'splwow64.exe'],
        ['Tags', ''],
        ['Path', '/home/ubuntu/viper-master/projects/../binaries/8/e/e/5/8ee5b228bd78781aa4e6b2e15e965e24d21f791d35b1eccebd160693ba781781'],
        ['Size', 125952],
        ['Type', 'PE32+ executable (GUI) x86-64, for MS Windows'],
        ['Mime', 'application/x-dosexec'],
        ['MD5', '4b1d2cba1367a7b99d51b1295b3a1d57'],
        ['SHA1', 'caf8382df0dcb6e9fb51a5e277685b540632bf18'],
        ['SHA256', '8ee5b228bd78781aa4e6b2e15e965e24d21f791d35b1eccebd160693ba781781'],
        ['SHA512', '709ca98bfc0379648bd686148853116cabc0b13d89492c8a0fa2596e50f7e4d384e5c359081a90f893d8d250cfa537193cbaa1c53186f29c0b6dedeb50d53d4d'],
        ['SSdeep', ''],
        ['CRC32', '7106095E']
    ]
},
'type': 'table'
}]

Edit 1: Thank you! So I have tried this:

        jsonObject = r.content.replace("'", "\"")
        jsonObject = jsonObject.replace(" u", "")

and the output I have now is:

"[{"data": {"header": ["Key", "Value"], "rows": [["Name","splwow64.exe"], ["Tags", ""], ["Path", "/home/ubuntu/viper-master/projects/../binaries/8/e/e/5/8ee5b228bd78781aa4e6b2e15e965e24d21f791d35b1eccebd160693ba781781"], ["Size", 125952], ["Type", "PE32+ executable (GUI) x86-64, for MS Windows"], ["Mime", "application/x-dosexec"], ["MD5", "4b1d2cba1367a7b99d51b1295b3a1d57"], ["SHA1", "caf8382df0dcb6e9fb51a5e277685b540632bf18"], ["SHA256", "8ee5b228bd78781aa4e6b2e15e965e24d21f791d35b1eccebd160693ba781781"], ["SHA512", "709ca98bfc0379648bd686148853116cabc0b13d89492c8a0fa2596e50f7e4d384e5c359081a90f893d8d250cfa537193cbaa1c53186f29c0b6dedeb50d53d4d"], ["SSdeep", ""], ["CRC32", "7106095E"]]}, "type": "table"}]"

and now I'm getting this error:

  File "/usr/lib/python2.7/json/decoder.py", line 369, in decode
    raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 1 column 5 - line 1 column 716 (char 4 - 715)

Note: I'd really rather not do the find-and-replaces like that, especially the " u" one, as it could have unintended consequences.

Edit 2: Figured it out! Thank you everyone!

Here's what I ended up doing, as someone mentioned the original text from the server was a "list of dicts":

        r = requests.post(url, data=data)  # Make the server request
        listObject = r.content  # Grab the content (don't really need this line)
        listObject = listObject[1:-1]  # Strip the quotes wrapping the response body
        listObject = ast.literal_eval(listObject)  # Safely evaluate the text as a Python literal (a list)
        dictObject = listObject[0]  # My dict!

There are 3 best solutions below

1
On BEST ANSWER

JSON requires double quotes (") around strings. From the JSON standard:

A value can be a string in double quotes, or a number, or true or false or null, or an object or an array.

So you would need to replace all the single quotes with double quotes:

data.replace("'", '"')

There is also a spurious u in the Name field that will need to be removed.
However, if the data is valid Python and you trust it, you could try evaluating it. This worked with your original data (without the space after the u):

result = eval(data)

Or more safely:

result = ast.literal_eval(data)
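
To get from there to the kind of lookup the question asks for, here is a minimal sketch. It assumes the response body has the shape shown in the question (shortened here, and without the space after the u) and that each row is a two-item [key, value] list; the names raw, rows and info are only illustrative. It is written so that it should also run on Python 3.

```python
import ast

# Shortened sample of the server response. This is Python literal syntax,
# not JSON: single quotes and a u'' string prefix.
raw = (
    "[{'data': {'header': ['Key', 'Value'], 'rows': ["
    "['Name', u'splwow64.exe'], ['Size', 125952], "
    "['SSdeep', ''], ['CRC32', '7106095E']]}, 'type': 'table'}]"
)

parsed = ast.literal_eval(raw)      # a list containing one dict
rows = parsed[0]['data']['rows']    # list of [key, value] pairs
info = dict(rows)                   # turn the pairs into a lookup table

print(info.get('SSdeep'))           # empty string in this sample
print(info['CRC32'])
```

Building a dict from the rows only works while every row has exactly two items, so treat that as an assumption about the API rather than a guarantee.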
3
On

Now you appear to have quotes "wrapping" the entire thing, which causes the parser to treat the opening brackets as part of a string. Remove the quotes at the start and end of the JSON.

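Also note that a JSON document starts with either '[' or '{' (usually '{'). Starting with '[{' is still valid, since it is simply an array whose first element is an object, so the wrapping quotes are the real problem here.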
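
As a rough illustration of why the wrapping quotes produce "Extra data", here is a small sketch using a made-up, much shorter payload than the one in the question. The parser reads the leading `"[{"` as a complete JSON string and then complains about everything after it.

```python
import json

payload = '[{"type": "table"}]'     # hypothetical, shortened payload
wrapped = '"' + payload + '"'       # same payload with quotes around it

try:
    json.loads(wrapped)
    error = None
except ValueError as exc:           # JSONDecodeError is a ValueError on Python 3
    error = str(exc)

print(error)                        # message starts with "Extra data"

# Strip one pair of wrapping quotes, if present, before parsing.
if wrapped.startswith('"') and wrapped.endswith('"'):
    unwrapped = wrapped[1:-1]
else:
    unwrapped = wrapped

result = json.loads(unwrapped)
print(result[0]['type'])
```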

0
On

There is no need to use eval(). Just replace the malformed characters (escaping the quotes with a backslash) and parse the result with json:

resp = r.content.replace(" u \'", " \'").replace("\'", "\"")

json.loads(resp)
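
One caveat with the replace approach, sketched below with a hypothetical row that is not in the original data: if any value contains an apostrophe, swapping every single quote for a double quote yields broken JSON, whereas ast.literal_eval reads the same text without trouble.

```python
import ast
import json

# Hypothetical response in which a file name contains an apostrophe.
raw = """[{'data': {'rows': [['Name', "it's.exe"]]}, 'type': 'table'}]"""

try:
    json.loads(raw.replace("'", '"'))
    replace_ok = True
except ValueError:
    # The apostrophe became a stray double quote inside the string.
    replace_ok = False

parsed = ast.literal_eval(raw)
name = dict(parsed[0]['data']['rows'])['Name']

print(replace_ok)
print(name)
```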