using python to count unique values and scale by frequency

759 Views Asked by At

I've got myself a dataset that looks like this:

[
    {'A':'0'},
    {'B':'0'}, 
    {'C':'1'}
]

I'd like to transmogrify it into a dataset that looks like this:

[
    {'0':'2'},
    {'1':'1'}
]

Essentially the task is counting the values,

and for each unique value

creating a new entry in a data-structure

for each one of those unique entries (once again, based on the values)

to increment the corresponding entry,

Basically the task is tallying up all of the times we've seen unique values and magnifying that by the number of times the value has been expressed.

What's the most efficient, effective way to do that in python?

I've been experimenting with counter but thus far without much success, as my base data structure seems to be incompatible, the codebase looks like this:

dict_hash_gas = list()
for line in inpt:
    resource = json.loads(line)
    dict_hash_gas.append({resource['first']:resource['second']})

and the dataset like this:

{"first":"A","second":"0","third":"2"} 
{"first":"B","second":"0","third":"2"} 
{"first":"C","second":"1","third":"2"} 
2

There are 2 best solutions below

7
On BEST ANSWER

You can use a Counter quite easily:

>>> data = [
...     {'A':'0'},
...     {'B':'0'},
...     {'C':'1'}
... ]
>>> import collections
>>> counts = collections.Counter(v for d in data for v in d.values())
>>> counts
Counter({'0': 2, '1': 1})

Now, to get the final list you want, simply:

>>> [{k:v} for k,v in counts.items()]
[{'0': 2}, {'1': 1}]

Although, I don't know why you would want such a list, I can only presume some REST-based API is expecting some JSON in that format...

2
On
result = dict()

for name, value in input.items():
    result.update({value: result.get(value, 0) + 1})