Consolidate tuple and render as histogram using Python

At the moment I have a list of tuples sorted by the second element in each tuple:

[('0xf2b5b8fb173e371cbb427625b0339f6023f8b4ec3701b7a5c691fa9cef9daf63',
  '121000'),
 ('0xf8f2a397b0f7bb1ff212b6bcc57e4a56ce3e27eb9f5839fef3e193c0252fab26',
  '121000'),
 ('0x8b0fe2b7727664a14406e7377732caed94315b026b37577e2d9d258253067553',
  '21000'),
 ('0x0abe75e40a954d4d355e25e4498f3580e7d029769897d4187c323080a0be0fdd',
  '21000'),
 ('0x8adfe7fc3cf0eb34bb56c59fa3dc4fdd3ec3f3514c0100fef800f065219b7707',
  '40000'),
 ('0x244b29b60c696f4ab07c36342344fe6116890f8056b4abc9f734f7a197c93341',
  '50000'),
 ('0x22c2b6490900b21d67ca56066e127fa57c0af973b5d166ca1a4bf52fcb6cf81c',
  '90000'),
 ('0x8570106b0385caf729a17593326db1afe0d75e3f8c6daef25cd4a0499a873a6f',
  '90000')]

What I'd like to do is consolidate this list so that the second element of each tuple becomes a key and the number of times it appears becomes the value, like this:

'90000':  2
'50000':  1
'40000':  1
'21000':  2
'121000': 2

Ultimately I'd like to render this as a histogram, but I'm not sure how to effect this consolidation operation and what data structure would be most suitable for subsequently generating a corresponding histogram.

Best answer

First, extract the second element of each tuple into a flat list:

>>> my_list = [('0xf2b5b8fb173e371cbb427625b0339f6023f8b4ec3701b7a5c691fa9cef9daf63',
          '121000'),
         ('0xf8f2a397b0f7bb1ff212b6bcc57e4a56ce3e27eb9f5839fef3e193c0252fab26',
          '121000'),
         ('0x8b0fe2b7727664a14406e7377732caed94315b026b37577e2d9d258253067553',
          '21000'),
         ('0x0abe75e40a954d4d355e25e4498f3580e7d029769897d4187c323080a0be0fdd',
          '21000'),
         ('0x8adfe7fc3cf0eb34bb56c59fa3dc4fdd3ec3f3514c0100fef800f065219b7707',
          '40000'),
         ('0x244b29b60c696f4ab07c36342344fe6116890f8056b4abc9f734f7a197c93341',
          '50000'),
         ('0x22c2b6490900b21d67ca56066e127fa57c0af973b5d166ca1a4bf52fcb6cf81c',
          '90000'),
         ('0x8570106b0385caf729a17593326db1afe0d75e3f8c6daef25cd4a0499a873a6f',
          '90000')]
>>> flat_list = [x[1] for x in my_list]

Then use collections.Counter to count how many times each value appears:

>>> from collections import Counter
>>> Counter(flat_list)
Counter({'121000': 2, '21000': 2, '90000': 2, '40000': 1, '50000': 1})
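
The two steps can also be combined by passing a generator expression straight to Counter. This is a minimal sketch, assuming shortened placeholder hashes in place of the real ones; the names `pairs` and `counts` are illustrative only:

```python
from collections import Counter

# Shortened placeholder hashes (an assumption for brevity);
# only the second element of each tuple is counted.
pairs = [
    ("0xaa", "121000"),
    ("0xab", "121000"),
    ("0xac", "21000"),
    ("0xad", "21000"),
    ("0xae", "40000"),
    ("0xaf", "50000"),
    ("0xb0", "90000"),
    ("0xb1", "90000"),
]

# Counter accepts any iterable, so no intermediate list is needed.
counts = Counter(value for _, value in pairs)

# most_common() yields (value, count) pairs from most to least frequent.
for value, count in counts.most_common():
    print(f"{value}: {count}")
```

Looking up a key that never appeared, such as counts['1'], returns 0 instead of raising KeyError, which is convenient when filling in empty histogram bins.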

EDIT

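Since you wanted a threshold, add a condition to the list comprehension. The values are strings, so convert them with int() before comparing (here, only values greater than 1000 are kept):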

flat_list = [x[1] for x in my_list if int(x[1]) > 1000]
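
The threshold can also be made a parameter. The sketch below is one possible way to wrap it; the helper name `count_above` and the sample data are hypothetical, not part of the original question:

```python
from collections import Counter

def count_above(pairs, threshold):
    """Count second elements whose integer value is greater than threshold."""
    # The values are strings, so convert to int before comparing.
    return Counter(value for _, value in pairs if int(value) > threshold)

# Hypothetical sample data with placeholder hashes.
sample = [("0xaa", "121000"), ("0xab", "21000"), ("0xac", "21000"), ("0xad", "900")]

filtered = count_above(sample, 1000)
print(filtered)
```

Comparing the raw strings instead (for example x[1] > '1000') would compare them character by character rather than numerically, which is why the int() conversion matters.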

P.S.

Counter is a dict subclass, so it supports everything a regular dict does. If you need a plain dict, you can convert it with dict(counter_result).
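
As for the histogram itself, one simple option is a text rendering built directly from the Counter. This is only a sketch using the standard library; the function name `text_histogram` is made up for illustration, and the counts are the ones shown above:

```python
from collections import Counter

counts = Counter({"121000": 2, "21000": 2, "90000": 2, "40000": 1, "50000": 1})

def text_histogram(counter):
    """Return one line per key, sorted numerically, with one '#' per occurrence."""
    # Right-align keys to the longest one so the bars line up.
    width = max((len(key) for key in counter), default=0)
    lines = []
    for key in sorted(counter, key=int):
        lines.append(f"{key.rjust(width)} | {'#' * counter[key]}")
    return "\n".join(lines)

print(text_histogram(counts))

# A plain dict copy, in case some other tool does not accept a Counter.
plain = dict(counts)
```

For a graphical chart, the same keys and counts could be handed to a plotting library instead, for example as the x values and heights of a matplotlib bar chart.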