How can I calculate the average of a list of tuples in python?

I have a list of tuples in the format:

[(security, price paid, number of shares purchased)....]

[('MSFT', '$39.458', '1,000'), ('AAPL', '$638.416', '200'), ('FOSL', '$52.033', '1,000'), ('OCZ', '$5.26', '34,480'), ('OCZ', '$5.1571', '5,300')]

I want to consolidate the data so that each security is listed only once:

[(Name of Security, Average Price Paid, Number of shares owned), ...]

4

There are 4 best solutions below

0
On BEST ANSWER

I used a dictionary as the output, mapping each security to [average price, total shares].

lis=[('MSFT', '$39.458', '1,000'), ('AAPL', '$638.416', '200'), ('FOSL', '$52.033', '1,000'), ('OCZ', '$5.26', '34,480'), ('OCZ', '$5.1571', '5,300')]

dic={}     # security -> [average price, total shares]
counts={}  # security -> number of purchases seen so far
for x in lis:
    price=float(x[1].strip('$'))        # '$39.458' -> 39.458
    nos=int("".join(x[2].split(',')))   # '1,000' -> 1000
    if x[0] not in dic:
        dic[x[0]]=[price,nos]
        counts[x[0]]=1
    else:
        counts[x[0]]+=1
        n=counts[x[0]]
        dic[x[0]][1]+=nos
        # running (unweighted) mean; a plain (old+price)/2 is only right for two purchases
        dic[x[0]][0]=(dic[x[0]][0]*(n-1)+price)/n
print(dic)

output:

{'AAPL': [638.416, 200], 'OCZ': [5.20855, 39780], 'FOSL': [52.033, 1000], 'MSFT': [39.458, 1000]}
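
The output above averages the two OCZ prices without regard to how many shares were bought at each price. If "average price paid" is meant as total cost divided by total shares, a minimal sketch could look like the following. The function name consolidate is made up for illustration, and it assumes every purchase has a positive share count.

```python
def consolidate(lots):
    """Return [(security, share-weighted average price, total shares), ...]."""
    totals = {}  # security -> [total cost, total shares]
    for name, price_str, shares_str in lots:
        price = float(price_str.strip('$'))        # '$5.26' -> 5.26
        shares = int(shares_str.replace(',', ''))  # '34,480' -> 34480
        entry = totals.setdefault(name, [0.0, 0])
        entry[0] += price * shares
        entry[1] += shares
    # assumption: total shares is never zero, so the division is safe
    return [(name, cost / shares, shares)
            for name, (cost, shares) in totals.items()]


lis = [('MSFT', '$39.458', '1,000'), ('AAPL', '$638.416', '200'),
       ('FOSL', '$52.033', '1,000'), ('OCZ', '$5.26', '34,480'),
       ('OCZ', '$5.1571', '5,300')]
result = consolidate(lis)
```

With this weighting the OCZ average comes out closer to 5.26 than to 5.1571, because most of the shares were bought at the higher price.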
0
On
>>> lis
[('MSFT', '$39.458', '1,000'), ('AAPL', '$638.416', '200'), ('FOSL', '$52.033', '1,000'), ('OCZ', '$5.26', '34,480'), ('OCZ', '$5.1571', '5,300')]
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> for i in lis:
...    amt = float(i[1].strip('$'))
...    num = int(i[2].replace(",", ""))
...    d[i[0]].append((amt,num))
... 
>>> for name, lots in d.items():   # items() replaces Python 2's iteritems()
...   average_price = sum(s[0] for s in lots)/len(lots)
...   total_shares = sum(s[1] for s in lots)
...   print((name, average_price, total_shares))
... 
('MSFT', 39.458, 1000)
('AAPL', 638.416, 200)
('FOSL', 52.033, 1000)
('OCZ', 5.20855, 39780)
0
On

Here you go:

the_list = [('msft', '$31', 5), ('msft', '$32', 10), ('aapl', '$100', 1)]
# drop the leading '$' and convert price and share count to numbers
clean_list = map(lambda x: (x[0], float(x[1][1:]), int(x[2])), the_list)
out = {}

for name, price, shares in clean_list:
    if name not in out:
        # store the total paid (price * shares), not the bare price
        out[name] = [price * shares, shares]
    else:
        out[name][0] += price * shares
        out[name][1] += shares

# put the output in the requested format
# not forgetting to calculate avg price paid
# out contains total price paid and total # shares

nice_out = [(name, "$%0.2f" % (out[name][0] / out[name][1]), out[name][1])
            for name in out]

print(nice_out)
>>> [('msft', '$31.67', 15), ('aapl', '$100.00', 1)]
4
On

It's not very clear what you're trying to do. Some example code would help, along with some information of what you've tried. Even if your approach is dead wrong, it'll give us a vague idea of what you're aiming for.

In the meantime, perhaps numpy's numpy.mean function is appropriate for your problem? I would suggest transforming your list of tuples into a numpy array and then applying the mean function on a slice of said array.

That said, it works on any list-like data structure, and you can specify the axis along which to compute the average.

http://docs.scipy.org/doc/numpy/reference/generated/numpy.mean.html
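
As a rough illustration of the axis argument mentioned above, here is a small sketch. The numbers are made up and stand in for (price, shares) rows after the strings have been converted.

```python
import numpy as np

# made-up (price, shares) rows, purely for illustration
data = np.array([[10.0, 2.0],
                 [20.0, 4.0],
                 [30.0, 6.0]])

column_means = np.mean(data, axis=0)  # one mean per column
row_means = np.mean(data, axis=1)     # one mean per row
overall = np.mean(data)               # mean over every element
```

Here axis=0 collapses the rows and gives one value per column, which is the direction you would want for an average price column.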

EDIT:

From what I've gathered, your list of tuples organizes data in the following manner:

(name, dollar amount, weight)

I'd start by using numpy to transform your list of tuples into an array. From there, find the unique values in the first column (the names):

import numpy as np
# with mixed types, numpy stores every field as a string
a = np.array([('tag', 23.00, 5), ('tag2', 25.00, 10)])
unique_tags = np.unique(a[:,0])  # note the slicing: first column, all rows

Now calculate the weighted mean for each tag

meandic = {}
for element in unique_tags:
   rows = a[a[:,0] == element]  # identify which lines are tagged with element
   prices = rows[:,1].astype(float)
   weights = rows[:,2].astype(float)
   meandic[element] = np.average(prices, weights=weights)  # mean price weighted by column 3

Please note that this code is untested. I may have gotten small details wrong. If you can't figure something out, just leave a comment and I'll gladly correct my mistake. You'll have to remove '$' and convert strings to floats where necessary.