I have expensive jobs that are well suited to the map-reduce model (in short, they aggregate a few hundred rankings, each previously computed by a time-consuming algorithm).
I want to parallelize these jobs across a cluster (not just via multiprocessing), and have narrowed the choice down to two frameworks: Celery and Disco. Celery does not support map-reduce out of the box; the "map" part is easily done with TaskSets, but how do you implement the "reduce" part efficiently?
(My problem with Disco is that it does not run on Windows, and I have already set up Celery for another part of the program, so pulling in a second framework just for map-reduce seems inelegant.)
Take a look at the following blog post:
http://mikecvet.wordpress.com/2010/07/02/parallel-mapreduce-in-python/