I'm playing around with Storm, and I'm wondering where Storm specifies (if possible) the (tumbling/sliding) window size upon an aggregation. E.g. If we want to find the trending topics for the previous hour on Twitter. How do we specify that a bolt should return results for every hour? Is this done programatically inside each bolt? Or is it some way to specify a "window" ?
(Twitter) Storm's Window On Aggregation
4.6k Views Asked by gronnbeck At
2
Disclaimer: I wrote the Trending Topics with Storm article referenced by gakhov in his answer above.
I'd say the best practice is to use the so-called tick tuples in Storm 0.8+. With these you can configure your own spouts/bolts to be notified at certain time intervals (say, every ten seconds or every minute).
Here's a simple example that configures the component in question to receive tick tuples every ten seconds:
You can then use a conditional switch in your spout/bolt's
execute()
method to distinguish "normal" incoming tuples from the special tick tuples. For instance:Again, I wrote a pretty detailed blog post about doing this in Storm a few days ago as gakhov pointed out (shameless plug!).