I'm using StreamInsight 2.1 and running into unexpected performance problems.
I have one input adapter feeding in financial data at anywhere from 5,000 to 10,000 events per second. I then have a large number of queries operating against that input. Each query hooks up to the exact same passthrough query, so I have 1,000 queries using the exact same input data.
To test that the system would be able to handle this, I created 1,000 queries that did nothing but pass the events through (from d in fullStream select d) to an output adapter which only Releases the event.
When I run 1,000 queries this way, the system cannot keep up with the stream. It falls farther and farther behind. If I trim it to 100 queries, the system keeps up perfectly.
Have I simply run into the performance wall with StreamInsight? Is it not able to handle the type of solution I am building? Or am I doing something stupid here? Any help would be great; I'm not sure what else to try to make it faster. I need it to be able to execute way more than 1,000 queries, and I need to run way more complicated queries than this.
I think you may be having performance issues because of your current approach.
First off, let's cover the differences between the editions of StreamInsight. Standard edition has only 1 scheduler thread while Premium has one per core. The Evaluation edition is equivalent to Premium.
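I think the way to fix this is to reduce the number of queries you have. If you are creating 1,000 queries (each with its own instance of an output adapter), I can see where you are going to have issues. On a quad-core machine, you are going to have 4 scheduler threads trying to run 1,000 queries. At 5,000 to 10,000 events per second, with every query seeing every event, that works out to 5 to 10 million event deliveries per second shared across those 4 threads.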
Are your queries that are arranged "horizontally" doing the same thing? If so, see if you can consolidate them. For instance, if I needed to run a query like "Price>5 Vol<2k" for 5 different stocks, I would write it so that a single standing query handles all 5 stocks and sends all the results to 1 output adapter. If a client is "subscribing" to results from a query, that's something that can/should be handled by your output adapter. You could also turn results on and off for certain stocks by streaming in reference data.
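To make the consolidation idea concrete, here is a rough sketch using plain LINQ to Objects rather than the StreamInsight API, so it only illustrates the shape of the query and not how you would bind it to adapters. The Tick class, its property names, and the ConsolidatedQuerySketch class are all made up for illustration; your payload type will differ. One filter covers every watched symbol, and the per-client fan-out happens afterwards, which is the part I'd expect the single output adapter to do.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical payload type; the real event shape is not shown in the question.
public sealed class Tick
{
    public string Symbol { get; set; }
    public double Price { get; set; }
    public int Volume { get; set; }
}

public static class ConsolidatedQuerySketch
{
    // One "Price>5 Vol<2k" filter for all watched symbols,
    // instead of one separate query per symbol.
    public static IEnumerable<Tick> Filter(IEnumerable<Tick> source, ISet<string> watched)
    {
        return from t in source
               where watched.Contains(t.Symbol) && t.Price > 5 && t.Volume < 2000
               select t;
    }

    // Stand-in for the work of a single output adapter:
    // group the combined results by symbol so each subscriber
    // can be handed only the symbols it asked for.
    public static ILookup<string, Tick> RouteBySymbol(IEnumerable<Tick> results)
    {
        return results.ToLookup(t => t.Symbol);
    }
}
```

Calling RouteBySymbol(Filter(ticks, watched)) gives you one lookup keyed by symbol; adding or removing a symbol from the watched set changes what flows through without creating or tearing down a query.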
Take a look at the sample below. "sourceStream" is going to be my raw stock data coming from the data source. "referenceStream" is going to be some configuration streamed in from a reference data source (e.g. SQL). The success or failure of the join will throttle the events that get passed on for further processing.