I have a design problem. I have an application which subscribes to a realtime system to display data. Essentially, the client connects to a server, downloads a snapshot of the data at the current time, and subscribes to live updates, which are immediately displayed in the UI.
One problem we have is that we can open multiple realtime reports, which means multiple connections and unnecessary duplication of data. So we want to build a central data repository to hold all of the data and serve it to the reports, so that we use only one socket connection and only one copy of the data crosses the wire.
The problem I have is this: when a report subscribes to my data repository, it retrieves the snapshot at the present time and then receives live updates afterward. That means my repository is updating its internal cache with the live updates from the server and forwarding those updates to subscribed reports.
When another report connects to the repository, it also needs to download the current data and subscribe to updates. However, if updates come in while the snapshot is being downloaded, the report will miss them. I also can't lock the cache while the snapshot is being downloaded, because that would stop report 1 from updating while report 2 gets its snapshot.
How can I ensure that report 1 continues to get its updates, while report 2 downloads an unmolested snapshot and then receives all the updates it missed in the meantime, as well as future updates?
Sorry if this isn't clear; I'm not always good at describing my problem :) The data that comes in is essentially rows in a table, which I then summarize into a tree. The rows can be identified by key fields, and my cache stores the latest copy of each row.
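To make the data model concrete, here is a minimal sketch of such a keyed row cache. The key field names (`account`, `symbol`) and row shape are made up for illustration; the point is just that later updates for the same key overwrite earlier ones, and a snapshot is a copy that later updates don't touch.

```python
# Hypothetical key fields; in the real system these come from the feed schema.
KEY_FIELDS = ("account", "symbol")

class RowCache:
    def __init__(self):
        self._rows = {}  # key tuple -> latest copy of the row

    def apply_update(self, row):
        # A later update for the same key replaces the earlier row,
        # so the cache always holds the latest copy of each row.
        key = tuple(row[f] for f in KEY_FIELDS)
        self._rows[key] = row

    def snapshot(self):
        # A copy of the current state; later updates won't affect it.
        return dict(self._rows)

cache = RowCache()
cache.apply_update({"account": "A1", "symbol": "XYZ", "qty": 10})
cache.apply_update({"account": "A1", "symbol": "XYZ", "qty": 15})
snap = cache.snapshot()
# snap holds one row per key, with the latest values
```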
Thanks in advance!
If I understand you correctly, you have three parts in your system:

1. the realtime system (the server),
2. the cache server (your central data repository),
3. the clients (the reports).

Right?
If so, if I were you, I would put a manager in front of the cache server and provide two APIs, one for the realtime system and one for the clients, that they use to talk to the cache server. I would stick to the rule: either one writer and no readers, or many readers and no writer. I would also make two queues, one for client requests and one for realtime updates, with a synchronization mechanism for those queues.
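The "one writer or many readers" rule is a classic readers-writer lock. Python's standard library has no built-in one, so here is a minimal sketch built on `threading.Condition` (the class and method names are illustrative):

```python
import threading

class OneWriterManyReaders:
    # Minimal readers-writer lock: many readers may hold it at once;
    # a writer waits until no one is reading, and while it writes,
    # everyone else is excluded.
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writing = False

    def acquire_read(self):
        with self._cond:
            while self._writing:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writing or self._readers > 0:
                self._cond.wait()
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()
```

Note this simple version can starve the writer if readers arrive continuously; the queues described above are one way to give waiting writers their turn.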
I see it working like this. When the realtime system writes new information:

1. If there are client readers currently reading reports that are being updated:
1.1 The cache manager writes all the new info to a second store for those reports. When the manager sees that there is new info, it stops accepting new read requests, puts them in the queue, waits until all reads already in flight have finished, and then merges the second store into the first.
2. If there are no readers:
2.1 It puts the info into the main store and blocks readers on the reports that are modified.

If your realtime system is truly real-time (runs on a realtime processor) and writes constantly, you should add timeouts for merging the two stores and stop readers for that time.
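The two-store idea above can be sketched as follows. This is a minimal illustration under stated assumptions, not a full implementation: `_main` is the store readers snapshot from, `_pending` is the second store that buffers realtime writes while readers are in flight, and the last reader to leave triggers the merge. All names are made up.

```python
import threading

class CacheManager:
    def __init__(self):
        self._lock = threading.Lock()
        self._main = {}      # key -> row: what readers snapshot from
        self._pending = {}   # key -> row: writes buffered while reading
        self._readers = 0

    def write(self, key, row):
        with self._lock:
            if self._readers > 0:
                # step 1.1: readers in flight, buffer in the second store
                self._pending[key] = row
            else:
                # step 2.1: no readers, write straight to the main store
                self._main[key] = row

    def begin_read(self):
        # Take a consistent snapshot; while any reader holds one,
        # writes go to the pending store instead of _main.
        with self._lock:
            self._readers += 1
            return dict(self._main)

    def end_read(self):
        with self._lock:
            self._readers -= 1
            if self._readers == 0 and self._pending:
                # last reader left: merge the second store into the first
                self._main.update(self._pending)
                self._pending.clear()
```

A reader would call `begin_read()`, deliver the snapshot to its report, and then call `end_read()`; any writes that arrived in between land in `_pending` and reach `_main` once the last reader finishes. Note this sketch still doesn't replay the buffered updates to the reader that missed them, which is the asker's remaining problem: for that, the manager would also have to register the reader as a subscriber under the same lock as the snapshot copy, so no update can fall between the two.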