Table Design for Calculating Median Over User-Defined Period from Pre-Processed Data

64 Views Asked by At

I’m working with a table that contains pre-processed data from several other tables. This data is segmented by a time period (like a day) to speed up queries that could take minutes if I had to process everything each time a user wants to see a summary of that day or multiple days. Most fields are “simple” accumulations of calculated data, such as the accumulated timespans of multiple events.

Now, I need to present the median value of these events over a user-defined time period. The time periods are sets of multiple rows from my pre-processed data. I can calculate the median for the specific day that was pre-processed, but if the user needs the median from various days, I need the original calculated values. My current approach is to store it in an array of the values to be able to use them when needed.

I can’t simply go over the original tables and recalculate the timespans for the period, as the original table of this specific data has many rows and the timespans need to be calculated with a specific set of rules (this is one of the reasons to have it pre-processed in the first place).

Creating a new table to store these values seems counterproductive, as I would now have multiple rows of seemingly “duplicated” data just so I can get the median.

I understand that it’s a bad practice to use one field to store multiple values and wouldn’t advocate for it, but in this particular case, all the options seem to return worse results. Any suggestions on how I can solve this problem?

0

There are 0 best solutions below