I have the following table, about 3m rows. I want to know which approach/attributes I can use the squeeze maximum performance out of it. I will be selecting only on exact matches for inst + time.
price_hist_5min_bars:([inst:`$();time:`timestamp$()]price:`float$())
Also, does the approach change if the table is for reporting only or is changing? This table won't have any rows added to it on-the-fly but I will have another intra-day table with the same structure which will have rows added every 1 or 5 mins. All suggestions appreciated!
I have tested with a sample database and I can get a slight improvement if I the use
`p#
attribute:In this case
`p#
has advantages over`g#
as the symbols are in order and not dispersed. See the wiki entry on attributes for more information.Better performance can be achieved if you restructure the table. Using the following table as an example:
We can set a benchmark using
select
:You can restructure by grouping all of the time and price values for each sym, sorting on time:
Working with this restructured data, we can return the price with:
Where the index of the record within the sym grouping is given by:
Obviously the above example only returns and atom and not a table. If you wished to have a table you could try something like this:
But it depends on how you want your output to look.