I'm in the RoR stack and I had to write some actual SQL to complete this query for all records that are "open", meaning that the current time is within the specified hours of operation. In the hours_of_operations table, two integer columns opens_on and closes_on store a weekday, and two time fields opens_at and closes_at store the respective time of day.

I made a query that compares the current date and time to the stored values, but I'm wondering if there is a way to cast to some sort of date type and have PostgreSQL do the rest?
The meat of the query is:
WHERE (
    (
        /* Opens in Future */
        (opens_on > 5 OR (opens_on = 5 AND opens_at::time > '2014-03-01 00:27:25.851655'))
        AND (
            (closes_on < opens_on AND closes_on > 5)
            OR ((closes_on = opens_on)
                AND (closes_at::time < opens_at::time AND closes_at::time > '2014-03-01 00:27:25.851655'))
            OR ((closes_on = 5)
                AND (closes_at::time > '2014-03-01 00:27:25.851655' AND closes_at::time < opens_at::time)))
        OR
        /* Opens in Past */
        (opens_on < 5 OR (opens_on = 5 AND opens_at::time < '2014-03-01 00:27:25.851655'))
        AND (closes_on > 5)
        OR ((closes_on = 5)
            AND (closes_at::time > '2014-03-01 00:27:25.851655'))
        OR (closes_on < opens_on)
        OR ((closes_on = opens_on)
            AND (closes_at::time < opens_at::time))
    )
)
The reason for such dense complexity is that a period of operation may wrap around the end of the week, for example, starting at noon on Sunday and going through 6 AM Monday. Since I store values in UTC, there are many cases in which the local time of the user could wrap in a very strange way. The query above ensures that you could enter ANY two times of the week and we compensate for the wrapping.
Table layout
Re-design the table to store opening hours (hours of operation) as a set of tsrange (range of timestamp without time zone) values. Requires Postgres 9.2 or later.

Pick a random week to stage your opening hours. I like the week:

1996-01-01 (Monday) to 1996-01-07 (Sunday)

That's the most recent leap year where Jan 1st conveniently happens to be a Monday. But it can be any random week for this case. Just be consistent.
Install the additional module btree_gist first:
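The module ships with standard Postgres; install it once per database:

```sql
CREATE EXTENSION IF NOT EXISTS btree_gist;
```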
Then create the table like this:
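A sketch of such a table, matching the constraint names discussed below (the surrogate key and exact column types are assumptions):

```sql
CREATE TABLE hoo (
   hoo_id   serial PRIMARY KEY
 , shop_id  int NOT NULL        -- FK to your shops table
 , hours    tsrange NOT NULL
   -- no overlapping entries per shop (btree_gist enables shop_id WITH =):
 , CONSTRAINT hoo_no_overlap
      EXCLUDE USING gist (shop_id WITH =, hours WITH &&)
   -- both range bounds inclusive:
 , CONSTRAINT hoo_bounds_inclusive
      CHECK (lower_inc(hours) AND upper_inc(hours))
   -- keep all ranges inside the staging week:
 , CONSTRAINT hoo_standard_week
      CHECK (hours <@ tsrange '[1996-01-01 00:00, 1996-01-08 00:00]')
);
```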
The one column hours replaces all of your columns.

For instance, hours of operation from Wednesday, 18:30 to Thursday, 05:00 UTC are entered as:
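In the 1996 staging week, Wednesday is 1996-01-03, so the entry would be a range like:

```sql
'[1996-01-03 18:30, 1996-01-04 05:00]'::tsrange
```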
The exclusion constraint hoo_no_overlap prevents overlapping entries per shop. It is implemented with a GiST index, which also happens to support our queries. Consider the chapter "Index and Performance" below discussing indexing strategies.

The check constraint hoo_bounds_inclusive enforces inclusive boundaries for your ranges, which matters for adjacent ranges and for the corner case at Sunday midnight discussed next; f_hoo_hours() below takes care of this.

The check constraint hoo_standard_week enforces the outer bounds of the staging week using the "range is contained by" operator <@.

With inclusive bounds, you have to observe a corner case where the time wraps around at Sunday midnight: you have to search for both timestamps at once. A related case with an exclusive upper bound wouldn't exhibit this shortcoming.
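For instance, an instant falling on Sunday 24:00 is the same point as Monday 00:00 of the staging week, so a containment test has to probe both representations (a sketch against the hoo table):

```sql
SELECT *
FROM   hoo
WHERE  hours @> timestamp '1996-01-08 00:00'   -- Sunday 24:00
OR     hours @> timestamp '1996-01-01 00:00';  -- same instant, wrapped to Monday 00:00
```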
Function f_hoo_time(timestamptz)

To "normalize" any given timestamp with time zone. PARALLEL SAFE only for Postgres 9.6 or later.

The function takes timestamptz and returns timestamp. It adds the elapsed interval of the respective week ($1 - date_trunc('week', $1)) in UTC time to the starting point of our staging week. (date + interval produces timestamp.)

Function f_hoo_hours(timestamptz, timestamptz)
To normalize ranges and split those crossing Mon 00:00. This function takes any interval (as two timestamptz) and produces one or two normalized tsrange values. It covers any legal input and disallows the rest. Use it to INSERT a single input row or any number of input rows; each can insert two rows if a range needs splitting at Mon 00:00 UTC.
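A sketch of both helper functions and their use, following the description above (the validation rules and error messages are assumptions):

```sql
-- Map any timestamptz to the corresponding instant of the staging week.
CREATE OR REPLACE FUNCTION f_hoo_time(timestamptz)
  RETURNS timestamp
  LANGUAGE sql IMMUTABLE AS   -- add PARALLEL SAFE in Postgres 9.6+
$func$
SELECT date '1996-01-01'
     + ($1 AT TIME ZONE 'UTC' - date_trunc('week', $1 AT TIME ZONE 'UTC'))
$func$;

-- Normalize an interval (given as two timestamptz) into one or two tsrange
-- values, splitting ranges that cross Mon 00:00 UTC.
CREATE OR REPLACE FUNCTION f_hoo_hours(_from timestamptz, _to timestamptz)
  RETURNS SETOF tsrange
  LANGUAGE plpgsql IMMUTABLE AS
$func$
DECLARE
   ts_from timestamp := f_hoo_time(_from);
   ts_to   timestamp := f_hoo_time(_to);
BEGIN
   -- disallow empty and week-or-longer input
   IF _to <= _from OR _to - _from >= interval '7 days' THEN
      RAISE EXCEPTION 'Invalid range: % to %', _from, _to;
   END IF;

   IF ts_from > ts_to THEN               -- wraps around Mon 00:00: split
      RETURN NEXT tsrange(ts_from, timestamp '1996-01-08 00:00', '[]');
      RETURN NEXT tsrange(timestamp '1996-01-01 00:00', ts_to, '[]');
   ELSE
      RETURN NEXT tsrange(ts_from, ts_to, '[]');
   END IF;
END
$func$;

-- Insert a single input row (shop_id 123 is a placeholder):
INSERT INTO hoo (shop_id, hours)
SELECT 123, f_hoo_hours('2024-01-03 18:30+00', '2024-01-04 05:00+00');

-- For any number of input rows (input_rows is a hypothetical staging table):
INSERT INTO hoo (shop_id, hours)
SELECT shop_id, f_hoo_hours(opens, closes)
FROM   input_rows;
```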
Query
With the adjusted design, your whole big, complex, expensive query can be replaced with ... this:
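The whole filter collapses into one containment test (a sketch, using the helper function above):

```sql
SELECT *
FROM   hoo
WHERE  hours @> f_hoo_time(now());
```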
The query is backed by said GiST index and fast, even for big tables.
If you want to calculate total opening hours (per shop), here is a recipe:
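A sketch of such a recipe: bounds are inclusive and ranges are already split at Mon 00:00, so summing upper minus lower per shop yields the weekly total:

```sql
SELECT shop_id, sum(upper(hours) - lower(hours)) AS total_open
FROM   hoo
GROUP  BY shop_id;
```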
Index and Performance
The containment operator for range types can be supported with a GiST or SP-GiST index. Either can be used to implement an exclusion constraint, but only GiST supports multicolumn indexes. And the order of index columns matters.
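The two competing column orders for the multicolumn GiST exclusion constraint look like this (a sketch; constraint names are assumptions):

```sql
-- shop_id first: cheaper to write and to enforce
ALTER TABLE hoo ADD CONSTRAINT hoo_no_overlap
   EXCLUDE USING gist (shop_id WITH =, hours WITH &&);

-- hours first: better for queries filtering on hours
-- ALTER TABLE hoo ADD CONSTRAINT hoo_no_overlap
--    EXCLUDE USING gist (hours WITH &&, shop_id WITH =);
```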
So we have conflicting interests here. For big tables, there will be many more distinct values for shop_id than for hours.

- An index with shop_id first is faster to write and to enforce the exclusion constraint.
- But we search on hours in our query. Having that column first would be better.
- If we need to look up shop_id in other queries, a plain btree index is much faster for that.
- Expect an SP-GiST index on just hours to be fastest for the query.

Benchmark
New test with Postgres 12 on an old laptop. My script to generate dummy data:
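A simplified stand-in for such a script (one random shift per shop, so the exclusion constraint is never violated; the original produced several entries per shop):

```sql
INSERT INTO hoo (shop_id, hours)
SELECT g AS shop_id
     , tsrange(ts, ts + interval '8 hours', '[]')
FROM  (
   SELECT g
        , timestamp '1996-01-01'
          + interval '1 day'  * trunc(random() * 7)::int
          + interval '15 min' * trunc(random() * 40)::int AS ts
   FROM   generate_series(1, 30000) g
   ) sub;
```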
Results in ~ 141k randomly generated rows, ~ 30k distinct shop_id, ~ 12k distinct hours. Table size 8 MB.

I dropped and recreated the exclusion constraint; the variant with shop_id first is ~ 4x faster for this distribution.

In addition, I tested two more for read performance:
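Those two were presumably single-column GiST and SP-GiST indexes on hours, matching the conclusion at the end (index names are assumptions):

```sql
CREATE INDEX hoo_hours_gist_idx   ON hoo USING gist   (hours);
CREATE INDEX hoo_hours_spgist_idx ON hoo USING spgist (hours);
```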
After VACUUM FULL ANALYZE hoo;, I ran two queries.

Results

Got an index-only scan for each (except for "no index", of course).
If you read a lot more than you write (typical use case), keep the exclusion constraint as suggested at the outset and create an additional SP-GiST index to optimize read performance.