I need to create an empty time table series for a report so I can left join activity from several tables to it. Every hour of the day does not necessarily have data, but I want it to show null or zero for inactivity instead of omitting that hour of the day.

In later versions of Postgres (post 8.0.2), this is easy in several ways:

SELECT unnest(array[0,1,2,3,4...]) as numbers

OR

CROSS JOIN (select generate_series as hours from generate_series(now()::timestamp, now()::timestamp + interval '1 day', '1 hour'::interval )) date_series

Redshift can run some of these commands, but throws an error when you attempt to run it in conjunction with any of the tables.

WHAT I NEED:

A reliable way to generate a series of numbers (e.g. 0-23) as a subquery that will run on redshift (uses postgres 8.0.2).

4

There are 4 best solutions below

3
On BEST ANSWER

As long as you have a table that has more rows than your required series has numbers, this is what has worked for me in the past:

select
    (row_number() over (order by 1)) - 1 as hour
from
    large_table
limit 24
;

Which returns numbers 0-23.

0
On

Recursion was released for Redshift in April 2021. Now that recursion is possible in Redshift. You can generate series of numbers (or even table) with below code

with recursive numbers(NUMBER) as
(
select 1 UNION ALL
select NUMBER + 1 from numbers where NUMBER < 28
)
2
On

I'm not a big fan of querying a system table just to get a list of row numbers. If it's something constant and small enough like hours of a day, I would go with plain old UNION ALL:

WITH 
  hours_in_day AS (
    SELECT 0 AS hour
    UNION ALL SELECT 1
    UNION ALL SELECT 2
    UNION ALL SELECT 3
    UNION ALL SELECT 4
    ...
    UNION ALL SELECT 23
  )

And then joinhours_in_day to whatever you want to.

2
On

Unfortunately, Amazon Redshift does not allow use of generate_series() for table functions. The workaround seems to be creating a table of numbers.

See also: