Joining multiple tables with dbt cte's

212 Views Asked by At

So i have this table with is duckdb sql.

fan_id attendance_season_1
1 1
2 0
3 0
4 1

Now I have 15 of these tables for each season. Now I want to merge these all on the fan_id so I have the following table.

fan_id attendance_season_1 attendance_season_2 attendance_season_3
1 1 1 1
2 0 0 1
3 0 0 1
4 1 0 1

And this for 15 attendance rows. For the moment my query looks like this.

season_attendances as (
    select distinct fan_id,
    attendance_season_1, attendance_season_2, attendance_season_3,
    attendance_season_4, attendance_season_5, attendance_season_6,
    attendance_season_7, attendance_season_8, attendance_season_9,
    attendance_season_10, attendance_season_11, attendance_season_12,
    attendance_season_13, attendance_season_14, attendance_season_15
    from join_attendances_1 inner join join_attendances_2 using (fan_id)
    inner join join_attendances_3 using (fan_id)
    inner join join_attendances_4 using (fan_id)
    inner join join_attendances_5 using (fan_id)
    inner join join_attendances_6 using (fan_id)
    inner join join_attendances_7 using (fan_id)
    inner join join_attendances_8 using (fan_id)
    inner join join_attendances_9 using (fan_id)
    inner join join_attendances_10 using (fan_id)
    inner join join_attendances_11 using (fan_id)
    inner join join_attendances_12 using (fan_id)
    inner join join_attendances_13 using (fan_id)
    inner join join_attendances_14 using (fan_id)
    inner join join_attendances_15 using (fan_id)
),

Some explanation of the code. This is an CTE from dbt. the attendance_season_{n} is the statistic 0 or 1 and join_attendance is the first table as seen in this post. My sql engine is duckdb. When querying the single tables it runs in 0.3 seconds but when running the CTE with all the joins it takes forever. Anyone ideas for how to improve this so it runs fast with the same output as the second table in this post.

1

There are 1 best solutions below

0
On BEST ANSWER

Found a solution. Just put it in a subquery like this.

season_attendance_sub as (
    select distinct fan_id,
    (select attendance_season_1
    from join_attendances_1 a
    where fan_id = a.fan_id) as 'attendance_season_1',
    (select attendance_season_2
    from join_attendances_2 a
    where fan_id = a.fan_id) as 'attendance_season_2',
    (select attendance_season_3
    from join_attendances_3 a
    where fan_id = a.fan_id) as 'attendance_season_3',
    (select attendance_season_4
    from join_attendances_4 a
    where fan_id = a.fan_id) as 'attendance_season_4',
    (select attendance_season_5
    from join_attendances_5 a
    where fan_id = a.fan_id) as 'attendance_season_5',
    (select attendance_season_6
    from join_attendances_6 a
    where fan_id = a.fan_id) as 'attendance_season_6',
    (select attendance_season_7
    from join_attendances_7 a
    where fan_id = a.fan_id) as 'attendance_season_7',
    (select attendance_season_8
    from join_attendances_8 a
    where fan_id = a.fan_id) as 'attendance_season_8',
    (select attendance_season_9
    from join_attendances_9 a
    where fan_id = a.fan_id) as 'attendance_season_9',
    (select attendance_season_10
    from join_attendances_10 a
    where fan_id = a.fan_id) as 'attendance_season_10',
    (select attendance_season_11
    from join_attendances_11 a
    where fan_id = a.fan_id) as 'attendance_season_11',
    (select attendance_season_12
    from join_attendances_12 a
    where fan_id = a.fan_id) as 'attendance_season_12',
    (select attendance_season_13
    from join_attendances_13 a
    where fan_id = a.fan_id) as 'attendance_season_13',
    (select attendance_season_14
    from join_attendances_14 a
    where fan_id = a.fan_id) as 'attendance_season_14',
    (select attendance_season_15
    from join_attendances_15 a
    where fan_id = a.fan_id) as 'attendance_season_15'
    from fans
),