I have been trying to find my way to navigate through a column of a dataframe i have on bigquery. The dateframe is a sort of 2 column, 1st column is the ID (same ID for all of the rows) and the second column represents the date for each ID (each date is distinct)
| Serial_ID | Date |
|---|---|
| 5559 | 2023-08-31 |
| 5559 | 2023-08-05 |
| 5559 | 2023-07-26 |
| 5559 | 2022-05-13 |
| 5559 | 2022-05-07 |
| 5559 | 2020-08-22 |
Now i want to navigate through every row and do a comparaison query. Exemple first row : i want to take the first row, having Date = 2023-08-31. We will have to start navigating the column and compare the same date to other rows each time, stopping the navigation whe nthe date difference is more than a month. And so on until we do the same for the last row.
The problem i face is that this step should be in the middle of a structure like this, inside a substructure like this :
WITH tmp1 AS ( .... ), tmp2 AS ( .... ), tmp3 AS ( .... ), tmp4 AS ( our FOR loop is supposed to be inside here ), tmp5 AS ( .... )
SELECT * FROM tmp5
I know this is complicated for a tool like bigquery but i am i na situation where i must do it here, on bigquery only. This is a typical easy python code, however i find no clues to apply it on bigquery.
I have tried to implement the FOR loop , while, LOOP, but that does not work in a substructure unfoturnately. I have also tried to nest a full column of dates inside a row then try to select a row, compare the value in the date column, with the other dates nested in the array facing it, like this :
| Serial_ID | Date | Array_of_dates |
|---|---|---|
| 5559 | 2023-08-31 | [2023-08-31, 2023-08-05, 2023-07-26, 2022-05-13, 2022-05-07, 2020-08-22] |
| 5559 | 2023-08-05 | [2023-08-31, 2023-08-05, 2023-07-26, 2022-05-13, 2022-05-07, 2020-08-22] |
| 5559 | 2023-07-26 | [2023-08-31, 2023-08-05, 2023-07-26, 2022-05-13, 2022-05-07, 2020-08-22] |
| 5559 | 2022-05-13 | [2023-08-31, 2023-08-05, 2023-07-26, 2022-05-13, 2022-05-07, 2020-08-22] |
| 5559 | 2022-05-07 | [2023-08-31, 2023-08-05, 2023-07-26, 2022-05-13, 2022-05-07, 2020-08-22] |
| 5559 | 2020-08-22 | [2023-08-31, 2023-08-05, 2023-07-26, 2022-05-13, 2022-05-07, 2020-08-22] |
But in order to compare the date with each one of the elements inside array_of_dates, i still have to go through a loop...