SQL: Assigning occasion numbers to GPS points when visiting places based on conditions

59 Views Asked by Beau At 29 March 2024 at 11:02

I have a PostgreSQL table of GPS points from multiple people with the information on the trip and place each of the points is assigned to. I want to find the occasion number at which each point_id is assigned to the place_id where overnight = 'Yes' across all trip_id of each person_id, based on the following conditions:

point_id assigned to the same place_id where overnight = 'Yes' within the same trip_id should be treated as separate occasions when there is at least one point assigned to a different place_id where overnight = 'Yes' anytime in between those points. This means revisits are counted as different occasions.
point_id assigned to the same place_id where overnight = 'Yes' from different trip_id should be treated as separate occasions.

Below is my table. occasion is the expected output field. Any guidance how to achieve this? Thanks!

CREATE TABLE my_table (
    point_id INT,
    trip_id INT,
    person_id INT,
    place_id INT,
    overnight VARCHAR,
    occasion INT
);

INSERT INTO my_table (point_id, trip_id, person_id, place_id, overnight, occasion) VALUES
(1, 5, 1, 9, 'Yes', 1),
(2, 5, 1, 10, 'Yes', 1),
(3, 5, 1, 10, 'Yes', 1),
(4, 5, 1, 9, NULL, NULL),
(5, 5, 1, 9, 'Yes', 2),
(6, 5, 1, 10, 'Yes', 2),
(7, 5, 1, 9, 'Yes', 3),
(8, 5, 1, 10, 'Yes', 3),
(9, 5, 1, 10, 'No', NULL),
(10, 5, 1, 10, 'No', NULL),
(11, 5, 1, 10, 'Yes', 3),
(12, 95, 2, NULL, NULL, NULL),
(13, 95, 2, 9, 'No', NULL),
(14, 96, 2, 9, 'Yes', 1),
(15, 96, 2, NULL, NULL, NULL),
(16, 96, 2, 9, 'Yes', 1),
(17, 97, 2, NULL, NULL, NULL),
(18, 97, 2, NULL, NULL, NULL),
(19, 97, 2, 9, 'Yes', 2),
(20, 98, 2, 9, 'Yes', 3),
(21, 98, 2, 9, 'No', NULL),
(22, 99, 2, 10, 'No', NULL),
(23, 99, 2, 9, 'Yes', 4),
(24, 99, 2, 10, 'No', NULL),
(25, 99, 2, 10, 'Yes', 1);

Edits:

Here is my current query. The output field final_visit_rank is identical to the expected output field occasion. The query seems to work fine, but I'm not sure if it's the most efficient or elegant solution, especially considering that my real dataset has hundreds of thousands of rows.

with t_revisit as (
select  *, 
        case when place_id is not distinct from lag(place_id) over (
            partition by person_id order by point_id rows between unbounded preceding and current row) and 
                trip_id is not distinct from lag(trip_id) over (
                    partition by person_id order by point_id rows between unbounded preceding and current row)
                        then 1
             else 0
        end as revisit
from my_table
where overnight = 'Yes'
), 

t_visit_rank as (
select  *,
        row_number() over (
            partition by person_id, place_id order by trip_id, point_id 
                rows between unbounded preceding and current row) as visit_rank
from t_revisit
where revisit = 0)

select  t.*, visit_rank, 
        case when t.revisit = 1 
                then lag(visit_rank) over (
                    partition by t.person_id, t.place_id, t.overnight order by t.point_id 
                        rows between unbounded preceding and current row)
            else visit_rank
        end as final_visit_rank
from (select my_table.*, revisit
      from my_table
      left join t_revisit
      on my_table.point_id = t_revisit.point_id) t
left join t_visit_rank
on t.point_id = t_visit_rank.point_id
order by person_id, point_id;

Original Q&A

SQL: Assigning occasion numbers to GPS points when visiting places based on conditions

Edits:

There are 0 best solutions below

Related Questions in SQL

Related Questions in POSTGRESQL

Related Questions in GPS

Related Questions in GIS

Related Questions in GAPS-AND-ISLANDS

Trending Questions

Popular # Hahtags

Popular Questions