How to transform a table into node pairs using SQL

58 Views Asked by At

I have a database structured for user clicks. Every time someone clicks from a website to visit another website, I capture the username, website, and what click in their click-path it was. It looks like this:

+----------+-------+---------------+
| Customer | Click | Website       |
+----------+-------+---------------+
| john     | 1     | facebook      |
+----------+-------+---------------+
| john     | 2     | microsoft     |
+----------+-------+---------------+
| john     | 3     | facebook      |
+----------+-------+---------------+
| steven   | 1     | apple         |
+----------+-------+---------------+
| steven   | 2     | stackoverflow |
+----------+-------+---------------+
| steven   | 3     | cnn           |
+----------+-------+---------------+  

I want to visualize how people move around on the internet. I believe a node path is the best idea, but I'm not sure how to transform this data into that. For node value pairs, the table above must be manipulated to look like this:

+----------+----------------+-----------------+
| Customer | Origin Website | Landing Website |
+----------+----------------+-----------------+
| john     | facebook       | microsoft       |
+----------+----------------+-----------------+
| john     | microsoft      | facebook        |
+----------+----------------+-----------------+
| steven   | apple          | stackoverflow   |
+----------+----------------+-----------------+
| steven   | stackoverflow  | cnn             |
+----------+----------------+-----------------+  

Is this possible to do only using oracle SQL?

2

There are 2 best solutions below

1
On BEST ANSWER

Something like this. I added a column to show the succession of "hops" from one web site to another; otherwise you will know from where to where, but not in what order.

with
     test_data ( Customer, Click, Website ) as (
       select 'john'  , 1, 'facebook'      from dual union all
       select 'john'  , 2, 'microsoft'     from dual union all
       select 'john'  , 3, 'facebook'      from dual union all
       select 'steven', 1, 'apple'         from dual union all
       select 'steven', 2, 'stackoverflow' from dual union all
       select 'steven', 3, 'cnn'           from dual
     )
-- end of test data; the SQL query begins below this line
select *
from (
       select customer, 
              row_number() over (partition by customer order by click) as ord, 
              website as origin_website,
              lead(website) over (partition by customer order by click) as landing_website
       from   test_data
     )
where landing_website is not null
order by customer, ord
;

CUSTOMER      ORD ORIGIN_WEBSITE LANDING_WEBSITE
--------- ------- -------------- ---------------
john            1 facebook       microsoft
john            2 microsoft      facebook
steven          1 apple          stackoverflow
steven          2 stackoverflow  cnn

4 rows selected.
0
On

You can do this using LEAD():

Select  Customer, 
        Website As "Origin Website",
        Lead(Website) Over (Partition By Customer Order By Click) As "Landing Website"
From    YourTable