I have been running in circles with a query that is driving me nuts.
The background:
I have two tables, and unfortunately, both have duplicate records. (Dealing with activity logs if that puts it into perspective). Each table comes from a different system and I am trying to join the data together to get a sudo full picture (I realize that I won't get a perfect view because there is no "event key" shared between the two systems; I am attempting to match on a composite of meta data).
Here is what I am working with:
Table1
------------
JobID CustID Name ActionDate IsDuplicate
12345 11111 Ryan 1/1/2015 01:20:20 False
12345 11112 Bob 1/1/2015 02:10:20 False
12345 11111 Ryan 1/1/2015 04:15:35 True
12346 11113 Jim 1/1/2015 05:10:40 False
12346 11114 Jeb 1/1/2015 06:10:40 False
12346 11111 Ryan 1/1/2015 07:10:30 False
Table2
------------
ResponseID CustID ActionDate Browser
11123 10110 12/1/2014 23:32:15 IE
12345 11111 1/1/2015 03:20:20 IE
12345 11112 1/1/2015 05:10:20 Firefox
12345 11111 1/1/2015 06:15:35 Firefox
12346 11113 1/1/2015 07:10:40 Chrome
12346 11114 1/1/2015 08:10:40 Chrome
12346 11111 1/1/2015 10:10:30 Safari
12213 11123 2/1/2015 01:10:30 Chrome
Please note a few things:
- JobID and ResponseID are the same thing
- JobID and ResponseID are indicators of an event on the site (people are responding to an event)
- Action date does not match (system 2 has about an inconsistent 2 hour delay on it but never more that 3 hours delay)
- Note Table2 doesnt have a duplicate flag
- table 1 (~2,000 records) is significantly smaller than table 2 (~16,000 records)
- Note Cust 11111 is bopping around on browsers, taking the same action twice on job 12345 at different times and only taking action once on job 12346
What I am looking for:
Result (ideal)
------------
t1.JobID t1.CustID t1.Name t1.ActionDate t2.Browser
12345 11111 Ryan 1/1/2015 01:20:20 IE
12345 11112 Bob 1/1/2015 02:10:20 Firefox
12345 11111 Ryan 1/1/2015 04:15:35 Firefox
12346 11113 Jim 1/1/2015 05:10:40 Chrome
12346 11114 Jeb 1/1/2015 06:10:40 Chrome
12346 11111 Ryan 1/1/2015 07:10:30 Safari
Note that I JUST want matches for records in Table1. I am getting tons of duplicates because of the join...Which is frustrating.
Here is what I have so far (which I can humbly can say; isn't really close):
SELECT
t1.JobID,
t1.CustID,
t1.Name,
t1.ActionDate,
t2.Browser
FROM
Table1 t1
LEFT OUTER JOIN
Table2 t2
ON
t1.JobID=t2.ResponseID AND
t1.CustID=t2.CustID AND
DATEPART(dd,t1.ActionDate)=DATEPART(dd,t2.ActionDate)
Try changing the join condition for the date to check that t2.actiondate fulfills the condition
t1.actiondate <= t2.actiondate <= t1.actiondate + 3 hours
With your sample data the result of this query matches your desired result.