I am working with a large SQL table, 186 columns, 257899 rows, and counting. It gets added to every time something is changed in the main data. This is part of the shadow
tables that allow us to see an audit history of what has happened to different fields.
What I need to do is look at Who updated, When they updated and a particular order of updates. I need to find anytime a certain 3 users updated the main table to go from
- Bill => Active
- Bill => No Bill
- Active => In Progress
What I need to collect is the
- Status the is being switched and what it is switched to (from list above)
- Name of User making the update from the list of 3 users below
- Datetime of the update
- ClaimNumber
Users that I'm looking for I can't give their names so we'll go with:
- User1
- User2
- User3
I have recently learned about doing things in a set based manner in SQL, but I'm not quite up to the take of figuring this out. I know that I could loop through the table and find any updates by the individual users and then loop through those updates to find if there are any that meet the above criteria, but I know that would be slow compared to a set based approved.
I've only just started working on this so I don't have any code yet, but I'll update this question as I attempt different things.
I just don't know how to go about figuring out how to make it a set based problem?
UPDATE
Here is what some sample data looks like:
Creating #TestingData
as the data table and #TestingTable
as the shadow table and then filling with sample data.
create table #TestingTable(Id int,SMState varchar(50),UpdatedBy varchar(50),UpdatedOn datetime,cd_ClaimNumber varchar(50))
create table #TestingData(Id int,SMState varchar(50),UpdatedBy varchar(50),UpdatedOn datetime,cd_ClaimNumber varchar(50))
insert into #TestingData([Id],[UpdatedBy],[UpdatedOn],[cd_ClaimNumber],SMState)
Values (1,'John Doe','2015-10-25 12:00:00',208692,'Bill'),
(2,'John Doe','2015-10-25 12:00:00',297983,'Bill'),
(3,'John Doe','2015-10-25 13:00:00',697920,'Bill'),
(4,'John Doe','2015-10-25 12:00:00',992196,'Bill')
INSERT INTO #TestingTable([Id],[UpdatedBy],[UpdatedOn],[cd_ClaimNumber],SMState)
VALUES
(1,'John Jones','2015-10-24 13:00:00',208692,'No Bill'),
(2,'John Jones','2015-10-24 12:00:00',208692,'Bill'),
(3,'John Jones','2015-10-23 12:00:00',208692,'In Progress'),
(4,'John Jones','2015-10-22 16:00:00',208692,'Active'),
(5,'John Jones','2015-10-22 14:00:00',208692,'No Bill'),
(6,'John Jones','2015-10-22 12:00:00',208692,'Bill'),
(7,'John Jones','2015-10-24 12:00:00',297983,'Bill'),
(8,'John Jones','2015-10-23 14:00:00',297983,'Active'),
(9,'John Jones','2015-10-23 13:00:00',297983,'In Progress'),
(10,'John Jones','2015-10-23 12:00:00',297983,'Active'),
(11,'John Jones','2015-10-22 12:00:00',297983,'Bill'),
(12,'Jeffery Sampson','2015-10-24 13:00:00',697920,'No Bill'),
(13,'Jeffery Sampson','2015-10-24 12:00:00',697920,'Bill'),
(14,'Jeffery Sampson','2015-10-23 12:00:00',697920,'In Progress'),
(15,'Jeffery Sampson','2015-10-22 16:00:00',697920,'Active'),
(16,'Jeffery Sampson','2015-10-22 14:00:00',697920,'No Bill'),
(17,'Jeffery Sampson','2015-10-22 12:00:00',697920,'Bill'),
(18,'Jeffery Sampson','2015-10-24 12:00:00',992196,'Bill'),
(19,'Jeffery Sampson','2015-10-23 14:00:00',992196,'Active'),
(20,'Jeffery Sampson','2015-10-23 13:00:00',992196,'In Progress'),
(21,'Jeffery Sampson','2015-10-23 12:00:00',992196,'Active'),
(22,'Jeffery Sampson','2015-10-22 12:00:00',992196,'Bill')
Here is what the shadow table will look like with the above data:
+----+-------------+-----------------+---------------------+----------------+
| Id | SMState | UpdatedBy | UpdatedOn | cd_ClaimNumber |
+----+-------------+-----------------+---------------------+----------------+
| 1 | No Bill | John Jones | 24.10.2015 13:00:00 | 208692 |
| 2 | Bill | John Jones | 24.10.2015 12:00:00 | 208692 |
| 3 | In Progress | John Jones | 23.10.2015 12:00:00 | 208692 |
| 4 | Active | John Jones | 22.10.2015 16:00:00 | 208692 |
| 5 | No Bill | John Jones | 22.10.2015 14:00:00 | 208692 |
| 6 | Bill | John Jones | 22.10.2015 12:00:00 | 208692 |
| 7 | Bill | John Jones | 24.10.2015 12:00:00 | 297983 |
| 8 | Active | John Jones | 23.10.2015 14:00:00 | 297983 |
| 9 | In Progress | John Jones | 23.10.2015 13:00:00 | 297983 |
| 10 | Active | John Jones | 23.10.2015 12:00:00 | 297983 |
| 11 | Bill | John Jones | 22.10.2015 12:00:00 | 297983 |
| 12 | No Bill | Jeffery Sampson | 24.10.2015 13:00:00 | 697920 |
| 13 | Bill | Jeffery Sampson | 24.10.2015 12:00:00 | 697920 |
| 14 | In Progress | Jeffery Sampson | 23.10.2015 12:00:00 | 697920 |
| 15 | Active | Jeffery Sampson | 22.10.2015 16:00:00 | 697920 |
| 16 | No Bill | Jeffery Sampson | 22.10.2015 14:00:00 | 697920 |
| 17 | Bill | Jeffery Sampson | 22.10.2015 12:00:00 | 697920 |
| 18 | Bill | Jeffery Sampson | 24.10.2015 12:00:00 | 992196 |
| 19 | Active | Jeffery Sampson | 23.10.2015 14:00:00 | 992196 |
| 20 | In Progress | Jeffery Sampson | 23.10.2015 13:00:00 | 992196 |
| 21 | Active | Jeffery Sampson | 23.10.2015 12:00:00 | 992196 |
| 22 | Bill | Jeffery Sampson | 22.10.2015 12:00:00 | 992196 |
+----+-------------+-----------------+---------------------+----------------+
Here is what the data table looks like with the above data:
+----+---------+-----------+---------------------+----------------+
| Id | SMState | UpdatedBy | UpdatedOn | cd_ClaimNumber |
+----+---------+-----------+---------------------+----------------+
| 1 | Bill | John Doe | 25.10.2015 12:00:00 | 208692 |
| 2 | Bill | John Doe | 25.10.2015 12:00:00 | 297983 |
| 3 | Bill | John Doe | 25.10.2015 13:00:00 | 697920 |
| 4 | Bill | John Doe | 25.10.2015 12:00:00 | 992196 |
+----+---------+-----------+---------------------+----------------+
The end result that I'm looking for is something like this:
+---------------+-----------+-----------------+---------------------+-------------+
| CurrentStatus | OldStatus | UpdatedBy | UpdateOn | ClaimNumber |
+---------------+-----------+-----------------+---------------------+-------------+
| No Bill | Bill | John Jones | 22.10.2015 14:00:00 | 208692 |
| In Progress | Active | John Jones | 23.10.2015 12:00:00 | 208692 |
| No Bill | Bill | John Jones | 24.10.2015 13:00:00 | 208692 |
| Active | Bill | John Jones | 23.10.2015 12:00:00 | 297983 |
| In Progress | Active | John Jones | 23.10.2015 13:00:00 | 297983 |
| No Bill | Bill | Jeffery Sampson | 22.10.2015 14:00:00 | 697920 |
| In Progress | Active | Jeffery Sampson | 23.10.2015 12:00:00 | 697920 |
| No Bill | Bill | Jeffery Sampson | 24.10.2015 13:00:00 | 697920 |
| Active | Bill | Jeffery Sampson | 23.10.2015 12:00:00 | 992196 |
| In Progress | Active | Jeffery Sampson | 23.10.2015 13:00:00 | 992196 |
+---------------+-----------+-----------------+---------------------+-------------+
Here is what I've tried:
with Test_1 as
( select ROW_NUMBER() over(order by pdr.cd_ClaimNumber,pdr.UpdatedOn) as RowID,* from #TestingTable pdr)
select t1.cd_ClaimNumber,t1.UpdatedOn, t1.SMState OldStatus, t2.SMState NewStatus, t1.UpdatedBy
from Test_1 t1
join Test_1 t2 on t2.RowID = t1.RowID + 1
where t1.cd_ClaimNumber = t2.cd_ClaimNumber
and t1.SMState != t2.SMState
order by t1.cd_ClaimNumber,t2.UpdatedOn
This gives me this result:
+----------------+---------------------+-------------+-------------+-----------------+
| cd_ClaimNumber | UpdatedOn | OldStatus | NewStatus | UpdatedBy |
+----------------+---------------------+-------------+-------------+-----------------+
| 208692 | 22.10.2015 12:00:00 | Bill | No Bill | John Jones |
| 208692 | 22.10.2015 14:00:00 | No Bill | Active | John Jones |
| 208692 | 22.10.2015 16:00:00 | Active | In Progress | John Jones |
| 208692 | 23.10.2015 12:00:00 | In Progress | Bill | John Jones |
| 208692 | 24.10.2015 12:00:00 | Bill | No Bill | John Jones |
| 297983 | 22.10.2015 12:00:00 | Bill | Active | John Jones |
| 297983 | 23.10.2015 12:00:00 | Active | In Progress | John Jones |
| 297983 | 23.10.2015 13:00:00 | In Progress | Active | John Jones |
| 297983 | 23.10.2015 14:00:00 | Active | Bill | John Jones |
| 697920 | 22.10.2015 12:00:00 | Bill | No Bill | Jeffery Sampson |
| 697920 | 22.10.2015 14:00:00 | No Bill | Active | Jeffery Sampson |
| 697920 | 22.10.2015 16:00:00 | Active | In Progress | Jeffery Sampson |
| 697920 | 23.10.2015 12:00:00 | In Progress | Bill | Jeffery Sampson |
| 697920 | 24.10.2015 12:00:00 | Bill | No Bill | Jeffery Sampson |
| 992196 | 22.10.2015 12:00:00 | Bill | Active | Jeffery Sampson |
| 992196 | 23.10.2015 12:00:00 | Active | In Progress | Jeffery Sampson |
| 992196 | 23.10.2015 13:00:00 | In Progress | Active | Jeffery Sampson |
| 992196 | 23.10.2015 14:00:00 | Active | Bill | Jeffery Sampson |
+----------------+---------------------+-------------+-------------+-----------------+
This doesn't quite work. It gets all changes, not just the ones that I am looking for. I based this off of Programming SQL in a Set-Based Way
If you have proper columns in your audit table, then I see it as prety straightforward query. Of course I expect that claimnumber will not be included here, but there will be some reference to original data and then it will be simple join. But maybe there is more to it depending on your audit table structure.
Another story will be to have index that will support this query.
And this might be another version if your audit table contains only one field containing status of the main record.
With result