I have a 3rd party clocking application which we use to track our staff's movements. It runs on a MariaDB, but it is pretty messy (imo). The frontend is windows based and all the processing happens there. Problem is I would like to extract some data for realtime reporting.
All clocking activities are saved to a single table. See sample below.
INDX _DATE_ _TIME_ SURNAME WIEGANDID ZKINOUT
51 2018/09/03 5:52:04 Thakadu 000000000000AE 0
198 2018/09/03 14:04:29 Thakadu 000000000000AE 1
309 2018/09/03 21:54:06 Mabeo 000000000000BA 0
370 2018/09/04 5:47:20 Thakadu 000000000000AE 0
401 2018/09/04 6:00:09 Mabeo 000000000000BA 1
557 2018/09/04 14:04:57 Thakadu 000000000000AE 1
691 2018/09/04 21:53:33 Mabeo 000000000000BA 0
748 2018/09/05 5:47:20 Thakadu 000000000000AE 0
780 2018/09/05 6:00:34 Mabeo 000000000000BA 1
946 2018/09/05 14:05:32 Thakadu 000000000000AE 1
1089 2018/09/05 21:49:48 Mabeo 000000000000BA 0
1144 2018/09/06 5:50:41 Thakadu 000000000000AE 0
1174 2018/09/06 6:00:16 Mabeo 000000000000BA 1
1328 2018/09/06 14:09:28 Thakadu 000000000000AE 1
1482 2018/09/06 21:50:32 Mabeo 000000000000BA 0
1568 2018/09/07 5:58:48 Thakadu 000000000000AE 0
1555 2018/09/07 6:01:01 Mabeo 000000000000BA 1
1812 2018/09/07 14:05:47 Thakadu 000000000000AE 1
1845 2018/09/07 21:51:31 Mabeo 000000000000BA 0
The challenge comes on where Mabeo works from 22:00:00 till 06:00:00 the next morning. Also, sometime staff don't clock out for some reason, the system then automatically logs them out, without logging a time.
The result I would like to see looks something like this.
DATE_IN TIME_IN DATE_OUT TIME_OUT SURNAME WIEGANDID
2018/09/03 05:52:04 2018/09/03 14:04:29 Thakadu 000000000000AE
2018/09/03 21:54:06 2018/09/04 06:00:09 Mabeo 000000000000BA
2018/09/04 05:47:20 2018/09/04 14:04:57 Thakadu 000000000000AE
2018/09/04 21:53:33 2018/09/05 06:00:16 Mabeo 000000000000BA
This way I can work out the actual time each emply was clocked in over a period of time.
I was able to get some success using GROUP BY and CASE, but the problem comes in on the employee working night shift.
Any help would be appreciated.
---------------------------------UPDATE-------------------------------
OK, so thank you very much to all who contributed. I've almost got the answer, except not yet 100% there. I've used the following code as suggested by @rf1234, thank you for the answer.
SELECT COALESCE (a.DATE_IN, b.DATE_IN) AS DATE_IN,
SUBSTR(COALESCE (a.TIME_IN, b.TIME_IN), 1, 8) AS TIME_IN,
CASE WHEN ( b.DATE_TIME_OUT > a.DATE_TIME_IN ) THEN COALESCE (a.DATE_OUT,
b.DATE_OUT) ELSE '' END AS DATE_OUT,
CASE WHEN ( b.DATE_TIME_OUT > a.DATE_TIME_IN ) THEN
SUBSTR(COALESCE (a.TIME_OUT, b.TIME_OUT), 1, 8) ELSE '' END
AS TIME_OUT,
COALESCE (a.SURNAME, b.SURNAME) AS SURNAME,
COALESCE (a.WIEGANDID, b.WIEGANDID) AS WIEGANDID,
CASE WHEN ( b.DATE_TIME_OUT > a.DATE_TIME_IN ) THEN
TIMESTAMPDIFF(SECOND, a.DATE_TIME_IN, b.DATE_TIME_OUT) / 3600 ELSE '' END
AS HOURS_WORKED
FROM (
SELECT _DATE_ AS DATE_IN,
_TIME_ AS TIME_IN,
NULL AS DATE_OUT,
NULL AS TIME_OUT,
SURNAME,
WIEGANDID,
CONCAT( _DATE_, ' ', _TIME_ ) AS DATE_TIME_IN
FROM _2018_09
WHERE ZKINOUT = 0 AND SURNAME = 'MABEO'
GROUP BY WIEGANDID, _DATE_ ) AS a,
(
SELECT NULL AS DATE_IN,
NULL AS TIME_IN,
_DATE_ AS DATE_OUT,
_TIME_ AS TIME_OUT,
SURNAME,
WIEGANDID,
CONCAT( _DATE_, ' ', _TIME_ ) AS DATE_TIME_OUT
FROM _2018_09
WHERE ZKINOUT = 1 AND SURNAME = 'MABEO'
GROUP BY WIEGANDID, _DATE_) AS b
WHERE a.WIEGANDID = b.WIEGANDID
ORDER BY 1, 2, 3, 4
I've modified the code to only select a single employee, for now. The result I get is almost what I want, except it seems to join each record of table a to each record of table b? Below is image of a sample of the result.
As master ArSuKa is saying you could of course have this a lot easier by redesigning the database. But let's see whether we can get your result just with what you have. We also calculate the hours worked (float number) in case the shift has already ended by diffing the two timestamps in the temporary tables. Using temporary DATE_TIME columns resolves the issue with the employee working night shift. The calculation is right regardless of whether the shift ends on a different day.
This looks a bit complicated and it does the following. Two temporary tables a and b are created. These are the SELECT statements in (). One temporary table contains the login records and the other one the log out records. The respective "missing" fields are set to NULL.
Then these tables can be joined with each other using WIEGANDID. COALESCE makes sure the null values are filtered out and are replaced with the values from the other temporary table that contains the right value. The check for DATE_TIME_OUT to be greater than DATE_TIME_IN makes sure DATE_OUT and TIME_OUT stay empty if the respective worker hasn't logged out yet.
I added a new column to the result table: HOURS_WORKED. It contains the hours worked as a float number. Shifts that haven't ended yet are ignored in the results set (HOURS_WORKED <> ''). Too eliminate duplicates and avoid too much complexity we filter out resultset records whith HOURS_WORKED > 15 because that doesn't occur in reality and there is usually enough time between two shifts that we don't catch two short shifts. Not a general solution but one that works for this particular application.
You could use the above statement in a CREATE VIEW statement. Then you could work with the view instead of the original table which would help reduce complexity.
The picture below shows the result of the query above based on all of your sample records.