In a DataFrame, I have multiple Date columns as below.
+---+----------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+
| | order_id | dateColumn1 | dateColumn2 | dateColumn3 | dateColumn4 | dateColumn5 | dateColumn6 | dateColumn7 |
+---+----------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+
| 0 | 22161577 | 2019-03-26 | | | 2019-03-24 | | 2019-03-23 | 2019-03-26 |
| 1 | 22719649 | 2019-05-22 | | 2019-05-21 | 2019-05-28 | | 2019-05-24 | 2019-05-28 |
| 2 | 23693247 | 2019-08-13 | | 2019-08-02 | 2019-08-13 | | 2019-08-10 | 2019-08-07 |
| 3 | 24269635 | 2019-09-23 | | 2019-10-04 | 2019-09-25 | | 2019-10-04 | |
| 4 | 24887829 | 2019-11-12 | | 2020-11-27 | 2020-11-27 | | 2020-11-27 | |
| 5 | 25325867 | 2019-12-19 | | 2019-12-19 | 2019-12-19 | | 2019-12-19 | |
| 6 | 26096549 | 2020-03-18 | | 2020-06-25 | 2020-04-20 | | 2020-06-25 | 2020-06-25 |
+---+----------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+
Trying to find the latest date among the given dates in each row and update it on a new column as below. In few cases, dateColumns 2 and 5 also have dates.
+---+----------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+------------+
| | order_id | dateColumn1 | dateColumn2 | dateColumn3 | dateColumn4 | dateColumn5 | dateColumn6 | dateColumn7 | latestDate |
+---+----------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+------------+
| 0 | 22161577 | 2019-03-26 | | | 2019-03-24 | | 2019-03-23 | 2019-03-26 | 2019-03-26 |
| 1 | 22719649 | 2019-05-22 | | 2019-05-21 | 2019-05-28 | | 2019-05-24 | 2019-05-28 | 2019-05-28 |
| 2 | 23693247 | 2019-08-13 | | 2019-08-02 | 2019-08-13 | | 2019-08-10 | 2019-08-07 | 2019-08-13 |
| 3 | 24269635 | 2019-09-23 | | 2019-10-04 | 2019-09-25 | | 2019-10-04 | | 2019-10-04 |
| 4 | 24887829 | 2019-11-12 | | 2020-11-27 | 2020-11-27 | | 2020-11-27 | | 2020-11-27 |
| 5 | 25325867 | 2019-12-19 | | 2019-12-19 | 2019-12-19 | | 2019-12-19 | | 2019-12-19 |
| 6 | 26096549 | 2020-03-18 | | 2020-06-25 | 2020-04-20 | | 2020-06-25 | 2020-06-25 | 2020-06-25 |
+---+----------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+------------+
Do we’ve any in-build function in Pandas DataFrame to find the latest date, comparing the multiple date columns. Verified this question. Yet, looking for alternative method or function to compare multiple date column values directly and get the most recent date. The DataFrame contains approximately 25K rows.
following function should serve the purpose.
function will check dates (row wise) for latest one and writes into a new column.
i have checked it at online, have a visit if need, Online Editor.