In a DataFrame, I have multiple Date columns as below.
+---+----------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+
| | order_id | dateColumn1 | dateColumn2 | dateColumn3 | dateColumn4 | dateColumn5 | dateColumn6 | dateColumn7 |
+---+----------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+
| 0 | 22161577 | 2019-03-26 | | | 2019-03-24 | | 2019-03-23 | 2019-03-26 |
| 1 | 22719649 | 2019-05-22 | | 2019-05-21 | 2019-05-28 | | 2019-05-24 | 2019-05-28 |
| 2 | 23693247 | 2019-08-13 | | 2019-08-02 | 2019-08-13 | | 2019-08-10 | 2019-08-07 |
| 3 | 24269635 | 2019-09-23 | | 2019-10-04 | 2019-09-25 | | 2019-10-04 | |
| 4 | 24887829 | 2019-11-12 | | 2020-11-27 | 2020-11-27 | | 2020-11-27 | |
| 5 | 25325867 | 2019-12-19 | | 2019-12-19 | 2019-12-19 | | 2019-12-19 | |
| 6 | 26096549 | 2020-03-18 | | 2020-06-25 | 2020-04-20 | | 2020-06-25 | 2020-06-25 |
+---+----------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+
Trying to find the latest date among the given dates in each row and update it on a new column as below. In few cases, dateColumns 2 and 5 also have dates.
+---+----------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+------------+
| | order_id | dateColumn1 | dateColumn2 | dateColumn3 | dateColumn4 | dateColumn5 | dateColumn6 | dateColumn7 | latestDate |
+---+----------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+------------+
| 0 | 22161577 | 2019-03-26 | | | 2019-03-24 | | 2019-03-23 | 2019-03-26 | 2019-03-26 |
| 1 | 22719649 | 2019-05-22 | | 2019-05-21 | 2019-05-28 | | 2019-05-24 | 2019-05-28 | 2019-05-28 |
| 2 | 23693247 | 2019-08-13 | | 2019-08-02 | 2019-08-13 | | 2019-08-10 | 2019-08-07 | 2019-08-13 |
| 3 | 24269635 | 2019-09-23 | | 2019-10-04 | 2019-09-25 | | 2019-10-04 | | 2019-10-04 |
| 4 | 24887829 | 2019-11-12 | | 2020-11-27 | 2020-11-27 | | 2020-11-27 | | 2020-11-27 |
| 5 | 25325867 | 2019-12-19 | | 2019-12-19 | 2019-12-19 | | 2019-12-19 | | 2019-12-19 |
| 6 | 26096549 | 2020-03-18 | | 2020-06-25 | 2020-04-20 | | 2020-06-25 | 2020-06-25 | 2020-06-25 |
+---+----------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+------------+
Do we’ve any in-build function in Pandas DataFrame to find the latest date, comparing the multiple date columns. Verified this question. Yet, looking for alternative method or function to compare multiple date column values directly and get the most recent date. The DataFrame contains approximately 25K rows.
df:
get max date in each row (in only date columns):
df: