Parse dates in DDMMMYYYY format as a date type in pandas

Currently my data has dates such as 01JAN2017. How do I make pandas understand this as a date type? I need the column to be a date type so that I can filter it for various time frames. I used the code below:

data=pd.read_csv(input_path + 'data.txt',sep='|', parse_dates=['week'])

But when I checked the data type of week, it still shows as object.

It would also be very helpful if you could point me to some links where I can read more about this.

There are 2 best solutions below

Best answer

You can use datetime.strptime() to parse a date string into a datetime object:

>>> from datetime import datetime
>>> datetime.strptime("01JAN2017", "%d%b%Y")
datetime.datetime(2017, 1, 1, 0, 0)
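
To make the format string a bit more concrete, here is a minimal sketch: %d is the two-digit day, %b the abbreviated month name, and %Y the four-digit year. The helper name parse_ddmmmyyyy is made up for illustration, and the example assumes an English locale, since %b is locale-dependent.

```python
from datetime import datetime


def parse_ddmmmyyyy(text):
    """Hypothetical helper: parse strings such as '01JAN2017'.

    Returns None when the text does not match the DDMMMYYYY pattern.
    """
    try:
        return datetime.strptime(text, "%d%b%Y")
    except ValueError:
        return None


print(parse_ddmmmyyyy("01JAN2017"))   # a datetime for 1 January 2017
print(parse_ddmmmyyyy("2017-01-01"))  # does not match the pattern
```

Returning None for bad input is only one possible choice; letting the ValueError propagate may be preferable if malformed dates should stop the load.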

Now, to make pandas recognize the format, you could add a date parser function:

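# pd.datetime was removed from pandas, so import datetime directly. date_parser is deprecated since pandas 2.0 in favor of date_format.
from datetime import datetime
dateparse = lambda dates: [datetime.strptime(d, "%d%b%Y") for d in dates]
df = pd.read_csv(infile, parse_dates=["week"], date_parser=dateparse)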

Another answer

In pandas, the to_datetime method is the natural choice:

In [11]: D = {'Date': '01JAN2017'}
In [12]: df = pd.DataFrame(D, index=[0])
In [13]: df
Out[13]: 
        Date
0  01JAN2017
In [14]: df.dtypes
Out[14]: 
Date    object
dtype: object

# The datetime column is read as a string
# Use to_datetime to convert non-standard datetime values

In [15]: df['Date'] = pd.to_datetime(df.Date, format='%d%b%Y')
In [16]: df.dtypes
Out[16]: 
Date    datetime64[ns]
dtype: object
In [17]: df['Date']
Out[17]: 
0   2017-01-01
Name: Date, dtype: datetime64[ns]
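
If some values in the column might not follow the DDMMMYYYY pattern, the same call can be made more forgiving. The following is only a sketch with made-up sample values: passing errors="coerce" makes to_datetime return NaT for anything it cannot parse instead of raising an error.

```python
import pandas as pd

# Made-up sample values; the last one does not match the DDMMMYYYY pattern.
raw = pd.Series(["01JAN2017", "15FEB2017", "not a date"])

# errors="coerce" turns values that fail to parse into NaT instead of raising.
parsed = pd.to_datetime(raw, format="%d%b%Y", errors="coerce")

print(parsed)
print(parsed.isna().sum())  # number of values that could not be parsed
```

Counting the NaT values afterwards is a quick way to check how many rows had a bad date before filtering on the column.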

The documentation for pd.read_csv actually recommends using to_datetime when the datetime format is not a standard one:

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html

If a column or index contains an unparseable date, the entire column or index will be returned unaltered as an object data type. For non-standard datetime parsing, use pd.to_datetime after pd.read_csv
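
Putting this together for the case in the question, a minimal sketch could look like the following. The in-memory sample and the sales column are assumptions standing in for the real data.txt; only the week column and the '|' separator come from the question.

```python
import io

import pandas as pd

# Made-up stand-in for the pipe-separated data.txt from the question.
sample = io.StringIO(
    "week|sales\n"
    "01JAN2017|10\n"
    "08JAN2017|12\n"
    "05FEB2017|7\n"
)

# Read the column as plain strings first, then convert with an explicit format.
data = pd.read_csv(sample, sep="|")
data["week"] = pd.to_datetime(data["week"], format="%d%b%Y")

# Keep only the rows that fall in January 2017.
january = data[(data["week"] >= "2017-01-01") & (data["week"] < "2017-02-01")]
print(january)
```

Once the column is a datetime type, the same kind of comparison works for any other time frame, and the .dt accessor (for example data["week"].dt.month) becomes available as well.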