I've recently stumbled upon an awesome new library, pendulum, that makes working with datetimes easier.
In pandas, there is the handy to_datetime() function, which converts a Series and other objects to datetimes:
raw_data['Mycol'] = pd.to_datetime(raw_data['Mycol'], format='%d%b%Y:%H:%M:%S.%f')
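For illustration, here is a minimal self-contained sketch of that call. The sample strings and the two-row frame are made up for the example; only the column name and the format string come from the line above.

```python
import pandas as pd

# Made-up sample data matching the '%d%b%Y:%H:%M:%S.%f' layout
raw_data = pd.DataFrame(
    {"Mycol": ["01Jan2017:12:30:45.123456", "15Jun2017:08:00:00.000000"]}
)

# Parse the strings into pandas' native datetime dtype
raw_data["Mycol"] = pd.to_datetime(raw_data["Mycol"], format="%d%b%Y:%H:%M:%S.%f")

first = raw_data["Mycol"].iloc[0]
print(raw_data["Mycol"].dtype)
print(type(first).__name__, first)
```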
What would be the canonical way to create a custom to_<something> method,
in this case a to_pendulum() method that could convert a Series of date strings directly to Pendulum objects?
This may lead to Series having various interesting capabilities like, for instance, converting a series of date strings to a series of "offsets from now" - human datetime diffs.
After looking through the API a bit, I must say I'm impressed with what they've done. Unfortunately, I don't think Pendulum and pandas can work together (at least, with the current latest version, v0.21).

The most important reason is that pandas does not natively support Pendulum as a datatype. The natively supported datatypes (np.int, np.float and np.datetime64) all support vectorisation in some form. With Pendulum objects, you are not going to get a shred of performance improvement using a dataframe over, say, a vanilla loop and list. If anything, calling apply on a Series with Pendulum objects is going to be slower (because of all the API overheads).

Another reason is that Pendulum is a subclass of datetime. This is important because, as mentioned above, datetime is a supported datatype, so pandas will attempt to coerce datetime to pandas' native datetime format, Timestamp.

So, with some difficulty (involving dtype=object), you could load Pendulum objects into dataframes. However, this is essentially useless, because calling any pendulum method (via apply) will now not only be super slow, but will also end up with the result being coerced to Timestamp again: an exercise in futility.
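The coercion described above can be sketched without installing pendulum. StandInDateTime below is a hypothetical datetime subclass that plays the role of a Pendulum object; it is not part of pendulum or pandas, and the sample dates are made up. What the final apply call returns may depend on your pandas version, so the sketch only prints it rather than assuming a result.

```python
from datetime import datetime

import pandas as pd


class StandInDateTime(datetime):
    """Hypothetical stand-in for a datetime subclass such as Pendulum's."""


values = [StandInDateTime(2017, 1, 1, 12, 30), StandInDateTime(2017, 6, 15, 8, 0)]

# Default construction: pandas recognises datetime instances (subclasses
# included) and converts them to its native datetime64 storage.
coerced = pd.Series(values)
print(coerced.dtype, type(coerced.iloc[0]).__name__)

# Forcing dtype=object keeps the original Python objects in the Series.
kept = pd.Series(values, dtype=object)
print(kept.dtype, type(kept.iloc[0]).__name__)

# Calling a method through apply runs one Python call per element (no
# vectorisation); inspect the dtype of the result on your own version.
shifted = kept.apply(lambda d: d.replace(year=2018))
print(shifted.dtype, type(shifted.iloc[0]).__name__)
```

If the last line reports a datetime64 dtype, the subclass was lost again on the way out of apply, which is the round trip the answer calls futile.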