Convert several columns to epoch pandas

301 Views Asked by At

I am trying to convert the following columns to epoch to prepare for machine learning the rest of my csv contains strings so I am assuming this is the best way to go, i tried to create a numpy array and transform it using datetime , etc but this did not work i have 4 columns i am trying to transform from dd/mm/yyyy to epoch ? I have tried this method

epoch_time = (1/2/2017 - datetime(1/1/1970)).total_seconds()

but i have 4 columns and i want to convert all of them thanks so much

state                       object
car_model                   object
car_make                    object
car_year                   float64
drive_date                  object
id                           int64
register_date       datetime64[ns]
profile_date                object
add_profile_date            object
dtype: object

id: 23432   state:ohio  car_model:ford car_make:fusion car_year:2016 drive_date:1/1/2017 register_date:12/25/2016 profile_date:12/25/2016 add_profile_date: 12/25/2016
1

There are 1 best solutions below

1
On BEST ANSWER

Try this:

Source DF:

In [173]: df
Out[173]:
      id state car_model car_make  car_year drive_date register_date profile_date add_profile_date
0  23432  ohio      ford   fusion      2016   1/1/2017    2016-12-25   12/25/2016       12/25/2016

In [174]: df.dtypes
Out[174]:
id                           int64
state                       object
car_model                   object
car_make                    object
car_year                     int64
drive_date                  object
register_date       datetime64[ns]
profile_date                object
add_profile_date            object
dtype: object

let's select date columns:

In [175]: date_cols = df.columns[df.columns.str.contains('_date')]

In [176]: date_cols
Out[176]: Index(['drive_date', 'register_date', 'profile_date', 'add_profile_date'], dtype='object')

first convert "string" dates to Pandas datetime, then convert it to UNIX epoch

In [177]: for col in date_cols:
     ...:     if df.dtypes.loc[col] == 'object':
     ...:         df[col] = pd.to_datetime(df[col])
     ...:     df[col] = df[col].astype(np.int64) // 10**9
     ...:

In [178]: df
Out[178]:
      id state car_model car_make  car_year  drive_date  register_date  profile_date  add_profile_date
0  23432  ohio      ford   fusion      2016  1483228800     1482624000    1482624000        1482624000

In [179]: df.dtypes
Out[179]:
id                   int64
state               object
car_model           object
car_make            object
car_year             int64
drive_date           int64
register_date        int64
profile_date         int64
add_profile_date     int64
dtype: object