I am trying to create a function that returns only the rows for a particular month and year:

df

order_date      Type
2015-01-01      A
2017-09-01      A
2016-12-19      C
2019-11-23      D
2018-10-29      B
2017-12-31      B
2015-11-30      A
2015-08-30      B
2015-09-24      D
2015-01-27      E

Defining function

def return_data_month_year(month, year):
    month = pd.to_datetime(month).month()
    year = pd.to_datetime(year).year()
    df = df[((df['order_date']).dt.strftime('%B') == month)&((df['order_date']).dt.strftime('%Y') == 
    year)]
    return df

Calling function

  return_data_month_year('Jan','2015')

Expected output:

order_date      Type
2015-01-01      A
2015-01-27      E

I am getting this error (output):

   OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1-01-01 00:00:00

Best answer

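You don't need the lines month = pd.to_datetime(month).month() and year = pd.to_datetime(year).year(). Parsing 'Jan' on its own gives a date in year 1, which is outside the range a pandas timestamp can represent, hence the OutOfBoundsDatetime error. Note also that month and year are attributes of a Timestamp, not methods, so they take no parentheses. Finally, pass df in as an argument: assigning to df inside the function makes it a local name there.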

Also, '%B' returns the full month name, e.g. January. To get only the abbreviation (Jan, Feb, ...), use '%b':

import pandas as pd

def return_data_month_year(df, month, year):
    # Compare the abbreviated month name ('%b') and the four-digit year ('%Y') as strings
    dates = df['order_date']
    return df[(dates.dt.strftime('%b') == month) & (dates.dt.strftime('%Y') == year)]

# Convert column 'order_date' to datetime so the .dt accessor works:
df['order_date'] = pd.to_datetime(df['order_date'])

print(return_data_month_year(df, 'Jan', '2015'))

Prints:

  order_date Type
0 2015-01-01    A
9 2015-01-27    E
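
As a possible alternative to comparing formatted strings, the filter can compare the numeric month and year directly. The sketch below is self-contained and rebuilds the sample data from the question; the helper name rows_for_month_year is hypothetical, and it assumes the month is given as an English abbreviation such as 'Jan'. Parsing the month and year together avoids the year-1 date that triggered the original error.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "order_date": ["2015-01-01", "2017-09-01", "2016-12-19", "2019-11-23",
                   "2018-10-29", "2017-12-31", "2015-11-30", "2015-08-30",
                   "2015-09-24", "2015-01-27"],
    "Type": ["A", "A", "C", "D", "B", "B", "A", "B", "D", "E"],
})
df["order_date"] = pd.to_datetime(df["order_date"])


def rows_for_month_year(frame, month, year):
    """Hypothetical helper: keep rows whose order_date is in the given month and year.

    month is an abbreviated month name such as 'Jan'; year is a string such as '2015'.
    """
    # Parse month and year together, so the result is a valid Timestamp
    # (here 2015-01-01) rather than a date in year 1.
    target = pd.to_datetime(f"{month} {year}", format="%b %Y")
    dates = frame["order_date"]
    # .month and .year are attributes, so no parentheses.
    mask = (dates.dt.month == target.month) & (dates.dt.year == target.year)
    return frame[mask]


result = rows_for_month_year(df, "Jan", "2015")
print(result)
```

With the sample data this selects the rows at index 0 and 9, the same rows as the strftime version. Comparing integers keeps the filter itself independent of how month names are spelled; only the parsing step depends on the abbreviation format.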