Create new columns and calculate values based on condition with date in Python

346 Views Asked by At

I need to create a new column as Billing and Non-Billing based on the Date column.

Condition for Column 1 : If the Start Date is NULL OR BLANK (OR) if its Start Date is in 'Future Date' (OR) if its Starts Date is in 'Past Date' (OR) if its End Date is in Past Date then I should create a new column as Non-Billing.

Condition for columns 2: If the Start Date is in 'Current Date' then need to create a new column as 'Billable' and need to calculate it. Calculation should be in row axis.

Calculation for Billing in row: Billing = df[Billing] * sum/168 * 100

Calculation for Non-Billing in row: Non-Billing = df[Non-Billing] * sum/ 168 * 100

Data:

Employee Name  |    Java  |  Python  | .NET  |  React  |  Start Date   |  End Date    |                                

|Anu           |    10    |     10   |   5   |     5   |  04-21-2021   |              |                                 
|Kalai         |          |     10   |       |     5   |  04-21-2021   |  10-31-2021  |                                 
|Smirthi       |          |     10   |   20  |         |  03-21-2021   |              |                               
|Madhu         |    20    |     10   |   10  |         |  01-12-2021   |              |
|Latha         |    40    |          |   5   |         |               |              |                                 

Input

Input

Output

Output

Code:

# Adding new columns 
total=df.sum(axis=1) 
df.insert(len(df.columns),column='Total',value=total) 
    
# Adding Utilization column utilization = (total/168) 
df.insert(len(df.columns), column='Utilization', value=utilization) 
    
# Filter dataframe using groupby 
df1 = df.groupby(['Employee Name']).sum(min_count=1) 
df1['Available'] = 168
1

There are 1 best solutions below

1
On

I don't understand the conditions very well as there seem to be some inconsistencies but I believe this will help you getting started:

import pandas as pd 
import numpy as np
import datetime

df['Total'] = df.sum(axis=1)
df['Available']=168
df['Amount']=df['Total']/df['Available']*100
df['Billing']=np.NaN
df['NonBilling']=np.NaN
df.loc[df['Start Date']==datetime.date.today(),'Billing']= df['Amount']
df.loc[df['Start Date']!=datetime.date.today(),'NonBilling']= df['Amount']

NOTES:

  1. make sure about the date type to compare against today's date, if your date are being loaded as objects you may want to do something like this after loading:

    df['Start Date']= pd.to_datetime(df['Start Date']).dt.date

  2. work out the conditions for Billing/NonBilling to make sure the columns are being populated as intended