Pandas: How to filter column information in Dataframe and process it differently

61 Views Asked by At

I have a data frame representing the schedule of some restaurants in a week.

  • What i want to do is to add a column is_OpenSaturday to my initial Dataframe df that represents If the restaurant is open on Saturdays (boolean value: 0 or 1)

note : the value 0 or 0:0-0:0 simply means that the restaurant is closed that day.

data = {
        'restaurant_id':  ['1', '2','3'],
        'Monday':  ['11:0-20:0', '11:30-22:0','11:30-21:0'],
        'Tuesday':  ['11:0-20:0', '11:30-22:0','11:30-22:0'],
        'Wednesday':  ['11:0-20:0', '11:30-22:0','11:30-21:0'],
        'Thursday':  ['11:0-20:0', '11:30-22:0','11:30-21:0'],
        'Friday':  ['11:0-22:0', '11:30-22:0','11:30-21:0'],
        'Saturday':  ['11:0-22:0', '0:0-0:0',0],
        'Sunday':  ['11:0-17:0', '16:30-21:30','11:30-21:0',],

        }

df = pd.DataFrame (data, columns = ['restaurant_id','Monday','Tuesday','Wednesday','Thursday','Friday','Saturday','Sunday',])

What could be a simple syntax to achieve this?

3

There are 3 best solutions below

0
On BEST ANSWER

(One of) the (syntactically) simplest way is to apply a lambda function on elements of the Saturday column:

df["is_OpenSaturday"] = df["Saturday"].apply(lambda el: 0 if el in ('0:0-0:0', '0') else 1)

Output:

df["is_OpenSaturday"]
Out[5]: 
0    1
1    0
2    0
Name: is_OpenSaturday, dtype: int64

The lambda function says that if its string value is '0:0-0:0' or '0', then the restaurant is closed (return 0). Otherwise it is open (return 1).

Note that one must use df["is_OpenSaturday"] = but not df.is_OpenSaturday = on the left hand side. The is_OpenSaturday column will be created automatically only using the [] operator.

Also note that, efficiency-wise, .apply() is the last option you would like to try. A list(map()) construct is better in general if you have a lot of data. The equivalent syntax is shown as follows.

df["is_OpenSaturday"] = list(map(lambda el: 0 if el in ('0:0-0:0', '0') else 1, df["Saturday"]))
0
On

Use regex to check 0:0-0:0 or any string that begins with zero

df['is_OpenSaturday']=np.where(df.Saturday.str.contains('0\:0\-0\:0|^0'),0,1)

    restaurant_id      Monday     Tuesday   Wednesday    Thursday      Friday  \
0             1   11:0-20:0   11:0-20:0   11:0-20:0   11:0-20:0   11:0-22:0   
1             2  11:30-22:0  11:30-22:0  11:30-22:0  11:30-22:0  11:30-22:0   
2             3  11:30-21:0  11:30-22:0  11:30-21:0  11:30-21:0  11:30-21:0   

    Saturday       Sunday  is_OpenSaturday  
0  11:0-22:0    11:0-17:0                1  
1    0:0-0:0  16:30-21:30                0  
2          0   11:30-21:0                0  
0
On

Use str.findall

df.Saturday.str.findall(r'[^0:-]').astype(bool).astype(int)

Out[267]:
0    1
1    0
2    0
Name: Saturday, dtype: int32

Or

df.Saturday.str.findall(r'[1-9]').astype(bool).astype(int)

Out[276]:
0    1
1    0
2    0
Name: Saturday, dtype: int32