Here is my data set:
import pandas as pd
data = {'Name': ['Tom', 'Nick','Jack', 'Ann', 'Jane'],
'group1': ['SRE_high_0101240243', 'ERS_med_140124065', 'SRE_low_110124084' , 'SRE_high_05022484', 'CER_med_11022437023']}
df = pd.DataFrame(data)
df
I want to extract the first 15 or 14 characters from the 'group1' column, depending on the length of the string. If the string in 'group1' is 19 characters long, extract the first 15 characters; otherwise, extract the first 14 characters.
Here's my failed attempt:
def clean_substr(df):
    if df['group1'].str.len() == 19:
        val = df['group1'].str.slice(0,15)
    elif df['group1'].str.len() != 19:
        val = df['group1'].str.slice(0,14)
    else:
        val = "issue"
    return val
df['group1_clean'] = df.apply(clean_substr)
display(df)
I'm getting an error when I run this code and I'm not sure what's making it fail. Any help will be greatly appreciated. Thanks

df.apply(clean_substr) applies the function to every column, so inside the clean_substr code your df is actually a Series whose index is the same as df's. There, df['group1'] would throw a KeyError exception. You can do a conditional select:
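
One possible way to write that conditional select is sketched below. It uses pandas' Series.where rather than apply; the helper name is_long is only illustrative, and other vectorized approaches (such as numpy.where) would likely work just as well.

```python
import pandas as pd

data = {'Name': ['Tom', 'Nick', 'Jack', 'Ann', 'Jane'],
        'group1': ['SRE_high_0101240243', 'ERS_med_140124065', 'SRE_low_110124084',
                   'SRE_high_05022484', 'CER_med_11022437023']}
df = pd.DataFrame(data)

# Boolean mask: True where the string is exactly 19 characters long
# (is_long is an illustrative name, not from the question).
is_long = df['group1'].str.len() == 19

# Keep the 15-character prefix where the mask is True,
# and fall back to the 14-character prefix everywhere else.
df['group1_clean'] = df['group1'].str.slice(0, 15).where(
    is_long, df['group1'].str.slice(0, 14)
)

print(df)
```

Because the comparison and both slices operate on the whole column at once, no row-by-row function is needed, and the ambiguous "truth value of a Series" problem in the original if statement never comes up.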