how do i extract the multiple pattern in a column value and put it in new dataframe

36 Views Asked by At

I am unexperienced in coding, need help in this simple code.

I can’t split Index, expirydate, strike and opt type.

I am stuck in splitting expiry date.

**Sample 1**
data = ["NIFTY2431322000PE", "NIFTY2441522000PE"]
df = pd.DataFrame({'details': data})

# Regular expression pattern to match the desired format
pattern = r'([A-Z]+)(\d{2})(\d{1})(\d{2})(\d{5})([A-Z]+)'
df[['Instrument', 'Year', 'month', 'Date', 'strike', 'CEorPE']] = df['details'].str.extract(pattern)
RESULT
             details Instrument Year month Date strike CEorPE
0  NIFTY2431322000PE      NIFTY   24     3   13  22000     PE
1  NIFTY2441522000PE      NIFTY   24     4   15  22000     PE
SAMPLE 2
data = ["NIFTY24MAR22000PE", "NIFTY24DEC22000PE"]
df = pd.DataFrame({'details': data})

pattern = r'([A-Z]+)(\d{2})([A-Z]+)(\d{5})([A-Z]+)'
df[['Instrument', 'Year','month', 'strike', 'CEorPE']] = df['details'].str.extract(pattern)
df['date'] = '30'
Result
             details Instrument Year month strike CEorPE date
0  NIFTY24MAR22000PE      NIFTY   24   MAR  22000     PE   30
1  NIFTY24DEC22000PE      NIFTY   24   DEC  22000     PE   30

I can’t figure out how to run on dataframe which have multiple variation

data = ["NIFTY2431322000PE", "NIFTY24DEC22000PE"]

If possible I want expiry date in 1 column..

thank you in advance

1

There are 1 best solutions below

5
mozway On BEST ANSWER

You could use:

pattern = r'([A-Z]+)(\d{2})(\d{1}|[A-Z]+)(\d{2})?(\d{5})([A-Z]+)'
df[['Instrument', 'Year', 'month', 'date', 'strike', 'CEorPE']
  ] =  df['details'].str.extract(pattern).fillna({'date': 30})

Same with named capturing groups:

pattern = r'(?P<Instrument>[A-Z]+)(?P<Year>\d{2})(?P<month>\d{1}|[A-Z]+)(?P<date>\d{2})?(?P<strike>\d{5})(?P<CEorPE>[A-Z]+)'
df[['Instrument', 'Year', 'month', 'date', 'strike', 'CEorPE']] =  df['details'].str.extract(pattern).fillna({'date': 30})

Output:

             details Instrument Year month date strike CEorPE
0  NIFTY2431322000PE      NIFTY   24     3   13  22000     PE
1  NIFTY24DEC22000PE      NIFTY   24   DEC   30  22000     PE

regex demo