pyjanitor.pivot_longer: Unpivot multiple sets of columns with common prefix

Reprex csv:

col,ref_number_of_rows,ref_count,ref_unique,cur_number_of_rows,cur_count,cur_unique
region,2518,2518,42,212,212,12
country,2518,2518,6,212,212,2
year,2518,2518,15,212,212,15

I want to unpivot the dataset so that a type column contains the prefix of each column name (cur or ref). My attempt below handles the rest of each name correctly, but it does not use the part before the first _ to fill the type column.

column_summary_frame \
    .pivot_longer(
        column_names="*_*",
        names_to=("type", ".value"),
        names_sep=r"^[^_]+(?=_)")

prayner On
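import janitor  # registers pivot_longer on pandas DataFrames

# column_summary_frame is the DataFrame read from the reprex csv above.
column_summary_frame \
    .pivot_longer(
        column_names="*_*",
        names_to=("type", ".value"),
        # Split only on the underscore that directly follows the ref/cur prefix.
        names_sep=r"(?<=ref|cur)_")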

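I forgot that names_sep has to match the separator itself, i.e. the exact text where the split should occur, not the part of the name you want to keep. In this case that is the first underscore of each column name, which the lookbehind (?<=ref|cur) pins down as the underscore directly after the prefix.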
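
To see the difference between the two separators, they can be compared on the reprex column names with Python's re.split. This is only a sketch under two assumptions: that names_sep splits each column name in roughly the way a regex split does, and that the last csv column is meant to be cur_unique. It uses the standard library rather than pyjanitor, and split_names is a hypothetical helper written for this illustration.

```python
import re

# Column names from the reprex (last column assumed to be cur_unique).
columns = [
    "ref_number_of_rows", "ref_count", "ref_unique",
    "cur_number_of_rows", "cur_count", "cur_unique",
]

question_sep = r"^[^_]+(?=_)"  # matches the prefix itself, e.g. "ref"
answer_sep = r"(?<=ref|cur)_"  # matches the underscore right after the prefix


def split_names(pattern, names):
    """Hypothetical helper: map each name to the pieces re.split produces."""
    return {name: re.split(pattern, name) for name in names}


if __name__ == "__main__":
    for label, sep in (("question", question_sep), ("answer", answer_sep)):
        print(label)
        for name, parts in split_names(sep, columns).items():
            print(f"  {name!r} -> {parts}")
```

The question's pattern consumes the prefix, so each name splits into an empty string plus the remainder (for example ['', '_count']) and nothing is left to fill type. The answer's pattern consumes only the underscore, leaving the prefix and the metric name (for example ['ref', 'number_of_rows']).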