This very simple piece of code,

# imports...
from lifelines import CoxPHFitter
import pandas as pd

src_file = "Pred.csv"

df = pd.read_csv(src_file, header=0, delimiter=',')
df = df.drop(columns=['score'])

cph = CoxPHFitter()
cph.fit(df, duration_col='Length', event_col='Status', show_progress=True)

produces an error:

Traceback (most recent call last):
  File "C:/Users/.../predictor.py", line 11, in <module>
    cph.fit(df, duration_col='Length', event_col='Status', show_progress=True)
  File "C:\Users\...\AppData\Local\conda\conda\envs\hrpred\lib\site-packages\lifelines\fitters\coxph_fitter.py", line 298, in fit
    self._check_values(df)
  File "C:\Users\...\AppData\Local\conda\conda\envs\hrpred\lib\site-packages\lifelines\fitters\coxph_fitter.py", line 323, in _check_values
    cols = str(list(X.columns[low_var]))
  File "C:\Users\...\AppData\Local\conda\conda\envs\hrpred\lib\site-packages\pandas\core\indexes\base.py", line 1754, in __getitem__
    result = getitem(key)
IndexError: boolean index did not match indexed array along dimension 0; dimension is 88 but corresponding boolean dimension is 76

However, when I print df itself, everything looks fine. As the traceback shows, the error is raised entirely inside the library, yet the library's own examples work fine.

There is 1 solution below

I don't know what your data look like, but I had the same error, and it went away when I removed every column except the duration, the event, and the covariate(s) from the pandas df I was using. I had a lot of extra columns in the df, and they confused the Cox PH fitter: cph.fit() has no argument for choosing which covariates to include, so it treats every column other than duration_col and event_col as a covariate.
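
As a rough sketch of that fix, the snippet below keeps only the duration, event, and chosen covariate columns before fitting. 'Length' and 'Status' come from the question; the covariate names ('age', 'dose'), the extra columns, the data values, and the helper names select_model_columns and fit_cox are all made up for illustration. My guess (not confirmed) is that the 88 vs. 76 mismatch comes from non-numeric columns being skipped when the variance check runs, so the helper also refuses non-numeric columns.

```python
import pandas as pd


def select_model_columns(df, duration_col, event_col, covariates):
    """Return a copy of df with only the columns the Cox model should see."""
    wanted = [duration_col, event_col, *covariates]
    missing = [c for c in wanted if c not in df.columns]
    if missing:
        raise KeyError(f"columns not found in DataFrame: {missing}")
    subset = df[wanted].copy()
    # Strings, dates, etc. cannot be used as covariates directly; flag them
    # here instead of letting the fitter fail with an obscure error.
    non_numeric = subset.select_dtypes(exclude=["number", "bool"]).columns.tolist()
    if non_numeric:
        raise TypeError(f"non-numeric columns must be encoded or dropped: {non_numeric}")
    return subset


def fit_cox(model_df, duration_col="Length", event_col="Status"):
    """Fit a Cox model; lifelines is imported lazily so the helper above works without it."""
    from lifelines import CoxPHFitter

    cph = CoxPHFitter()
    cph.fit(model_df, duration_col=duration_col, event_col=event_col)
    return cph


# Made-up stand-in for Pred.csv: two real covariates plus two columns that
# should never reach the fitter (an ID and a free-text field).
df = pd.DataFrame({
    "Length": [5, 8, 12, 3, 9, 7],
    "Status": [1, 0, 1, 1, 0, 1],
    "age": [61, 54, 70, 48, 66, 59],
    "dose": [1.0, 2.5, 1.5, 3.0, 2.0, 1.0],
    "patient_id": ["a1", "a2", "a3", "a4", "a5", "a6"],
    "notes": ["", "late", "", "", "moved", ""],
})

model_df = select_model_columns(df, "Length", "Status", ["age", "dose"])
print(model_df.dtypes)
```

With real data you would then call fit_cox(model_df). If a dropped column does matter to the model, it has to be converted to numbers first (for example with pd.get_dummies) rather than passed in as text.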