I have a dataframe with several columns, and I selected some of them to create a variable like this:
xtrain = df[['Age', 'Fare', 'Group_Size', 'deck', 'Pclass', 'Title']]
I want to drop from these columns all rows where the Survive column in the main dataframe is NaN.
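You can build a boolean mask from notnull() on the 'Survive' column of your df and select the columns of interest in the same step. Pass the mask to loc as the row selector, together with the list of columns, to keep only the rows where 'Survive' is not NaN.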
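As a minimal sketch of that approach: the dataframe below is made-up toy data (the real df is not shown in the question), and the names cols and mask are only illustrative. The column names are taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real dataframe
df = pd.DataFrame({
    "Age": [22.0, 38.0, 26.0, 35.0],
    "Fare": [7.25, 71.28, 7.92, 53.10],
    "Group_Size": [1, 2, 1, 2],
    "deck": ["U", "C", "U", "C"],
    "Pclass": [3, 1, 3, 1],
    "Title": ["Mr", "Mrs", "Miss", "Mrs"],
    "Survive": [0.0, np.nan, 1.0, np.nan],
})

cols = ["Age", "Fare", "Group_Size", "deck", "Pclass", "Title"]

# True where 'Survive' has a value, False where it is NaN
mask = df["Survive"].notnull()

# Row selector = mask, column selector = columns of interest
xtrain = df.loc[mask, cols]
```

Here xtrain keeps the original index labels of the surviving rows. If you prefer, df.dropna(subset=["Survive"])[cols] should give the same rows; the loc form just does the row and column selection in one indexing call.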