I have been given the task of building a sentiment prediction model based on movie reviews. Along with the feature `movie_review`, the train dataset contains other features such as `movie_name`, `release_date`, etc., and the target sentiment (positive or negative).
I have vectorized the feature `movie_review` using sklearn's `TfidfVectorizer()`. Now I am trying to concatenate two dataframes:
- The `train` dataset, which is of shape (156311, 5)
- The dataframe I got as the output of `TfidfVectorizer()` after vectorizing the `movie_review` feature column of the `train` dataset. The shape of this dataframe is (156311, 65220). It was produced roughly as in the sketch below.
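For reference, the vectorization step looks roughly like this (variable names are illustrative; I converted the sparse output to a dense dataframe):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
import pandas as pd

vectorizer = TfidfVectorizer()

# fit_transform returns a scipy.sparse matrix of shape (156311, 65220)
tfidf_matrix = vectorizer.fit_transform(train['movie_review'])

# densify it into a DataFrame so it can be concatenated with `train`
review_vectorized = pd.DataFrame(
    tfidf_matrix.toarray(),
    columns=vectorizer.get_feature_names_out(),
)
```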
In order to concatenate the two dataframes, I use the following call:

```python
pd.concat([train, review_vectorized], axis=1)
```
The problem is that every time I try to run this, RAM runs out and Google Colab crashes.
Is there a more efficient way of concatenating dataframes? Or even better, is there a way to vectorize the text column 'in place', so that we wouldn't need to create a separate dataframe with the vectorized text and then concatenate it with the original dataframe?
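I have seen suggestions to avoid densifying at all and instead keep everything sparse, e.g. with `scipy.sparse.hstack`. A minimal sketch of what I imagine that would look like (the `select_dtypes` filtering of the non-text columns is my own guess, untested):

```python
from scipy.sparse import hstack, csr_matrix

# keep the TF-IDF output sparse and stack the remaining
# numeric columns of `train` next to it, column-wise
other_features = train.drop(columns=['movie_review']).select_dtypes(include='number')
combined = hstack([csr_matrix(other_features.values), tfidf_matrix])

# `combined` is a sparse matrix, not a DataFrame -- is that acceptable
# for downstream sklearn models, or do I lose something by not having
# a DataFrame here?
```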