I want to predict sequences using Keras's Sequential model. My dataframe contains string data, so I decided to use LabelEncoder
from the sklearn library to encode the strings as integers.
I tried this code snippet:
import pandas as pd
from sklearn.preprocessing import LabelEncoder

df = pd.read_csv("sample-03.csv")
# fit_transform is called once per column, so each column gets its own encoding
df.apply(LabelEncoder().fit_transform)
giving this result:
The label encoding is fitted on each column separately, so the same string gets a different code in different columns. What I need is one consistent encoding, e.g. http://example.com/296 should be represented as "2" across the whole dataset. I would be grateful for any suggestions.
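To make the problem concrete, here is a small reproduction. The two-column DataFrame and the column names `step1` and `step2` are made up for illustration and only stand in for the real sample-03.csv.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Made-up stand-in for sample-03.csv (hypothetical column names and values)
df = pd.DataFrame({
    "step1": ["http://example.com/296", "http://example.com/100"],
    "step2": ["http://example.com/300", "http://example.com/296"],
})

# Same call as in the question: a separate encoding is fitted for each column
per_column = df.apply(LabelEncoder().fit_transform)
print(per_column)

# The code assigned to http://example.com/296 in each column
code_in_step1 = per_column.loc[0, "step1"]
code_in_step2 = per_column.loc[1, "step2"]
print(code_in_step1, code_in_step2)
```

Here the same URL ends up with two different integer codes, because each column's labels are numbered from its own sorted set of values.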
I also tried converting the dataset to tuples and building a dictionary from it, but the result was the same: a given value still did not get the same key in every column.
I came up with a solution and would like to share it here.
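A minimal sketch of one way to get a shared encoding, assuming every column holds strings drawn from the same vocabulary: fit a single LabelEncoder on all values of the DataFrame at once, then reuse that fitted encoder for each column. The small DataFrame and the column names `step1` and `step2` are made up for illustration; other approaches (for example a hand-built dictionary over all unique values) would work as well.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Made-up stand-in for sample-03.csv (hypothetical column names and values)
df = pd.DataFrame({
    "step1": ["http://example.com/296", "http://example.com/100"],
    "step2": ["http://example.com/300", "http://example.com/296"],
})

# Fit ONE encoder on every value in the frame, flattened to a 1-D array
encoder = LabelEncoder()
encoder.fit(df.to_numpy().ravel())

# Transform each column with the same fitted encoder
encoded = df.apply(encoder.transform)
print(encoded)

# The mapping can be reversed later with the same encoder
decoded = encoded.apply(encoder.inverse_transform)
```

Because the encoder sees the whole vocabulary before any column is transformed, a given string maps to the same integer wherever it appears. Keeping the fitted `encoder` around also makes it possible to decode the model's predictions back to the original strings.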