My question is regarding Python's sklearn.model_selection.train_test_split method (https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html), to be more precise: its return type.
How can I tell exactly what is returned just from reading this documentation? (It only says "splitting: list, length=2 * len(arrays), List containing train-test split of inputs.")
Without looking at the examples below I would not have known that it makes sense to call it like X_train, X_test, y_train, y_test = train_test_split(...). How would I have known that?
Furthermore, I was quite surprised that after passing in a data frame the result is no longer a data frame and the column names are gone.
Do you also have general advice on how to read Python documentation?
I don't think the function is limited to the examples. It accepts an arbitrary number of indexables and returns a list of "2 x number of indexables" elements, which is what "length=2 * len(arrays)" in the return description means.
So your output can also contain an arbitrary number of elements, e.g.
a, b, c, d, e, f = train_test_split(X, y, z)
also runs successfully. Regarding your other point, scikit-learn uses NumPy to speed up computations, as pandas is quite slow. However, I agree that the documentation would be better if it stated the return type more explicitly.
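To make the ordering of the returned list concrete, here is a minimal sketch with three toy NumPy arrays. The array contents, the names X, y, z and the test_size/random_state values are my own assumptions for illustration; only train_test_split itself comes from scikit-learn.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Three toy arrays with the same number of rows (10); contents are illustrative.
X = np.arange(20).reshape(10, 2)   # row i is [2*i, 2*i + 1]
y = np.arange(10)                  # y[i] == i
z = np.arange(10) * 10             # z[i] == 10 * i

# test_size=3 asks for exactly 3 test rows; random_state is an arbitrary seed.
parts = train_test_split(X, y, z, test_size=3, random_state=0)

# The list holds one (train, test) pair per input, in input order.
X_train, X_test, y_train, y_test, z_train, z_test = parts

print(len(parts))                   # 6, i.e. 2 * 3 inputs
print(X_train.shape, X_test.shape)  # (7, 2) (3, 2)

# The same shuffled row indices are applied to every input, so rows stay aligned.
print(np.array_equal(X_train[:, 0], 2 * y_train))  # True
print(np.array_equal(z_train, 10 * y_train))       # True
```

So the pattern is always input1_train, input1_test, input2_train, input2_test, and so on, which is why the usual X_train, X_test, y_train, y_test unpacking works for two inputs.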