I need to plot some data series on the x axis, and I have a CSV file whose "Start time" column is full of dates.
As I work with DataFrames, I use the pandas library to manipulate the data.
My datetime data is:
Input:
print(paradas["Start time"])
Output:
0 31/12/2020 00:13:30
1 30/12/2020 19:30:00
2 30/12/2020 19:01:45
3 30/12/2020 19:00:10
4 30/12/2020 18:55:35
...
10704 02/01/2020 08:37:33
10705 02/01/2020 08:32:33
10706 02/01/2020 08:28:03
10707 02/01/2020 08:19:03
10708 31/12/2019 02:41:01
Name: Start time, Length: 10709, dtype: object
As I am working with time data, I convert all the timestamps in the column to datetime64[ns]:
Input:
paradas["Start time"]=pd.to_datetime(paradas["Start time"])
print(paradas["Start time"])
Output:
0 2020-12-31 00:13:30
1 2020-12-30 19:30:00
2 2020-12-30 19:01:45
3 2020-12-30 19:00:10
4 2020-12-30 18:55:35
...
10704 2020-02-01 08:37:33
10705 2020-02-01 08:32:33
10706 2020-02-01 08:28:03
10707 2020-02-01 08:19:03
10708 2019-12-31 02:41:01
Name: Start time, Length: 10709, dtype: datetime64[ns]
Now, since the dates are in reverse order, I tried to flip them by using:
Input:
paradas["Start time"]=paradas["Start time"].sort_values(by=['Date'], ascending=False)
print(paradas["Start time"])
However it doesn't recognise my code because of the 'by':
Output:
TypeError Traceback (most recent call last)
<ipython-input-35-d4f349ab2092> in <module>()
126 #print(paradas["Start time"])
127
--> 128 paradas["Start time"]=paradas["Start time"].sort_values(by=['Date'], ascending=False)
129 print(paradas["Start time"])
130
TypeError: sort_values() got an unexpected keyword argument 'by'
Also, I tried calling it without the arguments, but that doesn't change anything either.
So I don't know what I'm doing wrong, whether it's the type of the elements or something else.
I read in another post about doing it with str, but since I need the datetime format, and I've already seen other code for this project working with datetime64[ns], I am almost sure that it is possible...
Ok, I solved it.
To whom it may concern, the problem was that
paradas["Start time"]=paradas["Start time"].sort_values(by=['Date'], ascending=False)
wasn't right because sort_values() does not accept the by argument when it is called on just one column of the DataFrame. In particular, my data (written as paradas) is in pandas.DataFrame format, and paradas["Start time"] (which is just one column of paradas) is in pandas.Series format.
We need to use sort_values() with the DataFrame format, so I needed to apply this command to all my data, which means that we must use paradas:
Input:
Output:
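As a minimal sketch of the difference described above, using a small made-up frame rather than the real CSV (so not necessarily the exact call used here): by is accepted by DataFrame.sort_values() but not by Series.sort_values(), and assigning a sorted column back into the frame realigns it on the index, which may be why sorting without arguments appeared to change nothing.

```python
import pandas as pd

# Hypothetical three-row stand-in for the real `paradas` DataFrame.
paradas = pd.DataFrame(
    {"Start time": pd.to_datetime(["2020-12-31 00:13:30",
                                   "2020-12-30 19:30:00",
                                   "2019-12-31 02:41:01"])}
)

# A Series holds a single set of values, so its sort_values() has no `by`.
try:
    paradas["Start time"].sort_values(by=["Date"], ascending=False)
    series_accepts_by = True
except TypeError:
    series_accepts_by = False

# Sorting the column alone works, but assigning it back aligns on the
# index labels, so every value returns to its original row.
sorted_col = paradas["Start time"].sort_values()
original = paradas["Start time"].copy()
paradas["Start time"] = sorted_col
column_unchanged = paradas["Start time"].equals(original)

# Sorting the whole DataFrame reorders the rows themselves.
by_time = paradas.sort_values(by="Start time")
print(by_time)
```

Here by names a column that actually exists in the frame ("Start time"); the 'Date' label from the failing call is not a column of paradas as far as the post shows.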
Nevertheless, it didn't sort the way I wanted, and I realised that I needed to sort by the "Start time" column in reverse order, so I finally did:
Input:
Output:
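A hedged sketch of this last step, again on made-up rows: the ascending flag of DataFrame.sort_values() controls the direction, so either order is one argument away. The reset_index(drop=True) call is an optional extra that is not part of the steps above; it only renumbers the rows after sorting.

```python
import pandas as pd

# Hypothetical stand-in for `paradas`, already converted to datetime64[ns].
paradas = pd.DataFrame(
    {"Start time": pd.to_datetime(["2020-12-30 19:30:00",
                                   "2020-12-31 00:13:30",
                                   "2019-12-31 02:41:01"])}
)

# Oldest row first.
oldest_first = paradas.sort_values(by="Start time", ascending=True)

# Newest row first.
newest_first = paradas.sort_values(by="Start time", ascending=False)

# Optional: drop the old row labels so the index runs 0, 1, 2 again.
oldest_first = oldest_first.reset_index(drop=True)

print(oldest_first)
print(newest_first)
```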
(By this point I had already converted the column to datetime64[ns] format.) This is what worked for me. I checked the documentation several times:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sort_values.html
and compared it with:
https://pandas.pydata.org/pandas-docs/version/0.22/generated/pandas.Series.sort_values.html
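One detail that may be worth checking: in the output shown in the question, 02/01/2020 appears as 2020-02-01 after conversion, which suggests that ambiguous day/month values were read month-first. Assuming the CSV really stores dates as day/month/year (the sample strings below are copied from the question, the variable names are made up), a sketch of passing an explicit format to pd.to_datetime() looks like this:

```python
import pandas as pd

# Two raw strings in the same layout as the "Start time" column.
sample = pd.Series(["31/12/2020 00:13:30", "02/01/2020 08:37:33"],
                   name="Start time")

# An explicit format removes the day/month ambiguity.
parsed = pd.to_datetime(sample, format="%d/%m/%Y %H:%M:%S")
print(parsed)
```

With the format given, the second value is read as 2 January 2020 rather than 1 February 2020, so a later sort by "Start time" follows the real chronology.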