System Crash Using pandas for datetimeindex


I'm trying to work my way through this RealPython tutorial on the Python visualization package bokeh: https://realpython.com/lessons/using-columndatasource-object/#transcript . The video version is 2.5 years old, and the text version that preceded it is 3.5 years old: https://realpython.com/python-data-visualization-bokeh/#generating-your-first-figure . Needless to say, neither works as expected in places, because the libraries involved have changed since then.

At this particular point, I'm trying to put a DatetimeIndex on a DataFrame that's been created from some CSV data. The problem is that every time I've run this (once last night and twice this morning), the whole system freezes and I have to restart the computer. There is no error message or other output.

Python 3.9.12, bokeh 2.4.2, pandas 1.3.3, Ubuntu 20.04 LTS, PyCharm 2020.1

Here’s the script:

import pandas as pd
from pandas import Timestamp
from datetime import datetime, date
import time

start = time.time()

player_stats = pd.read_csv('data/2017-18_playerBoxScore.csv', index_col='gmDate')

team_stats = pd.read_csv('data/2017-18_teamBoxScore.csv', index_col='gmDate')

standings = pd.read_csv('data/2017-18_standings.csv', index_col='stDate')

dti_p = pd.DatetimeIndex(player_stats.index)
dti_t = pd.DatetimeIndex(team_stats.index)
dti_s = pd.DatetimeIndex(standings.index)

player_stats.replace(to_replace=player_stats.index, value=dti_p, inplace=True)
team_stats.replace(to_replace=team_stats.index, value=dti_t, inplace=True)
standings.replace(to_replace=standings.index, value=dti_s, inplace=True)

end = time.time()
prgrm = end - start  # elapsed seconds (start - end would be negative)

print(f"start : {start}")

print(f"total time = {prgrm}")

print(f"end : {end}")

This version uses DatetimeIndex() and replace(), but previously I tried .apply() and Timestamp.date() without success. In writing this post, I ran the script again just to see what coverage and the debugger would give me before posting and now, for reasons I can’t explain, I am getting an error:

/home/malikarumi/.virtualenvs/chronicle/lib/python3.9/site-packages/pandas/core/indexing.py
    def _validate_key_length(self, key: Sequence[Any]) -> None:
line 791 → if len(key) > self.ndim:
            raise IndexingError("Too many indexers")

(<class 'pandas.core.indexing.IndexingError'>, IndexingError('Too many indexers'), <traceback object at 0x7fe8c8d45b80>)

Your assistance is greatly appreciated.


There is 1 best solution below


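Does this give the desired result? It is shown for player_stats only; the same three steps should apply to team_stats and to standings (which uses stDate instead of gmDate).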

import pandas as pd

player_stats = pd.read_csv('data/2017-18_playerBoxScore.csv')
# Parse the gmDate strings into datetimes, then use that column as the index
player_stats['gmDate'] = pd.to_datetime(player_stats['gmDate'])
player_stats.set_index('gmDate', inplace=True)
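
For context, DataFrame.replace() substitutes values inside the frame's cells; it does not touch the index, so it is probably not the right tool for this job. Below is a minimal sketch of two other ways to end up with a DatetimeIndex. It assumes the date column holds ISO-style strings; the inline CSV text and the teamAbbr and teamPTS columns are made-up stand-ins for the real files, used only so the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical stand-in for one of the CSV files; the real files have many more columns.
csv_text = """gmDate,teamAbbr,teamPTS
2017-10-17,BOS,99
2017-10-17,CLE,102
2017-10-18,DET,102
"""

# Option 1: let read_csv parse the index column while loading.
# With index_col set, parse_dates=True asks pandas to try parsing the index as dates.
parsed_on_load = pd.read_csv(
    io.StringIO(csv_text), index_col="gmDate", parse_dates=True
)

# Option 2: keep the original read_csv call and convert the existing index afterwards.
converted_later = pd.read_csv(io.StringIO(csv_text), index_col="gmDate")
converted_later.index = pd.to_datetime(converted_later.index)

# Both frames should now report a DatetimeIndex.
print(type(parsed_on_load.index).__name__)
print(type(converted_later.index).__name__)
```

Option 2 stays closest to the script in the question, since only the three replace() lines would need to change. Whether either variant also avoids the freeze on the full data set is something to confirm by running it.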