Pandas DataFrame yield error


I am trying to yield a pandas DataFrame one row at a time, but I get an error. The DataFrame holds stock price data: daily open, close, high and low prices, plus volume.

The following is my code. This class gets the data from a MySQL database:

import datetime

import MySQLdb as mdb  # assumed: the question does not show how mdb is imported
import pandas as pd


class HistoricMySQLDataHandler(DataHandler):

    def __init__(self, events, symbol_list):
        """
        Initialises the historic data handler by requesting
        a list of symbols.

        Parameters:
        events - The Event Queue.
        symbol_list - A list of symbol strings.
        """
        self.events = events
        self.symbol_list = symbol_list

        self.symbol_data = {}
        self.latest_symbol_data = {}
        self.continue_backtest = True

        self._connect_MySQL()

    def _connect_MySQL(self):
        # Load the price history of every symbol s into a DataFrame
        db_host = 'localhost'
        db_user = 'sec_user'
        db_pass = 'XXX'
        db_name = 'securities_master'
        con = mdb.connect(db_host, db_user, db_pass, db_name)
        for s in self.symbol_list:
            # Bind the symbol as a query parameter instead of writing s into the string
            sql = "SELECT * FROM daily_price WHERE symbol = %s"
            self.symbol_data[s] = pd.read_sql(sql, con=con, index_col='price_date', params=(s,))

    def _get_new_bar(self, symbol):
        """
        Returns the latest bar from the data feed as a tuple of
        (symbol, datetime, open, low, high, close, volume).
        """
        for row in self.symbol_data[symbol].itertuples():
            # A tuple literal is needed here: tuple() accepts at most one argument
            yield (symbol, datetime.datetime.strptime(row[0], '%Y-%m-%d %H:%M:%S'),
                   row[15], row[17], row[16], row[18], row[20])

    def update_bars(self):
        """
        Pushes the latest bar to the latest_symbol_data structure
        for all symbols in the symbol list.
        """
        for s in self.symbol_list:
            try:
                bar = next(self._get_new_bar(s))
            except StopIteration:
                self.continue_backtest = False

In the main function:

import queue
import time

# Declare the components with respective parameters
symbol_list = ["GOOG"]
events = queue.Queue()
bars = HistoricMySQLDataHandler(events, symbol_list)

while True:
    # Update the bars (specific backtest code, as opposed to live trading)
    if bars.continue_backtest:
        bars.update_bars()
    else:
        break
    time.sleep(1)

Data example:

symbol_data["GOOG"] =
price_date  id  exchange_id  ticker  instrument  name                  ...  high_price  low_price  close_price  adj_close_price  volume
2014-03-27  29  None         GOOG    stock       Alphabet Inc Class C  ...  568.0000    552.9200   558.46       558.46           13100

The update_bars function calls _get_new_bar to move to the next row (the next day's price).
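One detail of this pattern is worth keeping in mind: calling a generator function creates a new generator every time, so taking the next item from a freshly created generator always returns the first row unless the underlying data is itself a shared iterator. A minimal sketch, with a plain list standing in for the DataFrame rows (the names `bars`, `rows` and `feed` are illustrative, not from the code above):

```python
def bars(rows):
    """Yield one row at a time from an iterable of rows."""
    for row in rows:
        yield row


# Hypothetical stand-in for the rows of a price DataFrame.
rows = ["2014-03-27", "2014-03-28", "2014-03-31"]

# A fresh generator on every call restarts at the first row each time.
restarted = [next(bars(rows)) for _ in range(3)]

# Keeping one generator alive advances by one row per call.
feed = bars(rows)
advanced = [next(feed) for _ in range(3)]

print(restarted)  # ['2014-03-27', '2014-03-27', '2014-03-27']
print(advanced)   # ['2014-03-27', '2014-03-28', '2014-03-31']
```

With a persistent generator, StopIteration is raised once the rows run out, which is the signal update_bars relies on to stop the backtest.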

My objective is to get the stock price day by day (iterating over the rows of the DataFrame). However, self.symbol_data[s] is a DataFrame in _connect_MySQL but a generator in _get_new_bar, hence I get this error:

AttributeError: 'generator' object has no attribute 'itertuples'

Does anyone have any ideas?

I am using Python 3.6. Thanks.

self.symbol_data is a dict, and symbol is a string key used to look up the DataFrame. The data is stock price data. For example, self.symbol_data["GOOG"] returns a DataFrame with Google's daily stock price information indexed by date, each row including the open, low, high and close prices and the volume. My goal is to iterate over this price data day by day using yield.

_connect_MySQL gets the data from the database. In this example, s = "GOOG" in the function.
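One way to get s into the query is to pass it as a bound parameter through the `params` argument of `pd.read_sql`. The following is only a sketch under stated assumptions: an in-memory SQLite database stands in for the MySQL server, the `daily_price` table is trimmed to three columns, and the prices are made-up values.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database standing in for the MySQL server (assumption).
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE daily_price (symbol TEXT, price_date TEXT, close_price REAL)"
)
# Made-up sample rows, for illustration only.
con.executemany(
    "INSERT INTO daily_price VALUES (?, ?, ?)",
    [
        ("GOOG", "2014-03-27", 100.0),
        ("GOOG", "2014-03-28", 101.0),
        ("OTHER", "2014-03-27", 50.0),
    ],
)

symbol_data = {}
for s in ["GOOG"]:
    # sqlite3 uses "?" placeholders; MySQLdb uses "%s" instead.
    sql = "SELECT * FROM daily_price WHERE symbol = ? ORDER BY price_date"
    symbol_data[s] = pd.read_sql(sql, con=con, index_col="price_date", params=(s,))

print(symbol_data["GOOG"])
```

Binding the value this way lets the driver handle quoting, so the symbol never has to be spliced into the SQL string by hand.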


There is 1 best solution below

Allen Huang On

I found the bug. Code elsewhere in my program was turning the DataFrame into a generator. A stupid mistake, lol.

I didn't post this line in the question, but it is what changes the data type:

# Reindex the dataframes
for s in self.symbol_list:
    self.symbol_data[s] = self.symbol_data[s].reindex(index=comb_index, method='pad').iterrows()
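A minimal sketch of the mismatch and one possible way around it, assuming a small made-up DataFrame: `iterrows()` returns a generator, which has no `itertuples` method. Keeping the DataFrame in the dict and holding the row iterator in a separate dict avoids the error. The name `bar_iters` is illustrative and not part of the original code.

```python
import pandas as pd

# Made-up two-row frame standing in for the price data.
df = pd.DataFrame(
    {"close_price": [100.0, 101.0]},
    index=pd.to_datetime(["2014-03-27", "2014-03-28"]),
)

# What the reindex line effectively stored: a generator, not a DataFrame.
as_generator = df.iterrows()
try:
    as_generator.itertuples()
except AttributeError as exc:
    message = str(exc)
    print(message)

# One possible fix: keep the DataFrame and hold the row iterator separately.
symbol_data = {"GOOG": df}
bar_iters = {s: frame.itertuples() for s, frame in symbol_data.items()}

# Each next() call on the stored iterator moves forward by one day.
first = next(bar_iters["GOOG"])
second = next(bar_iters["GOOG"])
print(first.Index.date(), first.close_price)
print(second.Index.date(), second.close_price)
```

Because the iterator is created once and stored, repeated `next()` calls walk through the rows in order and raise StopIteration at the end of the data.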