Hello – I am trying to port a few Excel worksheets into a python script I can run on the fly. I am querying Bloomberg data, and essentially what I am looking to do is grab historical bid and ask prices for specific ISINs on specific dates. I am using the xbbg library to query the API and working within a Jupyter Notebook. The data is being stored in a Pandas dataframe.
This is the very simple Excel functionality I am trying to model:
Excel Functionality Screenshot
I have no problem pulling this data for individual line items. However, I am trying to run this for 25,000 index constituents/dates (in Excel you can just drag and drop down the formula). I have code that allows and returns the correct data, however it takes hours to run for this many lines. I believe it is because the code is waiting for each API call to resolve before going on to the next identifier.
I have been looking for solutions to this problem, namely using a library called asyncio. I do have code snippets that will work with the syntax, but it is still taking entirely too much time. Here is my code for the asyncio version to pull this data:
async def bbg_data(isin, start_date, end_date):
isin = '/isin/' + isin + '@BVAL'
previous_close_date = pd.to_datetime(end_date).strftime('%Y%m%d')
px_bid = blp.bdh(tickers=isin, flds=['PX_BID'], start_date=previous_close_date, end_date=previous_close_date)
if not px_bid.empty:
px_bid = px_bid.iloc[0][0]
else:
px_bid = "N/A"
px_ask = blp.bdh(tickers=isin, flds=['PX_ASK'], start_date=previous_close_date, end_date=previous_close_date)
if not px_ask.empty:
px_ask = px_ask.iloc[0][0]
else:
px_ask = "N/A"
return px_bid, px_ask
async def main():
tasks = [asyncio.create_task(bbg_data(isin, start_date, end_date)) for isin, start_date, end_date in ttd_sample[['ISIN', 'PreviousClose', 'PreviousClose']].values]
results = await asyncio.gather(*tasks)
ttd_sample['previous_bid_price'] = [result[0] for result in results]
ttd_sample['previous_ask_price'] = [result[1] for result in results]
await main()
A few questions that I have on this issue, hopefully with someone with BLPAPI/asyncio experience:
Is my reliance on using the xbbg library causing me too many headaches? Would I be able to solve this issue in one shot just using blpapi? If so, are there any good tutorials for a beginner programmer?
Is there an issue with how I wrote my asyncio code? I think I am close, but I am not sure if it is truly asynchronous.
Any advice on how to tackle this issue would be appreciated.
The setup costs of each
bdh()request are high. It could be quicker to get a range of dates, and then select the one you want. As an example, here there is only one call tobdh()rather than 10:With the output:
Notes: tickers can be repeated with different dates, by using
setto only pass distinct tickers to thebdh()call. Depending on the difference betweenmin_dateandmax_dateit may be more efficient to divide up a large number of tickers into smaller chunks with a tighter date range. 25,000 is probably too many to send in one call (you run the risk of the connection timing out), so perhaps start with blocks of 100 isin-date pairs.