Description: trying to retrieve historical data from Investing.com using the httr library
Original page: https://www.investing.com/rates-bonds/austria-1-year-bond-yield-historical-data
Expected output: HTML table with historical data (sample table output)
Script logic:
- Send a POST query with httr
- Prettify output of read_html method with html_table method
Issue:
- Script retrieves tables from the main page instead of the actual history table
Code:
library(httr)
library(rvest)  # provides read_html() and html_table()
url <- 'https://www.investing.com/instruments/HistoricalDataAjax'
# mimic XHR POST request implemented in the investing.com website
http_resp <- POST(url = url,
                  body = list(
                    curr_id = "23859",
                    smlID = "202274",
                    header = "Austria+1-Year+Bond+Yield+Historical+Data",
                    st_date = "08/01/2021", # MM/DD/YYYY format
                    end_date = "08/20/2021",
                    interval_sec = "Daily",
                    sort_col = "date",
                    sort_ord = "DESC",
                    action = "historical_data"
                  )
)
# parse the returned HTML
html_doc <- read_html(http_resp)
print(html_table(html_doc)[[1]])
You might notice that the R script uses a different URL, https://www.investing.com/instruments/HistoricalDataAjax, than the original web page, https://www.investing.com/rates-bonds/austria-1-year-bond-yield-historical-data. This appears to be the endpoint the page itself sends a POST request to when the start and end dates are set. You can see this in the screenshot below:
XHR request header when setting the start and end dates
From what I see, when a user specifies a date range for a particular security, the website sends a query to HistoricalDataAjax, with the parameters and the security/asset identifiers specified in the body of the request:
Example of the request's body after selecting dates
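For illustration, here is a minimal sketch of a request shaped more like the one the browser sends: form-encoded, with an XHR marker header and a browser-like user agent. The helper names build_history_body and fetch_history are made up for this example. Whether the endpoint actually checks X-Requested-With or User-Agent is an assumption, not something confirmed here. The header value is written with plain spaces because, as far as I know, form encoding would escape a literal + instead of treating it as a space.

```r
# Build the form fields seen in the request body (values copied from the script above).
build_history_body <- function(start, end) {
  list(
    curr_id      = "23859",
    smlID        = "202274",
    header       = "Austria 1-Year Bond Yield Historical Data",  # plain spaces, escaped on send
    st_date      = format(start, "%m/%d/%Y"),                     # MM/DD/YYYY format
    end_date     = format(end, "%m/%d/%Y"),
    interval_sec = "Daily",
    sort_col     = "date",
    sort_ord     = "DESC",
    action       = "historical_data"
  )
}

# Send the fields as a form-encoded POST that looks like an XHR call.
# Assumption: the server distinguishes XHR calls by these headers (unverified).
fetch_history <- function(start, end) {
  httr::POST(
    url    = "https://www.investing.com/instruments/HistoricalDataAjax",
    body   = build_history_body(start, end),
    encode = "form",
    httr::add_headers(`X-Requested-With` = "XMLHttpRequest"),
    httr::user_agent("Mozilla/5.0")
  )
}

# Example call (needs network access):
# resp <- fetch_history(as.Date("2021-08-01"), as.Date("2021-08-20"))
# httr::status_code(resp)
```

Keeping the body construction separate from the network call makes it easy to compare the generated fields with what the browser's developer tools show.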
You can get the table directly from https://www.investing.com/rates-bonds/austria-1-year-bond-yield-historical-data using rvest.
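A minimal sketch of that approach might look like the following. It reads every table on the page and keeps the first one that has a "Date" column. Both helper names (pick_history_table, read_history) and the "Date" column check are assumptions made for this example, so adjust them to whatever the page really returns.

```r
# Pick the first table with a "Date" column (assumed to mark the history table).
pick_history_table <- function(tables) {
  has_date <- vapply(tables, function(t) "Date" %in% names(t), logical(1))
  if (!any(has_date)) stop("no table with a 'Date' column found")
  tables[[which(has_date)[1]]]
}

# Download the page, parse all <table> elements, and return the history table.
read_history <- function(url) {
  page <- rvest::read_html(url)
  pick_history_table(rvest::html_table(page))
}

# Example call (needs network access):
# hist <- read_history("https://www.investing.com/rates-bonds/austria-1-year-bond-yield-historical-data")
# head(hist)
```

Selecting the table by its columns instead of by position (such as [[1]]) avoids silently picking up an unrelated table if the page layout changes.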