I want to import daily stock market price data into R for any ticker and examine one historical time segment of it. From this segment, I want to convert the prices into daily rate-of-change (ROC) percentage changes. Next, I want to take this ROC series and create a cumulative probability density function (CPDF) that lets me set any custom number of sorting bins and any size limit for each bin, for example 22 bins with a 0.3% limit. Then I want to plot this CPDF as either a histogram or a scatterplot. The final step would be to do this for 2 different sections of the same stock and plot them next to each other for visual inspection. I have started some code for the ticker SPY, but I cannot get it to work.
library(quantmod)
library(tidyquant)
library(tidyverse)
# using tidyquant to import a ticker
spy <- tq_get("spy")
spy010422 <- tq_get("spy", get ="stock.prices", from ='2022-01-04', to = '2022-01-24')
str(spy010422)
# getting ROC between prices in the series
spy010422.rtn = ROC(spy010422$close, n = 1, type = c("discrete"), na.pad = TRUE)
str(spy010422.rtn)
# trying to use ggplot and tibble to create an ECDF function
spy010422.rtn %>%
tibble() %>%
ggplot() +
stat_ecdf(aes(.))
# another attempt at running ECDF on the ROC series
spy010422.rtn %>%
ggplot(spy010422.rtn) +
stat_ecdf(aes(close))
# trying to set the number of bins and bin size for the ECDF
spy010422.rtn %>%
mutate(rounded = round(close/.3, 0) *.3,
bin = min_rank(rounded)) %>%
ggplot(aes(close, bin)) +
geom_line()
# next time segment of the ticker spy to compare this to
spy020222 <- tq_get("spy", get ="stock.prices", from ='2022-02-02', to = '2022-02-24')
I couldn't understand exactly what you wanted to plot. Normally a CPDF is just a continuous line and doesn't have bins to customise. Also, "plot this CPDF as either a histogram or a scatterplot" is a strange phrase to me, as one normally plots the histogram or scatterplot of the variable itself, not of the CPDF of the variable. Given that, I made a function that plots the histogram of the ROC of the ticker, and you can comment on whether that is what you wanted.
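For comparison, here is a minimal sketch of the two cumulative readings of the question: an unbinned empirical CDF drawn with stat_ecdf, and a binned cumulative share. It assumes that "22 bins with .3% limit" means 22 bins that are each 0.3 percentage points wide and centred on zero, which is only one possible interpretation. The closing prices are made-up numbers rather than real SPY data, and the object names (roc_df, binned, p_ecdf, p_binned) are illustrative.

```r
library(ggplot2)
library(TTR)   # provides ROC(); quantmod attaches it as well

# Made-up closing prices for illustration only (not real SPY data).
close <- c(100, 101, 99.5, 100.2, 98.7, 99.9, 101.3, 100.8)

# Discrete daily rate of change in percent; na.pad = FALSE drops the leading NA.
roc_df <- data.frame(
  roc = 100 * ROC(close, n = 1, type = "discrete", na.pad = FALSE)
)

# Unbinned empirical CDF: stat_ecdf needs a data frame with a named column.
p_ecdf <- ggplot(roc_df, aes(x = roc)) +
  stat_ecdf(geom = "step") +
  labs(x = "Daily ROC (%)", y = "Cumulative probability")

# Binned variant (assumed reading of "22 bins with .3% limit").
n_bins <- 22
width  <- 0.3
breaks <- seq(-n_bins / 2, n_bins / 2) * width   # 23 edges, so 22 bins

# Clamp values beyond the outer edges so they land in the end bins.
clamped <- pmin(pmax(roc_df$roc, min(breaks)), max(breaks))
counts  <- table(cut(clamped, breaks = breaks, include.lowest = TRUE))

binned <- data.frame(
  upper    = breaks[-1],                              # upper edge of each bin
  cum_prob = cumsum(as.vector(counts)) / length(clamped)
)

# Scatterplot of the cumulative share at each bin's upper edge.
p_binned <- ggplot(binned, aes(x = upper, y = cum_prob)) +
  geom_step() +
  geom_point() +
  labs(x = "Daily ROC (%), upper bin edge", y = "Cumulative share of days")

p_ecdf
p_binned
```

To use downloaded data instead, close can be replaced by the close column of the tibble returned by tq_get, as in the question.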
The function takes a list of dates in the format
list(c(from1, to1), c(from2, to2), ...)
(you can add as many intervals as you want) and loops over each interval in this list with the purrr::map function. In each iteration, it creates the histogram, customizing the bins argument. After the loop, the graphs are combined into one figure using the ggpubr::ggarrange function (you must run install.packages("ggpubr") if you don't have the package installed). Running:
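A minimal sketch of a function along the lines described could look like the following. The function name plot_roc_hists, its arguments, and the choice of zero-centred breaks are assumptions made for illustration, not the original code. It also takes an already-downloaded price table rather than a ticker, and simulated prices stand in for real data so the example runs offline.

```r
library(dplyr)
library(purrr)
library(ggplot2)
library(ggpubr)
library(TTR)   # provides ROC()

# Hypothetical helper (illustrative names throughout).
# prices:    data frame with `date` and `close` columns, the shape tq_get() returns
# intervals: list(c(from1, to1), c(from2, to2), ...)
# bins, width: number of bins and width of each bin in percentage points
plot_roc_hists <- function(prices, intervals, bins = 22, width = 0.3) {
  breaks <- seq(-bins / 2, bins / 2) * width   # bins + 1 edges centred on zero

  plots <- map(intervals, function(iv) {
    seg <- prices %>%
      filter(date >= as.Date(iv[1]), date <= as.Date(iv[2])) %>%
      mutate(roc = 100 * ROC(close, n = 1, type = "discrete")) %>%
      filter(!is.na(roc)) %>%                   # drop the padded first value
      mutate(roc = pmin(pmax(roc, min(breaks)), max(breaks)))  # clamp outliers

    ggplot(seg, aes(x = roc)) +
      geom_histogram(breaks = breaks) +
      labs(title = paste(iv[1], "to", iv[2]), x = "Daily ROC (%)", y = "Days")
  })

  # One panel per interval, side by side.
  ggarrange(plotlist = plots, ncol = length(plots), nrow = 1)
}

# Simulated prices for illustration only (not real SPY data, weekends included).
set.seed(1)
dates  <- seq(as.Date("2022-01-03"), as.Date("2022-02-28"), by = "day")
prices <- data.frame(
  date  = dates,
  close = 450 * cumprod(1 + rnorm(length(dates), mean = 0, sd = 0.01))
)

fig <- plot_roc_hists(
  prices,
  list(c("2022-01-04", "2022-01-24"), c("2022-02-02", "2022-02-24")),
  bins = 22, width = 0.3
)
fig
```

With real data, prices would instead come from a single tq_get call covering both intervals, which avoids downloading the series once per interval.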
Yields this graph: