Please bear with me as this is my first question here. I'm still trying to figure out how to post the data and the code that I already have, so for now I will just try to explain. If this is not the acceptable way of asking a question, please ignore the question and next time I will try to do it right.
I have a data frame that I want to do daily calculations on. For a specific day I already have OpenUnits, BuyUnits, SellUnits, CloseUnits and Interest.These values were calculate by another system. I need to proportion the daily interest base on the number of units sold. I can do the calculations, but I cannot figure out how to get the OpenInterest (previous day's close), without using a for loop on the data frame. The ClosingInterest should be OpenInterest + Interest - SellUnits/OpenUnits * OpenInterest
I tried using mutate(OpenInterest = lag(ClosingInterest), ClosingInterest = OpenInterest + Interest - SellUnits/OpenUnits * OpenInterest), but that dosn't seem to work.
I have the code working with a for loop, but I was hoping that there might be a better, and faster way of doing it.
Regards
library(tidyverse)
library(tibbletime)
library(lubridate)
sample <- list(OpenUnits = c(7500000, 7500000, 7500000, 7500000, 7500000,
3300000, 3300000, 3300000, 3300000, 3300000), ClosingUnits = c(7500000,
7500000, 7500000, 7500000, 3300000, 3300000, 3300000, 3300000,
3300000, 3300000), AccrualDate = 16892:16901, AiaAdjustAmt = c(1844.70359677349,
1845.18465061665, 1845.66582990696, 1846.14713467713, 812.516568582349,
812.728453146696, 812.940392965385, 813.152388052826, 813.364438423431,
813.576544091616), SellUnits = c(NA, NA, NA, NA, 4200000, NA,
NA, NA, NA, NA))
sample <- sample %>%
as_tibble() %>%
mutate(
AccrualDate = lubridate::as_date(AccrualDate),
SellUnits = if_else(is.na(SellUnits), 0, SellUnits)
) %>%
as_tbl_time(index = AccrualDate)
sample <- sample %>%
mutate(
RealInterest = 0,
OpenInterest = cumsum(AiaAdjustAmt) - cumsum(RealInterest) - AiaAdjustAmt - RealInterest,
RealInterest = OpenInterest*SellUnits/OpenUnits
)
This does not produce the correct answer.
# A time tibble: 10 x 7
# Index: AccrualDate
OpenUnits ClosingUnits AccrualDate AiaAdjustAmt SellUnits s24j_real s24j_open
<dbl> <dbl> <date> <dbl> <dbl> <dbl> <dbl>
1 7500000. 7500000. 2016-04-01 1845. 0. 0. 0.
2 7500000. 7500000. 2016-04-02 1845. 0. 0. 1845.
3 7500000. 7500000. 2016-04-03 1846. 0. 0. 3690.
4 7500000. 7500000. 2016-04-04 1846. 0. 0. 5536.
5 7500000. 3300000. 2016-04-05 813. 4200000. 4134. 7382.
6 3300000. 3300000. 2016-04-06 813. 0. 0. 8194.
7 3300000. 3300000. 2016-04-07 813. 0. 0. 9007.
8 3300000. 3300000. 2016-04-08 813. 0. 0. 9820.
9 3300000. 3300000. 2016-04-09 813. 0. 0. 10633.
10 3300000. 3300000. 2016-04-10 814. 0. 0. 11446.
The correct answer should look like this. This I achieved with a for loop, which I'm trying to avoid because it feels slow on the bigger data set that's also nested.
# A time tibble: 10 x 7
# Index: AccrualDate
OpenUnits ClosingUnits AccrualDate AiaAdjustAmt SellUnits s24j_real s24j_open
<dbl> <dbl> <date> <dbl> <dbl> <dbl> <dbl>
1 7500000. 7500000. 2016-04-01 1845. 0. 0. 0.
2 7500000. 7500000. 2016-04-02 1845. 0. 0. 1845.
3 7500000. 7500000. 2016-04-03 1846. 0. 0. 3690.
4 7500000. 7500000. 2016-04-04 1846. 0. 0. 5536.
5 7500000. 3300000. 2016-04-05 813. 4200000. 4134. 7382.
6 3300000. 3300000. 2016-04-06 813. 0. 0. 4060.
7 3300000. 3300000. 2016-04-07 813. 0. 0. 4873.
8 3300000. 3300000. 2016-04-08 813. 0. 0. 5686.
9 3300000. 3300000. 2016-04-09 813. 0. 0. 6499.
10 3300000. 3300000. 2016-04-10 814. 0. 0. 7313.
Code to produce the correct answer.
sample2 <- sample %>%
mutate(
sell_ratio = if_else(!is.na(SellUnits), SellUnits/OpenUnits, 0),
s24j_open = 0,
s24j_close = 0,
s24j_real = 0
)
open <- 0
close <- 0
for (i in seq_along(sample2$AccrualDate)) {
open <- close
sellratio <- sample2[i, ]$sell_ratio
int <- sample2[i, ]$AiaAdjustAmt
real <- sellratio*open
close <- open - real + int
sample2[i, ]$s24j_open <- open
sample2[i, ]$s24j_real <- real
sample2[i, ]$s24j_close <- close
}
sample2 %>%
select(
OpenUnits, ClosingUnits, AccrualDate, AiaAdjustAmt, SellUnits, s24j_real, s24j_open
)
Better late than never:
Here is the result:
However, recursive functions do not improve performance, quite the opposite (see this post). Although in the above code
myfct
is only applied to calculateclose
(real
andopen
are derived from it).At any rate, I believe the code could be modified in order to lose the recursiveness and/orsapply
- I will give that a try and update the code then.Edit
The problem with the first version of my code was that using both a recursive function and sapply makes the process very lengthy. Indeed,
myfct(n)
already calulatesmyfct(k)
for allk<n
, thus usingsapply
to calculate these values was redundant and inefficient.Here I left the old
function
andsapply
for completeness (newfunction
already edited in the above code):Finally, here is a performance comparison of the different methods: