I have an asset class return stream that is incomplete. In Frontline Solver, I generated a return distribution for it that matches its correlations with the other asset classes over the longest period for which data is available. My objective function minimizes the RMSE between that correlation matrix and the simulated correlation matrix over the entire time horizon. The constraints fix the return and risk (mean and sd) of the asset class being simulated and bound how many standard deviations each individual observation may lie from the mean.
I tried using mvrnorm, but it also re-sampled the data I used to establish the covariance matrix, which I do not want since I care about time dependency. I started to research different optimization/solver packages such as lpSolve and quadprog, but I am having difficulty interpreting them.
Below is a fixed (non-random) data frame to use as a reproducible example for this problem.
data<-structure(list(Class1 = c(8, 4, 5, -3, 1, 1, 5, 0, -3, 4, 3,
-1, 2, 7, -2, 2, 5, 4, -1, 9, 2, 0, -2, 2, 2, -7, 1, 3), Class2 = c(4,
6, 4, 0, 0, -1, 5, 2, 0, 0, 0, -1, 1, 1, -1, 1, 2, 2, 0, 3, 0,
0, -4, 0, 0, -4, -2, 0), Class3 = c(6, 7, 4, -2, 1, 1, 5, 0,
-2, 4, 2, -2, 1, 6, -2, 2, 4, 4, 0, 7, 2, 0, -2, 2, 2, -6, 2,
2), Class4 = c(9, 5, 7, 0, 1, 0, 7, -2, -2, 3, 0, -2, 3, 6, 0,
2, 5, 5, 0, 7, 3, -1, -5, 1, 2, -8, 2, 2), Class5 = c(NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, -3, 4, 4, 1, 2, 4, 4, -2,
4, 2, 0, -6, 1, 0, -9, 3, 4)), .Names = c("Class1", "Class2",
"Class3", "Class4", "Class5"), class = "data.frame", row.names = c(NA,-28L))
#Produces the following table
   Class1 Class2 Class3 Class4 Class5
1       8      4      6      9     NA
2       4      6      7      5     NA
3       5      4      4      7     NA
4      -3      0     -2      0     NA
5       1      0      1      1     NA
6       1     -1      1      0     NA
7       5      5      5      7     NA
8       0      2      0     -2     NA
9      -3      0     -2     -2     NA
10      4      0      4      3     NA
11      3      0      2      0     NA
12     -1     -1     -2     -2     -3
13      2      1      1      3      4
14      7      1      6      6      4
15     -2     -1     -2      0      1
16      2      1      2      2      2
17      5      2      4      5      4
18      4      2      4      5      4
19     -1      0      0      0     -2
20      9      3      7      7      4
21      2      0      2      3      2
22      0      0      0     -1      0
23     -2     -4     -2     -5     -6
24      2      0      2      1      1
25      2      0      2      2      0
26     -7     -4     -6     -8     -9
27      1     -2      2      2      3
28      3      0      2      2      4
My goal is to get returns for Class5 for all 28 observations such that the resulting correlation matrix matches that of observations data[12:28,]. I also want to specify a custom mean and standard deviation for the simulated series. For example, if you calculate the mean and sd for Class5 as it is now, you get about 0.76 and 3.8, respectively. However, I want the new data to have, let's say, a mean of 1 and an sd of 5.
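As a quick check of those figures, here is a minimal sketch in R. The names class5_obs, target_mean, target_sd and rescaled are illustrative and not part of the original data frame. It also shows that an affine rescaling reaches a target mean and sd exactly without changing any correlations, although it alters every value, including the observed ones, so it may not be what is wanted here.

```r
# Observed Class5 returns (rows 12 to 28 of the data frame above)
class5_obs <- c(-3, 4, 4, 1, 2, 4, 4, -2, 4, 2, 0, -6, 1, 0, -9, 3, 4)

obs_mean <- mean(class5_obs)  # about 0.76
obs_sd   <- sd(class5_obs)    # about 3.8

# Illustrative targets taken from the example in the text
target_mean <- 1
target_sd   <- 5

# Affine rescaling: standardise, then stretch and shift.
# Correlations with any other series are unchanged by this transform,
# but the observed values themselves are modified.
rescaled <- (class5_obs - obs_mean) / obs_sd * target_sd + target_mean
```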
Again, if you use mvrnorm with a semi-custom mu and sigma, it will also re-simulate Class1-Class4.
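One possible way to set this up in base R, leaving Class1-Class4 and the observed Class5 values untouched, is to treat only the 11 missing Class5 values as free parameters and pass them to optim. This is a rough sketch under several assumptions that are not in the original setup: the mean and sd targets apply to the full 28-row Class5 series and are enforced through a quadratic penalty with an arbitrary weight of 10; each missing value is bounded within k = 3 target standard deviations of the target mean; only the Class5 correlations are scored, since the other entries cannot change; and starting values come from a linear regression on the overlapping rows.

```r
# Data from the question (Class5 is missing for rows 1 to 11)
data <- data.frame(
  Class1 = c(8, 4, 5, -3, 1, 1, 5, 0, -3, 4, 3, -1, 2, 7, -2, 2, 5, 4,
             -1, 9, 2, 0, -2, 2, 2, -7, 1, 3),
  Class2 = c(4, 6, 4, 0, 0, -1, 5, 2, 0, 0, 0, -1, 1, 1, -1, 1, 2, 2,
             0, 3, 0, 0, -4, 0, 0, -4, -2, 0),
  Class3 = c(6, 7, 4, -2, 1, 1, 5, 0, -2, 4, 2, -2, 1, 6, -2, 2, 4, 4,
             0, 7, 2, 0, -2, 2, 2, -6, 2, 2),
  Class4 = c(9, 5, 7, 0, 1, 0, 7, -2, -2, 3, 0, -2, 3, 6, 0, 2, 5, 5,
             0, 7, 3, -1, -5, 1, 2, -8, 2, 2),
  Class5 = c(rep(NA, 11), -3, 4, 4, 1, 2, 4, 4, -2, 4, 2, 0, -6, 1, 0,
             -9, 3, 4)
)

miss    <- is.na(data$Class5)
others  <- c("Class1", "Class2", "Class3", "Class4")

# Target: correlations of Class5 with the other classes over rows 12:28
target_cor <- cor(data[!miss, ])["Class5", others]

# Assumed targets and bounds (illustrative values)
target_mean <- 1
target_sd   <- 5
k           <- 3
lower <- target_mean - k * target_sd
upper <- target_mean + k * target_sd

# Insert candidate values into the missing Class5 slots only
fill <- function(x) {
  d <- data
  d$Class5[miss] <- x
  d
}

# RMSE on the Class5 correlations plus a penalty on mean and sd deviations
objective <- function(x) {
  d <- fill(x)
  sim_cor <- cor(d)["Class5", others]
  rmse <- sqrt(mean((target_cor - sim_cor)^2))
  penalty <- (mean(d$Class5) - target_mean)^2 + (sd(d$Class5) - target_sd)^2
  rmse + 10 * penalty  # penalty weight is an arbitrary assumption
}

# Starting values: regression of Class5 on the other classes, clamped to bounds
fit   <- lm(Class5 ~ Class1 + Class2 + Class3 + Class4, data = data[!miss, ])
start <- predict(fit, newdata = data[miss, others])
start <- pmin(pmax(start, lower), upper)

res <- optim(start, objective, method = "L-BFGS-B",
             lower = rep(lower, sum(miss)), upper = rep(upper, sum(miss)))

completed <- fill(res$par)
```

After running this, mean(completed$Class5), sd(completed$Class5) and cor(completed)["Class5", others] can be compared against the targets. Because the penalty is soft, the mean and sd will generally be close to the targets rather than exact, and the solution is not unique, so different starting values may give different series.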