Applying a Box-Cox Transform to a Data Frame with a Different Lambda for Each Column


I have a one-column data frame holding the optimal lambda for the Box-Cox transformation of each column of my data. This is my lambda vector:

lambda <- data.frame(lambda=c(0.01,0.01,0.01))
 str(lambda)
 #'data.frame':   3 obs. of  1 variable:
 #$ lambda: num  0.01 0.01 0.01

I also have data:

df:
obs1   obs2    obs3
34.3   232.2   2.0
2.1    56.3    90.0

etc...

Then I have a function that calculates the box-cox transformation for a column:

bc <- function(obs, lambda) {
  (obs^lambda - 1) / lambda
}

I am trying to apply the function to my entire dataset, with each column using its own lambda, like so:

 result <- as.data.frame(lapply(df, function(u) { lapply(lambda, function(v) { bc(u,v) } ) } ))

This doesn't seem to work. To check it, I ran the function separately on one column like this:

d <- bc(df[,1],0.01)

Then I compared it with the first column of the result:

z <-  result[,1] == d
table(z)

This resulted in:

FALSE    TRUE
3051     2

So basically, the two columns were not equal.

I am not sure why this is happening.

There are 1 best solutions below

Consider using mapply here, which lets you pass several arguments to your function in parallel. A simple example: a function that takes two values, applied to each pair of elements from two different vectors:

twoInputFun <- function(a, b) { return(a^b) }
mapply(twoInputFun, 1:3, 5:7)
[1]    1   64 2187

So for your example,

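# Use = (not <-) inside data.frame() so the columns are named x and y
df <- data.frame(x = rnorm(10), y = rnorm(10))
bc <- function(obs, lambda) { (obs^lambda - 1) / lambda }
# Column x is transformed with lambda = 5, column y with lambda = 10
mapply(bc, df, c(5, 10))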

 [1,]     -1.1110238    -0.09995922
 [2,]     -0.1994398    -0.01196807
 [3,]     14.2708856    -0.09999996
 [4,]     -0.2195956    -0.09852122
 [5,]     -0.1996738     1.01595854
 [6,]     -0.2179283    -0.09999210
 [7,]     -0.6094362    -0.09999997
 [8,]     -0.1972702    -0.09999191
 [9,]      0.3886741    -0.10000000
[10,]     -0.2000037    25.83026050
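
Applied to the layout in the question, a minimal sketch might look like the following. The numbers here are made-up stand-ins (the full df is not shown in the question), and the three lambdas are deliberately different so the effect is visible. One plausible reason the nested lapply misbehaves is that lapply over a data frame walks its columns, so the inner function receives the entire lambda vector at once and R recycles it down the rows instead of pairing one lambda with each column.

```r
# Hypothetical stand-in data; the real df and lambda values are assumptions.
df <- data.frame(obs1 = c(34.3, 2.1, 5.0),
                 obs2 = c(232.2, 56.3, 10.0),
                 obs3 = c(2.0, 90.0, 7.5))
lambda <- data.frame(lambda = c(0.01, 0.5, 2))

bc <- function(obs, lambda) (obs^lambda - 1) / lambda

# Nested lapply: v is the whole lambda column, so it is recycled
# element by element down each column of df (row i gets lambda i).
nested <- as.data.frame(lapply(df, function(u) {
  lapply(lambda, function(v) bc(u, v))
}))

# mapply pairs column i of df with lambda$lambda[i].
result <- as.data.frame(mapply(bc, df, lambda$lambda))

# Compare the first column with the direct calculation.
# all.equal() tolerates tiny floating-point differences, unlike ==.
d <- bc(df[, 1], lambda$lambda[1])
isTRUE(all.equal(result[, 1], d))
```

Note that lambda$lambda (the numeric vector) is passed rather than the lambda data frame itself. If a list is preferable to a matrix, for instance when df has only one row, as.data.frame(Map(bc, df, lambda$lambda)) should give the same columns.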