Specify class for new columns in melt/gather


I'd like to specify the class of the output columns in melt (or gather). I would like to do this for all columns at once, with a possibly different class for each column.

For example, I have some data:

example <- data.frame(day = c(1, 2), max = c(20, 21), min = c(1, 2))

> example
  day max min
1   1  20   1
2   2  21   2

I melt the data:

library(reshape2)  # assumed: melt() with variable.name/value.name is reshape2::melt
exmelt <- melt(example, id.vars = "day", variable.name = "minmax", value.name = "temp")

> exmelt
  day minmax temp
1   1    max   20
2   2    max   21
3   1    min    1
4   2    min    2

> str(exmelt)
'data.frame':   4 obs. of  3 variables:
 $ day   : num  1 2 1 2
 $ minmax: Factor w/ 2 levels "max","min": 1 1 2 2
 $ temp  : num  20 21 1 2

Say I would like day to be class factor and temp to be class integer.

I can do this after melting with as.factor() and as.integer():

exmelt$day <- as.factor(exmelt$day)
exmelt$temp <- as.integer(exmelt$temp)

> str(exmelt)
'data.frame':   4 obs. of  3 variables:
 $ day   : Factor w/ 2 levels "1","2": 1 2 1 2
 $ minmax: Factor w/ 2 levels "max","min": 1 1 2 2
 $ temp  : int  20 21 1 2

Doing this afterward for a complex data frame with many columns of different classes (some factors, some integers, etc.) is going to be tedious and messy.

Is there a way to include this in melt? For example:

 melt(example,
      id.vars = "day",
      variable.name = "minmax",
      value.name = "temp",
      colClasses = c("factor", "factor", "integer"))

Best answer

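We can use melt from data.table, which has the options variable.factor and value.factor to control whether the variable and value columns are returned as factors. Beyond those two options, melt has no colClasses argument.

library(data.table)
# setDT() converts 'example' to a data.table by reference
dM <- melt(setDT(example), id.vars = "day", variable.name = "minmax",
           value.name = "temp", variable.factor=FALSE)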
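To make the effect of variable.factor concrete, here is a small sketch comparing the two settings. It assumes the data.table package is installed; the object names example_dt, as_chr and as_fct are made up for illustration.

```r
library(data.table)

# Same data as in the question, built directly as a data.table
example_dt <- data.table(day = c(1, 2), max = c(20, 21), min = c(1, 2))

# variable.factor = FALSE keeps the new "minmax" column as character
as_chr <- melt(example_dt, id.vars = "day", variable.name = "minmax",
               value.name = "temp", variable.factor = FALSE)

# variable.factor = TRUE (the default) returns it as a factor
as_fct <- melt(example_dt, id.vars = "day", variable.name = "minmax",
               value.name = "temp", variable.factor = TRUE)

class(as_chr$minmax)
class(as_fct$minmax)
```

Neither option touches the id column (day), which is why a separate conversion step is still needed for it.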

If we need to convert all the columns in a single step, we can create a vector of function names and apply each one to its column using Map and get:

f1 <- c("as.factor", "as.factor", "as.integer")
dM[, names(dM) := Map(function(x,y) get(y)(x), .SD, f1)]
str(dM)
# Classes ‘data.table’ and 'data.frame':  4 obs. of  3 variables:
# $ day   : Factor w/ 2 levels "1","2": 1 2 1 2
# $ minmax: Factor w/ 2 levels "max","min": 1 1 2 2
# $ temp  : int  20 21 1 2
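The same idea can be wrapped in a small reusable function that matches converters to columns by name instead of by position. This is only a base R sketch, not part of data.table or reshape2: set_col_classes is a hypothetical helper, and the long-format data is typed in by hand so that the example runs without any packages.

```r
# Hypothetical helper: apply one conversion function per named column.
# 'converters' is a named character vector (or list) of function names,
# e.g. c(day = "as.factor"); columns not listed are left unchanged.
set_col_classes <- function(df, converters) {
  stopifnot(all(names(converters) %in% names(df)))
  for (col in names(converters)) {
    df[[col]] <- match.fun(converters[[col]])(df[[col]])
  }
  df
}

# Long-format data equivalent to the melted example, built by hand
exmelt <- data.frame(
  day    = c(1, 2, 1, 2),
  minmax = factor(c("max", "max", "min", "min")),
  temp   = c(20, 21, 1, 2)
)

converted <- set_col_classes(exmelt, c(day = "as.factor", temp = "as.integer"))
str(converted)
```

Matching by name means the converter vector does not have to be kept in the same order as the columns, which is easier to maintain for wide data with many classes.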