After a long struggle with my code, I think I have found some strange behavior in the dcast() function from the data.table package. Can anyone confirm it, or am I doing something wrong?
For the sake of example:
library(data.table)
tt <- data.table(a = runif(n = 300, min = 0, max = 1000000),
                 b = rep(paste0("d", 1:3), each = 100),
                 c = rep(LETTERS[1:3], each = 100))
t2 <- dcast(tt, c~b, fun.aggregate=sum, value.var = "a")
t2
#    c         d1         d2         d3
# 1: A 2531364379          0          0
# 2: B          0 2527589493          0
# 3: C          0          0 2532147262
Now, I would assume that the numbers in t2 are exactly the same as in tt. But they are not, since some garbage appears after the decimal point. For example, in the third column:
t2$d3[3]-round(t2$d3[3],0)
# [1] 0.3269196
Use options(digits=22) (or some other suitably high value; 22 is the largest that R accepts). This has nothing to do with how the number is stored, only with how it is displayed on the console. A reproducible example:
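A minimal sketch of such an example follows. It makes a few assumptions that are not in the question: a fixed seed, a plain data.frame, and base R's tapply() as a stand-in for dcast(..., fun.aggregate = sum) so that it runs without extra packages. The point is only that the sums carry fractional parts even when the default print does not show them.

```r
# Assumption: the seed and the use of tapply() are illustrative choices,
# not part of the original question.
set.seed(1)
tt <- data.frame(a = runif(300, min = 0, max = 1e6),
                 b = rep(paste0("d", 1:3), each = 100))

# Sum column a within each level of b (stand-in for the dcast() call).
sums <- tapply(tt$a, tt$b, sum)

# With the default of 7 significant digits, values of this size are
# normally displayed without any decimals.
print(sums)

# The fractional parts are present in the stored values nonetheless.
frac <- sums - round(sums)
print(frac)
```

Because the same thing happens with base R alone, this suggests the effect comes from printing rather than from dcast() itself.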
To better see the digits:
However, there is no problem with the underlying numbers. Regardless of the value of digits, the fractional part is still there.
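As a rough illustration, the sketch below takes a single hypothetical value (x is made up for this example, not taken from the question) and formats it under two different digits settings. Only the displayed text changes; x itself is never modified.

```r
x <- 1234567.891                 # hypothetical value with a fractional part

old <- options(digits = 7)       # 7 is R's default number of significant digits
shown_default <- format(x)       # rounded for display only
print(shown_default)

options(digits = 10)             # ask for more significant digits
shown_more <- format(x)          # now the decimals become visible
print(shown_more)

options(old)                     # restore the previous setting
```

Saving the return value of options() and restoring it afterwards keeps the change from leaking into the rest of the session.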
The difference between what a number is versus how it is printed can be demonstrated thusly:
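One way to sketch this, assuming nothing beyond base R, is to format the built-in constant pi under two digits settings and check that the stored value is untouched. The variable names here are illustrative only.

```r
before <- pi                     # keep a copy of the stored value

old <- options(digits = 7)       # R's default
pi_short <- format(pi)           # displayed with 7 significant digits
print(pi_short)

options(digits = 15)
pi_long <- format(pi)            # displayed with 15 significant digits
print(pi_long)

options(old)                     # restore the previous setting

# The stored value is bit-for-bit the same as before.
unchanged <- identical(before, pi)
print(unchanged)
```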
At no point did the real value of pi change, just how it is shown on the console.