Cannot merge in data.table: "Coerced double RHS to logical" warning


I got this warning while merging one column from a data frame called data.all into my working data frame called data:

setDT(data)[setDT(data.all), RX_HOSP_SURG_APPR_2010 := i.RX_HOSP_SURG_APPR_2010, on=c("PUF_CASE_ID","SR_ID" )]

Warning message: In [.data.table(setDT(data), setDT(data.all), :=(RX_HOSP_SURG_APPR_2010, : Coerced double RHS to logical to match the type of the target column (column 157 named 'RX_HOSP_SURG_APPR_2010'). If the target column's type logical is correct, it's best for efficiency to avoid the coercion and create the RHS as type logical. To achieve that consider R's type postfix: typeof(0L) vs typeof(0), and typeof(NA) vs typeof(NA_integer_) vs typeof(NA_real_). You can wrap the RHS with as.logical() to avoid this warning, but that will still perform the coercion. If the target column's type is not correct, it's best to revisit where the DT was created and fix the column type there; e.g., by using colClasses= in fread(). Otherwise, you can change the column type now by plonking a new column (of the desired type) over the top of it; e.g. DT[, RX_HOSP_SURG_APPR_2010:=as.double(RX_HOSP_SURG_APPR_2010)]. If the RHS of := has nrow(DT) elements then the assignment is called a column plonk and is the way to change a column's type. Column types can be observed with sapply(DT,typeof) [... truncated]

I tried different approaches but could not figure this out. The two columns look like this:

str(data$RX_HOSP_SURG_APPR_2010)

logi [1:8671] FALSE FALSE FALSE NA NA NA ...

str(data.all$RX_HOSP_SURG_APPR_2010)

'haven_labelled' num [1:129296] 0 0 NA NA NA NA NA NA NA NA ...
 - attr(*, "label")= chr "Surgical Approach at this Facility 2010 and Later"
 - attr(*, "format.spss")= chr "F1.0"
 - attr(*, "display_width")= int 23
 - attr(*, "labels")= Named num [1:7] 0 1 2 3 4 5 9
  ..- attr(*, "names")= chr [1:7] "No surgical procedure of primary site" "Robotic assisted" "Robotic converted to open" "Laparoscopic" ...

Any advice will be appreciated.


There is 1 best solution below


You could share dput(head(data)) and dput(head(data.all)) of your "gigantic" data. Please improve your question.

To assign on the fly during the join, both columns need to have the same type. As you noticed, your variable in data is logical (probably because it contained only zeros and NAs when you read it from a file), while your variable in data.all is a 'haven_labelled' numeric.
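To see why the type mismatch matters here, this small sketch shows what a double-to-logical coercion does to codes like the ones in your labels attribute. The values in codes are made up for illustration; they are not taken from your data.

```r
# Hypothetical approach codes (0-9), similar to the "labels" attribute shown above
codes <- c(0, 1, 3, 9, NA)

# Coercing to logical keeps only "zero vs. non-zero": every non-zero code becomes TRUE
as_flags <- as.logical(codes)
print(as_flags)
```

If your real column holds codes other than 0, keeping the target column logical would therefore collapse them, so changing the target column to double is probably what you want.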

You can try to assign the class first with:

class(data$RX_HOSP_SURG_APPR_2010) <- class(data.all$RX_HOSP_SURG_APPR_2010)
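If changing the class alone does not change the underlying storage type (worth checking with typeof()), an alternative is the "column plonk" that the warning message itself suggests: convert the target column to double first, then run the update join. Below is a minimal sketch on toy tables; the key values and codes are invented, and the use of as.numeric() to get a plain double from your labelled column is an assumption you should verify on your own data.

```r
library(data.table)

# Toy stand-ins for the two tables (all values are made up for illustration)
data <- data.table(
  PUF_CASE_ID = c("a", "b", "c"),
  SR_ID = c(1L, 1L, 1L),
  RX_HOSP_SURG_APPR_2010 = c(FALSE, NA, NA)  # logical, as in str(data$...)
)
data.all <- data.table(
  PUF_CASE_ID = c("a", "b", "c", "d"),
  SR_ID = c(1L, 1L, 1L, 1L),
  RX_HOSP_SURG_APPR_2010 = c(0, 3, NA, 5)    # double codes
)

# Column plonk: overwrite the logical target column with a double version
data[, RX_HOSP_SURG_APPR_2010 := as.double(RX_HOSP_SURG_APPR_2010)]

# Assumption: as.numeric() yields a plain double on the source side as well
data.all[, RX_HOSP_SURG_APPR_2010 := as.numeric(RX_HOSP_SURG_APPR_2010)]

# Update join: both columns are now double, so no coercion is needed
data[data.all, RX_HOSP_SURG_APPR_2010 := i.RX_HOSP_SURG_APPR_2010,
     on = c("PUF_CASE_ID", "SR_ID")]

# Check the resulting column types
print(sapply(data, typeof))
```

Note that the update join overwrites the value in every matched row of data, including with NA where data.all has NA.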