R: autoKrige.cv function in automap package generates NaNs

I'm fairly new to R and I am trying to interpolate temperature measurements that were gathered from different stations across the Netherlands. I have data for about 35 stations that take measurements every 10 minutes, covering a timespan of about two weeks. Accordingly, I figured it would be best to write a loop that takes care of this. To see how well the interpolation technique works, I want to do a cross-validation for every timestamp.

To do this I used the autoKrige.cv function from the automap package, and then the compare.cv function from the same package to get an overview of the most important statistics for all timestamps. In addition, I made sure the cross-validation is only done if more than 25 stations registered measurements.

The problem, however, is that the code below works most of the time but gives the following warnings in 4 cases:

 1. In sqrt(ret[[var.name]]) : NaNs produced
 2. In sqrt(ret[[var.name]]) : NaNs produced
 3. In sqrt(ret[[var.name]]) : NaNs produced
 4. In sqrt(ret[[var.name]]) : NaNs produced

When I try to use compare.cv on the full list of cross-validations, it gives me the following error:

"Error in quantile.default(as.numeric(x), c(0.25, 0.75), na.rm = na.rm,  : 
  missing values and NaN's not allowed if 'na.rm' is FALSE"

I'm wondering what causes the autoKrige.cv function to generate NaNs in the cross-validation and, more importantly, how I can remove them from results.cv so that I can use the compare.cv function.

    rm(list=ls())

    # load packages
    library(sp)
    library(gstat)
    library(ggmap)
    library(automap)
    library(ggplot2)

    # load data; replace "download path" with the local path of the file downloaded from
    # https://www.dropbox.com/s/qmi3loub29e55io/meassurements_aug.RDS?dl=0
    load("download path")

    # make data spatial and assign spatial coordinate system
    coordinates(meassurements) = ~x+y
    proj4string(meassurements) <- CRS("+init=epsg:4326")
    meassurements_df <- as.data.frame(meassurements)

    # loop for cross validation
    timestamp <- meassurements$import_log_id
    results.cv = list()

    for (i in unique(timestamp)) {
      x = meassurements_df[which(meassurements$import_log_id == i), ]
      # cross-validate only timestamps with more than 25 non-missing temperatures
      if (sum(!is.na(x$temperature)) > 25) {
        results.cv[[paste0(i)]] = autoKrige.cv(temperature ~ 1, meassurements[which(meassurements$import_log_id == i & !is.na(meassurements$temperature)), ])
      }
    }

    # calculate key statistics (RMSE, MAE etc.)
    compare.cv(results.cv)

Thanks!

There is 1 answer below

I came across the same problem and solved it by calling remove.duplicates() from the sp package on the SpatialPointsDataFrame used for kriging. Prior to that, I replaced the relevant variables in the data frame with their mean within each group, so that the point that is kept carries the averaged values.

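    library(sp)     # remove.duplicates()
    library(dplyr)  # group_by(), mutate_at(), ungroup(), %>%

    # replace each relevant variable by its mean within the group
    SPDF@data <- SPDF@data %>%
      group_by(varx, vary, varz) %>%
      mutate_at(vars(one_of(relevant_var)), mean, na.rm = TRUE) %>%
      ungroup()
    # drop points that share their coordinates with an earlier point
    SPDF <- SPDF %>% remove.duplicates()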
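
Points that share identical coordinates make the kriging system singular, which is a plausible (not confirmed) source of the NaNs here. As a rough sketch of the same idea in base R, the block below flags duplicated coordinates and averages the measurements per location before the data are made spatial. The column names `x`, `y` and `temperature` are borrowed from the question; the `stations` table and its values are made up for illustration.

```r
# Mock station table (hypothetical values); in practice this would be the
# subset of measurements for a single timestamp.
stations <- data.frame(
  x = c(5.18, 5.18, 4.90, 6.57),
  y = c(52.10, 52.10, 52.37, 53.22),
  temperature = c(18.0, 19.0, 17.5, 16.0)
)

# TRUE for rows whose coordinate pair already occurred earlier in the table
dup <- duplicated(stations[, c("x", "y")])

# One row per unique coordinate pair, carrying the mean temperature
averaged <- aggregate(temperature ~ x + y, data = stations, FUN = mean)
```

If `sum(dup)` is zero for every timestamp, duplicate locations are not the cause and the NaNs have to come from somewhere else.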

When I ran into this problem, the Dropbox link above no longer worked, so I could not check this specific example.
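
For the second part of the question (dropping the affected results so that compare.cv can run), a minimal sketch could filter the list before passing it on. It assumes that each autoKrige.cv result stores its cross-validation table in `krige.cv_output`, with `residual` and `zscore` columns; `cv_is_complete` is a hypothetical helper, and the mock list uses plain data frames so the sketch runs without automap. Note that this only hides the symptom, it does not explain why the NaNs appear.

```r
# Hypothetical helper: TRUE when none of the chosen columns of a
# cross-validation result contains NA or NaN.
cv_is_complete <- function(cv, cols = c("residual", "zscore")) {
  out <- cv$krige.cv_output
  all(vapply(cols, function(col) !anyNA(out[[col]]), logical(1)))
}

# Mock results standing in for autoKrige.cv() output (made-up values),
# keyed by timestamp as in the question's loop.
results.cv <- list(
  "101" = list(krige.cv_output = data.frame(residual = c(0.2, -0.1),
                                            zscore   = c(0.5, -0.3))),
  "102" = list(krige.cv_output = data.frame(residual = c(0.4, 0.1),
                                            zscore   = c(NaN, 0.2)))
)

# Keep only the timestamps whose results are free of NA/NaN
keep <- vapply(results.cv, cv_is_complete, logical(1))
results.clean <- results.cv[keep]

# Timestamps that were dropped, worth inspecting for duplicate locations
dropped <- names(results.cv)[!keep]
```

With the real list, `compare.cv(results.clean)` would then be called in place of `compare.cv(results.cv)`.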