Geographically weighted logistic regression different dimensions

I am trying to fit a geographically weighted logistic regression to quantify the spatial variation in my data set. However, when I run the gwr model, I get an error saying that my input data and coordinates have different dimensions.

This is the code I used for the border of the Netherlands:

library(raster)  # provides shapefile()

unzip("ne_10m_admin_1_states_provinces.zip", exdir = "NaturalEarth")
border <- shapefile("NaturalEarth/ne_10m_admin_1_states_provinces.shp")

# extract the border of the Netherlands
Netherlands1 <- border[paste(border$iso_a2) == "NL", ]

My data has a binary outcome (0/1) regarding the prevalence of a pathogen.

data_coord <- data[, 1:2]  # extract coordinates (columns 1 and 2)

# convert to a SpatialPointsDataFrame
sp.data <- SpatialPointsDataFrame(coords = data_coord, data = data_full3.p,
                                  proj4string = CRS("+proj=longlat +datum=WGS84 +no_defs"))

Next I ran a logistic regression. There were no problems here.

# ordinary (non-spatial) logistic regression
m <- glm(pathogen ~ Age_Category,
         family = binomial(link = "logit"), data = sp.data)
summary(m)

Then I transformed the data into a Spatial* object with a planar CRS:

alb <- CRS("+proj=utm +zone=31N +datum=WGS84")
sp <- sp.data

spt <- spTransform(sp, alb)
ctst <- spTransform(Netherlands1, alb)

library(spgwr)  # provides gwr.sel() and gwr()

# get optimal bandwidth
bw <- gwr.sel(pathogen ~ Age_Category, data = spt)
bw

But as soon as I run this line, I get an error:

#run gwr function 
g <- gwr(pathogen ~ Age_Category, data=spt, bandwidth=bw, fit.points=newpts[, 1:2])

Error in gwr(pathogen ~ Age_Category, data = spt, bandwidth = bw, : Input data and coordinates have different dimensions

Would anyone know how to solve this? Thank you in advance!

There is 1 best solution below

Looking at the function code, it looks like the issue is here:

if (NROW(x) != NROW(coords))
    stop("Input data and coordinates have different dimensions")

Here, x is taken from a model.frame object created earlier in the function. By default, model.frame drops any rows that contain NA or NaN values in the model variables, so x can end up with fewer rows than the coordinates, which are taken from the full dataset.
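To illustrate that behaviour, here is a minimal base-R sketch. The data frame `toy` and its values are invented for the example; only the column names `pathogen` and `Age_Category` are borrowed from the question. It assumes the default `na.action` option (`na.omit`) has not been changed.

```r
# Toy data: invented values, column names borrowed from the question
toy <- data.frame(
  x            = c(5.1, 5.2, 5.3, 5.4, 5.5),
  y            = c(52.1, 52.2, 52.3, 52.4, 52.5),
  pathogen     = c(0, 1, NA, 1, 0),
  Age_Category = factor(c("adult", "juvenile", "adult", NA, "adult"))
)

# With the default na.action (na.omit), rows with missing values are dropped
mf <- model.frame(pathogen ~ Age_Category, data = toy)

n_model  <- NROW(mf)                  # rows that survive in the model frame
n_coords <- NROW(toy[, c("x", "y")])  # rows in the full coordinate table

# The two counts differ, which is the condition the check above tests for
n_model != n_coords
```

If the two counts differ on your real data, missing values in the model variables are a likely cause of the error.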

I wasn't able to unzip your file, but I would check whether your data contain any missing values. I had the same issue with ggwr, and removing the rows with missing values resolved the problem.
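One way to do that check is sketched below in base R. `drop_incomplete` is a hypothetical helper written for this example (it is not part of spgwr or sp), and `toy` is invented stand-in data with column names borrowed from the question.

```r
# Hypothetical helper: keep only rows with no missing values in the model variables
drop_incomplete <- function(df, vars) {
  keep <- complete.cases(df[, vars, drop = FALSE])
  df[keep, , drop = FALSE]
}

# Invented stand-in for the real data
toy <- data.frame(
  lon          = c(5.1, 5.2, 5.3, 5.4, 5.5),
  lat          = c(52.1, 52.2, 52.3, 52.4, 52.5),
  pathogen     = c(0, 1, NA, 1, 0),
  Age_Category = c("adult", "juvenile", "adult", NA, "adult")
)

# How many missing values does each model variable have?
na_counts <- colSums(is.na(toy[, c("pathogen", "Age_Category")]))

# Drop the incomplete rows; coordinates and attributes stay in the same rows
clean <- drop_incomplete(toy, c("pathogen", "Age_Category"))
n_removed <- nrow(toy) - nrow(clean)
```

In the question's workflow, the same filter would be applied before calling SpatialPointsDataFrame, with both the coordinates and the attribute table taken from the cleaned data frame so that their row counts stay aligned.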