I'm trying to do some GIS work using R. Specifically, I have a SpatialPointsDataFrame (called 'points') and a SpatialLinesDataFrame (called 'lines'). I want to find the closest line to each point. I do this:
library(rgeos)  # provides gDistance()
# make a new field to hold the line ID
points@data$nearest_line <- ""
# Loop through the data. For each point, get the ID of the nearest line and store it
for (i in seq_len(nrow(points))) {
  points@data[i, "nearest_line"] <-
    lines[which.min(gDistance(points[i, ], lines, byid = TRUE)), ]@data$line_id
}
This works fine. My issue is the size of my data. I have 4.5 million points and about 100,000 lines. It has been running for about a day so far and has only processed 200,000 of the 4.5 million points (despite a fairly powerful computer).
Is there something I can do to speed this up? For example, if I were doing this in PostGIS I would add a spatial index, but that doesn't seem to be an option in R.
Or maybe I'm approaching this totally wrong?
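On the spatial index point: one direction I have come across but not verified at this scale is the sf package's st_nearest_feature(), which, as far as I understand, relies on a GEOS spatial index internally and handles all points in a single call instead of a loop. Below is a minimal sketch on toy data. The coordinates and the names pts, lns and idx are made up for illustration, and I am assuming the real sp objects could be converted with st_as_sf() first.

```r
library(sf)

# Toy data standing in for the real layers (hypothetical coordinates and IDs).
# For the real sp objects, the assumption is: pts <- st_as_sf(points); lns <- st_as_sf(lines)
pts <- st_as_sf(
  data.frame(x = c(0, 5, 10), y = c(1, 4, 9)),
  coords = c("x", "y")
)
lns <- st_sf(
  line_id = c("a", "b"),
  geometry = st_sfc(
    st_linestring(rbind(c(0, 0), c(10, 0))),   # runs along y = 0
    st_linestring(rbind(c(0, 10), c(10, 10)))  # runs along y = 10
  )
)

# One vectorised call: for each point, the row index of the nearest line
idx <- st_nearest_feature(pts, lns)

# Look up the line ID for each point from that index
pts$nearest_line <- as.character(lns$line_id[idx])
```

If this behaves the way I hope, it would replace the per-point gDistance() call, which computes the distance to every one of the 100,000 lines on each iteration.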