I have a matrix of detailed real-world routes which I want to efficiently turn into a simple spatial network. "Simple" means that I don't care about the intricacies of local transport or about possible intersections of routes near the start and end points. I do want to add major intersections away from the start and end points as nodes to the network. Below I give a simple example. My real data has 12,500 routes between 500 start and end points and is approx. 2 GB in size.
library(fastverse)
#> -- Attaching packages --------------------------------------- fastverse 0.3.2 --
#> v data.table 1.15.0 v kit 0.0.13
#> v magrittr 2.0.3 v collapse 2.0.12
fastverse_extend(osrm, sf, sfnetworks, install = TRUE)
#> -- Attaching extension packages ----------------------------- fastverse 0.3.2 --
#> v osrm 4.1.1 v sfnetworks 0.6.3
#> v sf 1.0.16
largest_20_german_cities <- data.frame(
  city = c("Berlin", "Stuttgart", "Munich", "Hamburg", "Cologne", "Frankfurt",
           "Duesseldorf", "Leipzig", "Dortmund", "Essen", "Bremen", "Dresden",
           "Hannover", "Nuremberg", "Duisburg", "Bochum", "Wuppertal", "Bielefeld", "Bonn", "Muenster"),
  lon = c(13.405, 9.18, 11.575, 10, 6.9528, 8.6822, 6.7833, 12.375, 7.4653, 7.0131,
          8.8072, 13.74, 9.7167, 11.0775, 6.7625, 7.2158, 7.1833, 8.5347, 7.1, 7.6256),
  lat = c(52.52, 48.7775, 48.1375, 53.55, 50.9364, 50.1106, 51.2333, 51.34, 51.5139,
          51.4508, 53.0758, 51.05, 52.3667, 49.4539, 51.4347, 51.4819, 51.2667, 52.0211, 50.7333, 51.9625))
# Unique routes
m <- matrix(1, 20, 20)
diag(m) <- NA
m[upper.tri(m)] <- NA
routes_ind <- which(!is.na(m), arr.ind = TRUE)
rm(m)
# Routes DF
routes <- data.table(from_city = largest_20_german_cities$city[routes_ind[, 1]],
                     to_city = largest_20_german_cities$city[routes_ind[, 2]],
                     duration = NA_real_,
                     distance = NA_real_,
                     geometry = list())
# Fetch Routes
i <- 1L
for (r in mrtl(routes_ind)) {
  route <- osrmRoute(ss(largest_20_german_cities, r[1], c("lon", "lat")),
                     ss(largest_20_german_cities, r[2], c("lon", "lat")), overview = "full")
  set(routes, i, 3:5, fselect(route, duration, distance, geometry))
  i <- i + 1L
}
routes %<>% st_as_sf(crs = st_crs(route))
routes_net = as_sfnetwork(routes, directed = FALSE)
print(routes_net)
#> # A sfnetwork with 20 nodes and 190 edges
#> #
#> # CRS: EPSG:4326
#> #
#> # An undirected simple graph with 1 component with spatially explicit edges
#> #
#> # A tibble: 20 × 1
#> geometry
#> <POINT [°]>
#> 1 (9.179999 48.7775)
#> 2 (13.405 52.52)
#> 3 (11.57486 48.13675)
#> 4 (10.00001 53.54996)
#> 5 (6.95285 50.9364)
#> 6 (8.68202 50.1109)
#> # ℹ 14 more rows
#> #
#> # A tibble: 190 × 7
#> from to from_city to_city duration distance geometry
#> <int> <int> <chr> <chr> <dbl> <dbl> <LINESTRING [°]>
#> 1 1 2 Stuttgart Berlin 390. 633. (9.179999 48.7775, 9.18005 48…
#> 2 2 3 Munich Berlin 356. 586. (11.57486 48.13675, 11.57486 …
#> 3 2 4 Hamburg Berlin 176. 288. (10.00001 53.54996, 10.0002 5…
#> # ℹ 187 more rows
plot(routes_net)

Created on 2024-03-28 with reprex v2.0.2
Regarding possible solutions, I am open to any software (R, Python, QGIS, etc.). I know that in R, tidygraph's convert() together with the to_spatial_subdivision morpher from sfnetworks allows me to do something like
library(tidygraph)
routes_net_subdiv = convert(routes_net, to_spatial_subdivision)
But this seems to run forever, even with this mock example. I have also seen ideas to use GRASS's v.clean tool to break up the geometry, but I haven't tried that yet and am a bit reluctant to install GRASS.
I think the best solution for performance is perhaps to convert to S2, compare all linestrings pairwise using s2_intersection(), and then somehow turn this information into a graph. But I am hoping for more elegant and performant solutions.
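For illustration, the pairwise-intersection idea can be sketched in plain Python. This is only a minimal sketch under stated assumptions, not a performant implementation: routes are taken to be lists of planar (x, y) tuples, the function names `crossing` and `split_at_crossings` are made up for this example, and only point crossings are handled. Stretches where two routes share the same road (collinear overlaps) are ignored here and would need separate treatment. The loop is quadratic in the number of segments, so real data would need a spatial index.

```python
from itertools import combinations


def crossing(p, q, r, s, eps=1e-12):
    """Return the point where segment p-q meets segment r-s, or None.

    Parallel and collinear (overlapping) segments are deliberately skipped.
    """
    d1x, d1y = q[0] - p[0], q[1] - p[1]
    d2x, d2y = s[0] - r[0], s[1] - r[1]
    denom = d1x * d2y - d1y * d2x          # cross product of the directions
    if abs(denom) < eps:
        return None
    t = ((r[0] - p[0]) * d2y - (r[1] - p[1]) * d2x) / denom
    u = ((r[0] - p[0]) * d1y - (r[1] - p[1]) * d1x) / denom
    if 0 <= t <= 1 and 0 <= u <= 1:
        return (p[0] + t * d1x, p[1] + t * d1y)
    return None


def split_at_crossings(routes):
    """Split every polyline at each point where it crosses another polyline."""
    # cuts[i] holds (segment index, crossing point) pairs for route i
    cuts = [[] for _ in routes]
    for (i, a), (j, b) in combinations(enumerate(routes), 2):
        for ka in range(len(a) - 1):
            for kb in range(len(b) - 1):
                pt = crossing(a[ka], a[ka + 1], b[kb], b[kb + 1])
                if pt is not None:
                    cuts[i].append((ka, pt))
                    cuts[j].append((kb, pt))

    pieces = []
    for i, route in enumerate(routes):
        piece = [route[0]]
        for k in range(len(route) - 1):
            start = route[k]
            # crossings on this segment, ordered by distance from its start
            on_seg = sorted(
                (pt for seg, pt in cuts[i] if seg == k),
                key=lambda pt: (pt[0] - start[0]) ** 2 + (pt[1] - start[1]) ** 2,
            )
            for pt in on_seg:
                if pt != piece[-1]:        # skip duplicates and zero-length pieces
                    piece.append(pt)
                    pieces.append(piece)
                    piece = [pt]
            if route[k + 1] != piece[-1]:
                piece.append(route[k + 1])
        if len(piece) > 1:
            pieces.append(piece)
    return pieces


# Toy input (made-up coordinates): two routes forming an "X"
routes = [[(0.0, 0.0), (2.0, 2.0)], [(0.0, 2.0), (2.0, 0.0)]]
pieces = split_at_crossings(routes)
```

In this toy case each route is cut once at the crossing, so the two routes become four pieces that all touch the same point. The endpoints of the pieces are then the candidate nodes of the network.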
With Python, and IIUC, you can use this primal approach after getting the available directions:
NB: You'll need to sign up for openrouteservice in order to generate/get your API key.
Now, to make the graph, you can use momepy:
Plot (optional):
Used imports/input:
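To illustrate what a primal graph is, independently of momepy: each line piece becomes an edge and its two endpoints become nodes, so pieces that share an endpoint end up connected. The following is a minimal standard-library sketch, not momepy's implementation. The function name `primal_graph`, the rounding precision `ndigits`, and the toy coordinates are all assumptions made for this example, and lengths are computed in plain coordinate units, so lon/lat data would first need projecting (or a geodesic distance).

```python
from collections import defaultdict
from math import hypot


def primal_graph(pieces, ndigits=6):
    """Build an undirected primal graph from line pieces.

    Endpoints (rounded to `ndigits`) become nodes, pieces become edges.
    Returns (node_id, edges): a dict mapping rounded endpoint -> node id,
    and a list of (from_node, to_node, length) tuples.
    """
    node_id = {}
    edges = []

    def node_for(pt):
        # Rounding snaps endpoints that differ only by floating-point noise
        key = (round(pt[0], ndigits), round(pt[1], ndigits))
        return node_id.setdefault(key, len(node_id))

    for piece in pieces:
        length = sum(
            hypot(x2 - x1, y2 - y1)
            for (x1, y1), (x2, y2) in zip(piece, piece[1:])
        )
        edges.append((node_for(piece[0]), node_for(piece[-1]), length))
    return node_id, edges


# Toy input (made-up coordinates): four pieces meeting at (1, 1)
pieces = [
    [(0.0, 0.0), (1.0, 1.0)],
    [(1.0, 1.0), (2.0, 2.0)],
    [(0.0, 2.0), (1.0, 1.0)],
    [(1.0, 1.0), (2.0, 0.0)],
]
node_id, edges = primal_graph(pieces)

# Node degree: nodes with three or more incident edges are junctions
degree = defaultdict(int)
for u, v, _ in edges:
    degree[u] += 1
    degree[v] += 1
junctions = [xy for xy, n in node_id.items() if degree[n] >= 3]
```

Here the only junction is the shared point of the four pieces, and the four outer endpoints are dead ends with degree 1. Note that a primal graph only contains intersections as nodes if the lines were already split there.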