I am interested in extracting the cell values alongside their corresponding x and y coordinates from a tif file accessible from the WorldPop database [ https://hub.worldpop.org/geodata/summary?id=49920 ].
I have converted this file, along with other tif files available on this website, into rasters and then used the `rasterToPoints()` function in R to extract this information. Although this approach has worked for most of the files, it has failed for this particular file, among a few others: the R session simply hangs and the code never finishes when I attempt to convert the raster to a SpatialPointsDataFrame.
```r
library(raster)
Raster <- raster("C:/file path/aus_ppp_2020_UNadj_constrained.tif")
Raster <- rasterToPoints(Raster, spatial = TRUE)
```
As an alternative, I thought I could extract the coordinates after obtaining the cell values using `getValues()` or `readAll()`, but because the raster is so large I run into the following error:

```
Error: cannot allocate vector of size 17.8 Gb
```
```r
sessionInfo()
# R version 4.2.0 (2022-04-22 ucrt)
# Platform: x86_64-w64-mingw32/x64 (64-bit)
# Running under: Windows 10 x64 (build 22000)

library(memuse)
memuse::Sys.meminfo()
# Totalram:  31.781 GiB
# Freeram:   26.164 GiB
```
I then tried to see if I could increase the usable memory using `memory.limit()`, but this function has been retired as of R version 4.2 and I cannot find an alternative.
```r
memory.limit()
# Warning: 'memory.limit()' is no longer supported
# [1] Inf
```
I was wondering if anyone knows:
1. If there is a way I could get the `rasterToPoints()` function to work for this raster.
2. If there is a way to subset the raster into smaller rasters, while retaining all data, so that I could use `rasterToPoints()` on each subset and then merge the resulting spatial point data frames.
3. If there is an alternative way to extract the coordinates alongside the cell values for this tif file.
Any help is greatly appreciated.
Although it seems like a questionable idea to break such an efficient storage format apart in general - it's not like the relation between values and coordinates is lost in a raster - there is a pretty convenient way to do so using `terra`. However, this case seems to be an exception to the general rule, since your raster is huge - 55,075 rows by 74,928 columns - with only 2,910,517 values being non-`NA` according to `global(r, "notNA")`. In other words, 99.9 % of the values of your raster are `NA`, so I get your point in trying to convert this.
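For reference, the non-`NA` count can be reproduced like this (a minimal snippet; the file name is taken from your question and assumed to sit in the working directory):

```r
library(terra)

r <- rast("aus_ppp_2020_UNadj_constrained.tif")

# count the cells that are not NA
global(r, "notNA")
# 2910517
```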
Unfortunately, my first attempt had some flaws: the `as.polygons(r) |> as.points() |> as.data.frame(geom = "XY")` pipe only returned a result because of the `dissolve = TRUE` default reducing the number of objects, and on top of that the output consisted of the nodes of the dissolved polygons. So I'd like to correct my answer by making use of `makeTiles()`, which seems to be an appropriate approach when dealing with huge raster data and facing `std::bad_alloc` errors.

Read your raster `r` and create a second raster `t`, used for tiling, from the extent and CRS of `r` but with a significantly coarser resolution - each cell of `t` will define one tile:
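A minimal sketch of this step - the 14 x 19 grid passed via `nrows`/`ncols` is my assumption, chosen to match the tile count mentioned below:

```r
library(terra)

# read the original raster
r <- rast("aus_ppp_2020_UNadj_constrained.tif")

# template raster t: same extent and crs as r, but only 14 x 19 cells,
# so each cell of t defines the extent of one tile
t <- rast(ext(r), nrows = 14, ncols = 19, crs = crs(r))

# write the tiles to disk; returns the file names tile_1.tif, tile_2.tif, ...
tiles <- makeTiles(r, t, filename = "tile_.tif")
```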
As a result, you'll find 14 x 19 tiles - this process takes a few minutes - named `tile_*.tif` in your working directory. Maybe there is a more elegant way to solve this without writing the subsets of your raster to disk, but on the other hand, you want to prevent RAM overflow, so writing to disk (at least temporarily) seems quite appropriate. There is also probably no need to split your raster into exactly 266 tiles - one could additionally check for the critical number of values per raster; I just picked a value by trial and error - to speed this up a little. However, if time is not critical, this workflow should at least produce proper results.
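To collect everything into a single data frame, each tile can be read back in, converted to x/y coordinates plus cell values, and the pieces bound together - a sketch assuming the `tiles` file names returned by `makeTiles()` above, with `na.rm = TRUE` dropping the empty cells:

```r
library(terra)

# read each tile, extract coordinates and values, drop NA cells,
# and bind all partial results into one data frame
result <- do.call(rbind, lapply(tiles, function(f) {
  as.data.frame(rast(f), xy = TRUE, na.rm = TRUE)
}))

nrow(result)
# 2910517
```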
Since `result` consists of 2,910,517 rows - exactly the number of non-`NA` cells - I'd consider this approach robust.