I have a dataset of animals that were radio-tracked for 10 years. The columns are date, latitude, longitude, easting, northing, and animal ID. Over the years, some coordinates were recorded incorrectly, so some animal locations fall outside the study area (e.g., as wrong as a fish on land). I want to identify those data points and remove them without going through the rows one by one. I came across the scrubr package but could not get it to work.
I have attached a dummy dataset below; it deals with store locations rather than animals, but similar logic applies (I think).
library(dplyr)
library(tidyverse)
library(sp)
library(ggmap)
library(adehabitatLT)
library(lubridate)
library(mapview)
Upload dummy dataset
starbucks <- read.csv("https://raw.githubusercontent.com/libjohn/mapping-with-R/master/data/All_Starbucks_Locations_in_the_US_-_Map.csv")
Filter data
starbucksNC <- starbucks %>% filter(State == "NC")
Create the map
mapview(starbucksNC, xcol = "Longitude", ycol = "Latitude", crs = 4269, grid = FALSE)
Describe the study area boundary using four data points (longitude, latitude)
-78.00149, 35.40030
-77.90917, 35.40869
-77.99672, 35.35299
-77.89235, 35.35523
From here, I am not sure how to remove from the original dataset the GPS coordinates that lie outside that boundary.
Here's how you'd use the sf package to define a bounding box and isolate the points within it:
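One way this could look, as a minimal sketch rather than a definitive implementation: build a rectangle from the minimum and maximum longitude and latitude of the four corner points, convert the rows to sf points, and keep only the rows that intersect the rectangle. The helper name `keep_inside_bbox()` and the `corners` data frame are hypothetical names introduced for illustration, and the CRS (EPSG:4269) is assumed from the `mapview()` call in the question.

```r
library(sf)

# The four study-area corner points from the question (longitude, latitude)
corners <- data.frame(
  lon = c(-78.00149, -77.90917, -77.99672, -77.89235),
  lat = c(35.40030, 35.40869, 35.35299, 35.35523)
)

# keep_inside_bbox() is a hypothetical helper, not a function from sf.
# It returns the rows of `df` whose coordinates fall inside the rectangle
# spanned by the corner points.
keep_inside_bbox <- function(df, corners,
                             xcol = "Longitude", ycol = "Latitude",
                             crs = 4269) {
  # Rectangle covering the min/max longitude and latitude of the corners
  bbox <- st_bbox(
    c(xmin = min(corners$lon), ymin = min(corners$lat),
      xmax = max(corners$lon), ymax = max(corners$lat)),
    crs = st_crs(crs)
  )
  study_area <- st_as_sfc(bbox)

  # Rows with missing coordinates cannot be placed on the map, so drop them
  df <- df[!is.na(df[[xcol]]) & !is.na(df[[ycol]]), , drop = FALSE]

  # Convert to sf points, keeping the original coordinate columns
  pts <- st_as_sf(df, coords = c(xcol, ycol), crs = crs, remove = FALSE)

  # TRUE for every point that intersects the study-area rectangle
  inside <- lengths(st_intersects(pts, study_area)) > 0

  # Return plain data-frame rows so the result has the original columns
  df[inside, , drop = FALSE]
}
```

With the question's data this would be called as `starbucksNC_inside <- keep_inside_bbox(starbucksNC, corners)`. The same call should work on the tracking data, provided `xcol` and `ycol` name its longitude and latitude columns. Note that this keeps everything inside the bounding rectangle of the four points, which is slightly larger than the quadrilateral they form.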
Created on 2023-05-16 with reprex v2.0.2