How to merge two dataframes when column values are not exact?

123 Views Asked by At

I have:

  1. Linearly interpolated the dFe_env data every 1 m and create a data frame (This works)
  2. Extracted the 'Depth' (based on sinking rate) in 30 minute intervals (This works)
  3. Created a 'Time' column where it increases every 30 minutes (This works)

How do I:

  1. Merge two dataframes together (Bckgd_env2 and bulk_Fe2). In 'bulk_Fe2' the Depth increases by 1m and in 'Bckgd_env2' the depth increases by 0.8m. Can I get the closest 'Depth' match, extract the dFe_env at that depth and create a new data frame with Depth, Time and dFe_env all together?

    library(dplyr)
    
    Depth    <- c(0, 2, 20, 50, 100, 500, 800, 1000, 1200, 1500)
    dFe_env  <- c(0.2, 0.2, 0.3, 0.4, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1)
    bulk_Fe <- data.frame(Depth, dFe_env)
    
    summary(bulk_Fe)
    is.data.frame(bulk_Fe)
    
    do_interp <- function(dat, Depth = seq(0,1500, by=1)) {
    out <- tibble(Depth = Depth)
    for (var in c("dFe_env")) {
    out[[var]] <- tryCatch(approx(dat$Depth, dat[[var]], Depth)$y, method="ngb", error = function(e) NA_real_)
    }
    out
    }
    
    bulk_Fe2 <- bulk_Fe %>% do(do_interp(.))
    bulk_Fe2
    summary(bulk_Fe2)
    
    D0 <- 0 #Starting depth 
    T0 <- 0 #Starting time of the experiment
    
    r <- 40 #sinking rate per day
    
    r_30min <- r/48 #sinking speed every 30 minutes (There are 48 x 30 minute intervals in 24 hours)
    
    
    days <- round(1501/(r)) #days 1501 is maximum depth
    time <- days * 24 * 60 #minutes
    
    n_steps <- 1501/r_30min
    
    Bckgd_env2 <- data.frame(Depth =seq(from = D0, by= r_30min, length.out = n_steps + 1),
                      Time = seq(from = T0, by= 30, length.out = n_steps + 1))
    head(Bckgd_env2)
    round(Bckgd_env2, digits = 1)
    
    Bckgd_env3 <- merge(Bckgd_env2, bulk_Fe2)  
    Bckgd_env3
    
    plot(Bckgd_env2$dFe_env ~ Bckgd_env2$Depth, ylab="dFe (nmol/L)", xlab="Depth (m)", las=1)
    
1

There are 1 best solutions below

3
On

You have already built the mechanism for interpolation which will be useful for the join. But you didn't build it at the right depth values. It is just a matter of reorganizing your code.

Start with buiding Bckgd_env2, and only afterwards compute bulk_Fe2 and bulk_Fe3:

bulk_Fe2 <- bulk_Fe %>% do(do_interp(., Depth=Bckgd_env2$Depth))
Bckgd_env3 <- merge(Bckgd_env2, bulk_Fe2)