How to merge two dataframes when column values are not exact?

136 Views Asked by At

I have:

  1. Linearly interpolated the dFe_env data every 1 m and create a data frame (This works)
  2. Extracted the 'Depth' (based on sinking rate) in 30 minute intervals (This works)
  3. Created a 'Time' column where it increases every 30 minutes (This works)

How do I:

  1. Merge two dataframes together (Bckgd_env2 and bulk_Fe2). In 'bulk_Fe2' the Depth increases by 1m and in 'Bckgd_env2' the depth increases by 0.8m. Can I get the closest 'Depth' match, extract the dFe_env at that depth and create a new data frame with Depth, Time and dFe_env all together?

    library(dplyr)
    
    Depth    <- c(0, 2, 20, 50, 100, 500, 800, 1000, 1200, 1500)
    dFe_env  <- c(0.2, 0.2, 0.3, 0.4, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1)
    bulk_Fe <- data.frame(Depth, dFe_env)
    
    summary(bulk_Fe)
    is.data.frame(bulk_Fe)
    
    do_interp <- function(dat, Depth = seq(0,1500, by=1)) {
    out <- tibble(Depth = Depth)
    for (var in c("dFe_env")) {
    out[[var]] <- tryCatch(approx(dat$Depth, dat[[var]], Depth)$y, method="ngb", error = function(e) NA_real_)
    }
    out
    }
    
    bulk_Fe2 <- bulk_Fe %>% do(do_interp(.))
    bulk_Fe2
    summary(bulk_Fe2)
    
    D0 <- 0 #Starting depth 
    T0 <- 0 #Starting time of the experiment
    
    r <- 40 #sinking rate per day
    
    r_30min <- r/48 #sinking speed every 30 minutes (There are 48 x 30 minute intervals in 24 hours)
    
    
    days <- round(1501/(r)) #days 1501 is maximum depth
    time <- days * 24 * 60 #minutes
    
    n_steps <- 1501/r_30min
    
    Bckgd_env2 <- data.frame(Depth =seq(from = D0, by= r_30min, length.out = n_steps + 1),
                      Time = seq(from = T0, by= 30, length.out = n_steps + 1))
    head(Bckgd_env2)
    round(Bckgd_env2, digits = 1)
    
    Bckgd_env3 <- merge(Bckgd_env2, bulk_Fe2)  
    Bckgd_env3
    
    plot(Bckgd_env2$dFe_env ~ Bckgd_env2$Depth, ylab="dFe (nmol/L)", xlab="Depth (m)", las=1)
    
1

There are 1 best solutions below

3
Pierre Gramme On

You have already built the mechanism for interpolation which will be useful for the join. But you didn't build it at the right depth values. It is just a matter of reorganizing your code.

Start with buiding Bckgd_env2, and only afterwards compute bulk_Fe2 and bulk_Fe3:

bulk_Fe2 <- bulk_Fe %>% do(do_interp(., Depth=Bckgd_env2$Depth))
Bckgd_env3 <- merge(Bckgd_env2, bulk_Fe2)