r - Replace NAs with values according to two index vectors


I have a spatial points data frame with characteristics of houses sold over several years. I appended neighborhood attributes to it using "over" in {sp}. For each year of housing data, the neighborhood data set for that year is joined.

The problem: neighborhood data for different years don't always contain the same variables. Therefore, when joined to the housing data, I obtain NAs in these non-shared variables for houses sold in some particular years.

Ideal solution: for each row with an NA, fill it with the value of the same column (V1) from the same neighborhood (nb) in the closest available year (y).

      [,y]  [,nb] [,V1]
 [1,] 1993 30000 2752
 [2,] 1993 30000 2752
 [3,] 1994 30000 NA
 [4,] 1994 50000 2554
 [5,] 1995 30000 NA
 [6,] 1996 30000 2650
 [7,] 1996 50000 NA

Ideally, replace NAs such that [3,V1] = 2752; [5,V1] = 2650, and [7,V1] = 2554. The data frame contains over 250k obs so looping through the whole thing is rather cumbersome.

Best answer

You can use the function below for your purpose.

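get_rid_of_NAs <- function(urmatrix) {
  # Columns: 1 = year (y), 2 = neighborhood (nb), 3 = value (V1)
  filled <- urmatrix   # write into a copy, read donors from the original

  # Test for missing values with is.na(); `x != NA` is always NA
  for (i in which(is.na(urmatrix[, 3]))) {
    # Donor rows: same neighborhood with a non-missing value
    donors <- which(urmatrix[, 2] == urmatrix[i, 2] & !is.na(urmatrix[, 3]))
    if (length(donors) == 0) next   # nothing to copy from, leave the NA

    # Closest year wins; on a tie the first matching row is used
    nearest <- donors[which.min(abs(urmatrix[donors, 1] - urmatrix[i, 1]))]
    filled[i, 3] <- urmatrix[nearest, 3]
  }
  return(filled)
}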

Now call the function on your matrix.

get_rid_of_NAs(ENTERYOURMATRIXHERE)

R can easily handle a loop of this size, so I would suggest a for loop in this case.
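If the loop does turn out to be slow on the 250k rows mentioned in the question, one option is to search only within each neighborhood instead of scanning the whole data set for every NA. The following is a minimal sketch of that idea, not a benchmarked solution. It assumes the data is a data frame with columns named y, nb and V1 as in the question; the names houses and fill_nearest_year are hypothetical and introduced only for illustration.

```r
# Sample data from the question: year (y), neighborhood (nb), value (V1)
houses <- data.frame(
  y  = c(1993, 1993, 1994, 1994, 1995, 1996, 1996),
  nb = c(30000, 30000, 30000, 50000, 30000, 30000, 50000),
  V1 = c(2752, 2752, NA, 2554, NA, 2650, NA)
)

# Hypothetical helper: within ONE neighborhood, fill each NA in `value`
# with the value from the row whose year is closest.
fill_nearest_year <- function(year, value) {
  known <- which(!is.na(value))
  if (length(known) == 0) return(value)   # no donor rows, leave NAs as they are
  for (i in which(is.na(value))) {
    # which.min() returns the first minimum, so ties go to the earlier row
    closest <- known[which.min(abs(year[known] - year[i]))]
    value[i] <- value[closest]
  }
  value
}

# Group row indices by neighborhood and fill each group separately
groups <- split(seq_len(nrow(houses)), houses$nb)
for (rows in groups) {
  houses$V1[rows] <- fill_nearest_year(houses$y[rows], houses$V1[rows])
}

print(houses)
```

On the sample data this fills rows 3, 5 and 7 with 2752, 2650 and 2554, matching the result asked for in the question. When two donor years are equally close, the sketch simply takes the first one; a different tie rule (for example, preferring the earlier year) would need an explicit sort.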

Seriously, many people here say things like "there are 10min data, R can't handle it, etc." R is not Excel; R was created to handle data.