I have a spatial points data frame with characteristics of houses sold over several years. I appended neighborhood attributes to it using "over" in {sp}. For each year of my housing data, the neighborhood data set for that year is joined.
The problem: neighborhood data for different years don't always contain the same variables. Therefore, when joined to the housing data, I obtain NAs in these non-shared variables for houses sold in some particular years.
Ideal solution: for each row in my data, replace each NA with the value of the same column (V1) from the same neighborhood (nb) in the closest available year (y).
[,y] [,nb] [,V1]
[1,] 1993 30000 2752
[2,] 1993 30000 2752
[3,] 1994 30000 NA
[4,] 1994 50000 2554
[5,] 1995 30000 NA
[6,] 1996 30000 2650
[7,] 1996 50000 NA
Ideally, replace NAs such that [3,V1] = 2752, [5,V1] = 2650, and [7,V1] = 2554. The data frame contains over 250k obs, so looping through the whole thing is rather cumbersome.
You can use the function below for your purpose.
Now use the function to compute.
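A minimal sketch of one such function, assuming the columns are named `y`, `nb`, and `V1` as in the question. `fill_nearest_year` and its arguments are hypothetical names, not from any package. It only loops over the distinct (neighborhood, year) pairs that have an NA, which should be far fewer than the 250k rows:

```r
# Hypothetical helper: fill NAs in `value` with the value from the same
# `group` (neighborhood) in the nearest `year` that has one.
fill_nearest_year <- function(df, year = "y", group = "nb", value = "V1") {
  # Distinct (neighborhood, year, value) rows where the value is known
  known <- unique(df[!is.na(df[[value]]), c(group, year, value)])
  missing_rows <- which(is.na(df[[value]]))

  # Only distinct (neighborhood, year) pairs need a lookup
  todo <- unique(df[missing_rows, c(group, year)])
  fill <- rep(NA_real_, nrow(todo))

  for (i in seq_len(nrow(todo))) {
    cand <- known[known[[group]] == todo[[group]][i], ]
    if (nrow(cand) > 0) {
      # On a tie, which.min() keeps the candidate that appears first
      nearest <- which.min(abs(cand[[year]] - todo[[year]][i]))
      fill[i] <- cand[[value]][nearest]
    }
  }

  # Map the looked-up values back onto the original rows
  key_rows <- paste(df[[group]], df[[year]])[missing_rows]
  key_todo <- paste(todo[[group]], todo[[year]])
  df[[value]][missing_rows] <- fill[match(key_rows, key_todo)]
  df
}

# Example data from the question
houses <- data.frame(
  y  = c(1993, 1993, 1994, 1994, 1995, 1996, 1996),
  nb = c(30000, 30000, 30000, 50000, 30000, 30000, 50000),
  V1 = c(2752, 2752, NA, 2554, NA, 2650, NA)
)

filled <- fill_nearest_year(houses)
filled$V1
# 2752 2752 2752 2554 2650 2650 2554
```

Neighborhoods with no value in any year stay NA. If two years are equally close, this sketch takes whichever appears first in the data, so sort by year first if you want the earlier year to win.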
R can easily handle such a loop, and I would suggest a plain for loop in this case.
Seriously, many people here say things like "there are 10min data, R can't handle it, etc. etc." R is not Excel; R was created to handle data.
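A rough sketch of what such a for loop could look like, assuming the column names `y`, `nb`, and `V1` from the question (the object names here are only illustrative). It visits just the rows where V1 is NA and works on a plain vector rather than assigning into the data frame on every iteration:

```r
# Example data from the question
houses <- data.frame(
  y  = c(1993, 1993, 1994, 1994, 1995, 1996, 1996),
  nb = c(30000, 30000, 30000, 50000, 30000, 30000, 50000),
  V1 = c(2752, 2752, NA, 2554, NA, 2650, NA)
)

# Rows with a known value; computed once, so filled-in values
# are never used as a source for other rows
known <- houses[!is.na(houses$V1), ]

v1 <- houses$V1
for (i in which(is.na(v1))) {
  cand <- known[known$nb == houses$nb[i], ]
  if (nrow(cand) == 0) next  # no value for this neighborhood in any year
  v1[i] <- cand$V1[which.min(abs(cand$y - houses$y[i]))]
}
houses$V1 <- v1

houses$V1
# 2752 2752 2752 2554 2650 2650 2554
```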