I am looking for a suitable R package to analyse: i) statistical differences in one or two response variables between sample sites/seasons, where the data are spatio-temporally correlated; ii) the separate effects of various parameters on a response variable where several of the predictors are strongly correlated but, I suspect, have significant individual effects.

Details:

I have a time series dataset (diurnal/seasonal) covering a wide range of climatic/stream parameters (air temperature, water temperature, sunlight intensity, discharge) and subsurface parameters (sediment gas evasion, sediment temperature, groundwater temperature, electrical conductivity). I am trying to determine which factors drive sediment gas evasion and sediment temperature. I suspect that temperature and organic content are the main drivers of gas evasion. But how do I separate the effects of air temperature, water temperature and radiation, and determine the contribution of each to sediment temperature, given that air temperature determines water temperature and both radiation and air temperature affect sediment and water temperature? In addition, each parameter has a lag time that varies with its intensity (from observation), the diurnal temperatures are clearly related to each other, and because the sampling sites lie downstream of one another they are likely spatially correlated as well. So...

i) How do I statistically prove diurnal/seasonal differences in the response variables?
ii) How do I determine the contribution of each predictor variable to my response parameter?

Thanks in advance for your ideas !

There are 1 best solutions below

In my opinion your question is misplaced, as it relates more to statistical modelling than to R and its packages.

i) There is no way to statistically "prove" this. At best there might be strong indications.

ii) To my knowledge there is no elegant and reliable way to do this. For a single dependent variable, the R package relaimpo provides one way to tackle such questions: https://cran.r-project.org/web/packages/relaimpo/index.html
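As a rough illustration of what relaimpo does, here is a minimal sketch on simulated data. The variable names and coefficients are made up and only stand in for your measurements. It assumes a plain linear model without lags or autocorrelation, which your real data will violate, so treat the output as descriptive only.

```r
# Simulated stand-in data; all names and coefficients are hypothetical.
set.seed(1)
n <- 200
air_temp   <- rnorm(n, mean = 15, sd = 5)
water_temp <- 0.8 * air_temp + rnorm(n, sd = 2)    # driven by air temperature
radiation  <- 20 * air_temp + rnorm(n, sd = 60)    # also tied to air temperature
sed_temp   <- 0.3 * air_temp + 0.5 * water_temp + 0.002 * radiation + rnorm(n)
dat <- data.frame(sed_temp, air_temp, water_temp, radiation)

# Pairwise correlations show how collinear the predictors are.
pred_cor <- cor(dat[, c("air_temp", "water_temp", "radiation")])
print(round(pred_cor, 2))

# Ordinary linear model with all three correlated predictors.
fit <- lm(sed_temp ~ air_temp + water_temp + radiation, data = dat)

# relaimpo is optional here; the block is skipped if it is not installed.
if (require(relaimpo, quietly = TRUE)) {
  # "lmg" averages each predictor's R^2 contribution over all orderings;
  # rela = TRUE rescales the shares so that they sum to 1.
  imp <- calc.relimp(fit, type = "lmg", rela = TRUE)
  print(imp@lmg)
}
```

The lmg shares describe how the explained variance is split among the predictors in this particular model. They do not show causal contributions, which is why I would not call the approach reliable for your question.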

What you are trying to solve sounds like a very tough problem that requires a deep understanding of both the methods used and the data at hand. Here is how I would approach the problem: start out much simpler. For a single sampling site, begin with e.g. correlation and covariance matrices. Then move on to lagged covariances, GLMs, and so on. Maybe check out canonical correlations, and maybe look at PCA. Most likely this will already give you a lot of information. Ultimately, to really find out how each of the variables impacts any other variable, you would need to perturb the system, e.g. change the water temperature and observe the effect on all other variables.
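The lagged-covariance step for one site can be sketched with base R's `ccf()`. The data below are simulated (hourly values, a hypothetical 3-hour delay between air and sediment temperature), so the names and numbers are assumptions and not your measurements.

```r
# Simulated hourly series for one site; names and the 3 h delay are made up.
set.seed(42)
hours    <- 0:(24 * 14 - 1)                        # two weeks of hourly data
air_temp <- 15 + 6 * sin(2 * pi * hours / 24) + rnorm(length(hours), sd = 0.5)

true_lag <- 3                                      # sediment follows air by 3 h
air_lagged <- c(rep(NA, true_lag), head(air_temp, -true_lag))
sed_temp <- 12 + 0.4 * air_lagged + rnorm(length(hours), sd = 0.2)

# Drop the first hours, where the lagged driver is undefined.
ok <- !is.na(sed_temp)

# ccf(x, y) estimates cor(x[t + k], y[t]) for each lag k, so a peak at a
# negative k means that x (air) leads y (sediment).
cc <- ccf(air_temp[ok], sed_temp[ok], lag.max = 12, plot = FALSE)
best_lag <- as.numeric(cc$lag[which.max(cc$acf)])
print(best_lag)
```

With strongly periodic data the cross-correlation is high at many lags, so on real series it usually helps to remove the diurnal and seasonal cycle (or to prewhiten) before reading off a lag. A single fixed lag also cannot capture a lag that changes with intensity.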

If you really want to use advanced modelling techniques with latent variable spaces and internal states, you could use something like a dynamic linear model (DLM). A tutorial on DLMs and state space modelling can be found here: http://helios.fmi.fi/~lainema/dlm/dlmtut.html. While the model in the tutorial has only one dependent variable, you can formulate the dependent time series as a matrix and vectorize the parameters if necessary. Have a look at structural equation modelling as well.
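To give an idea of what a DLM looks like in R, here is a minimal sketch of a regression with a time-varying coefficient using the CRAN package dlm. This is a different implementation from the one in the linked tutorial. The data are simulated, and the names, the single predictor and the random-walk assumption for the coefficients are all illustrative choices.

```r
# Simulated data: the effect of air temperature on the response drifts slowly.
set.seed(7)
n        <- 300
air_temp <- 15 + 6 * sin(2 * pi * seq_len(n) / 24) + rnorm(n)
slope    <- 0.5 + cumsum(rnorm(n, sd = 0.01))      # hypothetical drifting effect
y        <- 2 + slope * air_temp + rnorm(n, sd = 0.5)
X        <- matrix(air_temp, ncol = 1)

# dlm is optional here; the block is skipped if it is not installed.
if (requireNamespace("dlm", quietly = TRUE)) {
  # Regression DLM: intercept and slope both follow random walks.
  # Variances are parameterised on the log scale to keep them positive.
  build <- function(par) {
    dlm::dlmModReg(X, addInt = TRUE, dV = exp(par[1]), dW = exp(par[2:3]))
  }
  mle      <- dlm::dlmMLE(y, parm = rep(0, 3), build = build)
  mod      <- build(mle$par)
  smoothed <- dlm::dlmSmooth(y, mod)
  # Smoothed states: column 1 is the intercept, column 2 the slope over time.
  beta_t   <- dlm::dropFirst(smoothed$s)
  print(head(beta_t))
}
```

Further predictors would go in as extra columns of `X`, with one more entry in `dW` per column. With strongly correlated predictors the individual time-varying coefficients will still be poorly identified, so this does not remove the collinearity problem.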