I am new to Stan and I'm struggling to understand how the different variable declaration styles are used. In particular, I am confused about when I should put square brackets after the variable type and when I should put them after the variable name. For example, given
int<lower = 0> L; // length of my data
let's consider:
real N[L]; // my variable
versus
vector[L] N; // my variable
From what I understand, both declare a variable N as a vector of length L.
Is the only difference between the two that the first way specifies the variable type? Can they be used interchangeably? Should they belong to different parts of the Stan code (e.g., data vs. parameters or model)?
Thanks for explaining!
real name[size] and vector[size] name can be used pretty much interchangeably. They are stored differently internally, so you can get better performance with one or the other. Some operations are also restricted to one or the other (e.g. vector multiplication), and the optimal order to loop over them changes. E.g. a matrix is stored in column-major order and a 2-D array in row-major order, so it is more efficient to put the column loop outermost for a matrix and the row loop outermost for a 2-D array, but those details will come up if you have a more specific example.
The way to read this is:
real name[size];
means name is an array of type real, so a bunch of reals that are stored together.
vector[size] name;
means that name is a vector of size size, which is also a bunch of reals stored together. But the vector data type in Stan is based on the Eigen C++ library, and that allows for other operations.
You can also create arrays of vectors like this:
vector[N] name[K];
which is going to produce an array of K vectors of size N.
Bottom line: You can get any model running using either vector or real, but they're not necessarily equivalent in computational efficiency.
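To make the difference concrete, here is a minimal sketch of a small regression that uses both containers. The names y_arr, x, alpha, beta and sigma are made up for illustration and are not from the question. It also assumes a Stan release from 2.26 onward, where an array is declared as array[L] real y_arr; the comments show the equivalent declaration in the style used in this thread.

```stan
data {
  int<lower=0> L;        // length of the data
  array[L] real y_arr;   // array of reals (thread's style: real y_arr[L];)
  vector[L] x;           // vector of length L
}
parameters {
  real alpha;            // hypothetical intercept
  real beta;             // hypothetical slope
  real<lower=0> sigma;   // hypothetical noise scale
}
model {
  // Arithmetic such as scalar * vector needs a vector type;
  // the same expression with a real array would not compile.
  vector[L] mu = alpha + beta * x;

  // Vectorized distributions accept either container,
  // so the array can be used directly on the left-hand side.
  y_arr ~ normal(mu, sigma);
}
generated quantities {
  // Convert between the two containers when one form is required.
  vector[L] y_vec = to_vector(y_arr);
  array[L] real x_arr = to_array_1d(x);
}
```

So the block (data, parameters, model) does not decide which style to use. What matters is whether you need linear-algebra operations on the variable (use vector) or just a container of values (an array is fine).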