I am currently working on social network data with the R ergm package. I want to estimate the conditional probability of a tie that is homophilic on two different variables, but the results differ slightly depending on how I specify the model.
In the first case, I put two nodematch terms in my model, one for each variable that interests me, and I find the conditional log-odds of a doubly homophilic tie by summing the 3 coefficients of my model (the "edges" term and the two nodematch terms).
In the second case, I specify only one nodematch term, for ties that are homophilic on both variables.
The results I get, though close, are still different, while in both cases I should get the log-odds of a tie occurring between individuals sharing both of these attributes.
Here is an example from the Sampson data:
# Load the data :
library(statnet)
data(sampson)
#First model: I specify two nodematch terms, one for 'cloisterville' and one for 'group'.
m1 <- ergm(samplike ~ edges + nodematch('cloisterville') + nodematch('group'))
#Second model: this time, I have only one term asking for a `nodematch` on both terms at the same time.
m2 <- ergm(samplike ~ edges + nodematch(c('cloisterville','group')))
#Here is the output of both models:
summary(m1)
summary(m2)
So according to the first model, the conditional log-odds of a tie that is homophilic on both variables should be:
-2.250 + 0.586 + 2.389
That is, 0.725
However, according to the second model, the log-odds of this same doubly homophilic tie should be:
-1.856 + 2.659
That is, 0.803
The corresponding probabilities are 0.6737071 and 0.6906158.
Do you know why the results differ between the two cases, when both should give the same conditional probability for the same kind of tie?
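For reference, here is a minimal sketch of how these probabilities follow from the log-odds. It uses only base R and the rounded coefficients quoted above; the variable names are just illustrative. With the fitted objects at hand, `coef(m1)` and `coef(m2)` should give the unrounded estimates instead.

```r
# Log-odds built from the rounded coefficients quoted above
logodds_m1 <- -2.250 + 0.586 + 2.389  # edges + nodematch.cloisterville + nodematch.group
logodds_m2 <- -1.856 + 2.659          # edges + joint nodematch term

# plogis() is the inverse logit: exp(x) / (1 + exp(x))
p_m1 <- plogis(logodds_m1)
p_m2 <- plogis(logodds_m2)

print(c(m1 = p_m1, m2 = p_m2))
```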
Thank you so much for your help,
Kind regards
Timothée
We should not expect the same results, since the models are evaluating two different things. In essence, model 1 is evaluating homophily on `cloisterville` or on `group`, while model 2 is evaluating homophily on both `cloisterville` and `group`.
To be more precise, the first model tests homophily on `group`, net of the tendency toward homophily on `cloisterville`, and vice versa. The second model looks at whether there is a tendency toward homophily on both attributes at the same time. Do monks form ties within groups and based on their location in the cloisters?
See the note in `?ergm.terms` for `nodematch`.
This is easy to see visually in a plot of the network. The colors are groups. Squares mean `cloisterville==TRUE` and triangles mean `cloisterville==FALSE`. The term `nodematch(c('cloisterville','group'))` counts only those edges where colors and shapes match!
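To see why the two sums need not agree, note that both models contain only dyad-independent terms, so each is equivalent to a logistic regression over dyads. The sketch below uses plain `glm()` on made-up dyad counts, not the Sampson data; the names `match_a`, `match_b` and `match_both` and all the counts are hypothetical. The model with a single joint-match indicator reproduces the observed tie proportion among doubly matching dyads, while the additive model generally does not, because it has three parameters for four kinds of dyad.

```r
# Hypothetical dyad counts by match status (NOT the Sampson data)
cells <- data.frame(
  match_a = c(0, 1, 0, 1),      # dyad matches on attribute a
  match_b = c(0, 0, 1, 1),      # dyad matches on attribute b
  ties    = c(10, 12, 40, 30),  # dyads with a tie
  dyads   = c(100, 60, 80, 40)  # all dyads of this kind
)
cells$match_both <- cells$match_a * cells$match_b

# Analogue of edges + nodematch(a) + nodematch(b)
fit_additive <- glm(cbind(ties, dyads - ties) ~ match_a + match_b,
                    family = binomial, data = cells)

# Analogue of edges + nodematch(c(a, b))
fit_joint <- glm(cbind(ties, dyads - ties) ~ match_both,
                 family = binomial, data = cells)

# Tie probability for a doubly matching dyad: sum all coefficients, then inverse logit
p_additive <- plogis(sum(coef(fit_additive)))
p_joint    <- plogis(sum(coef(fit_joint)))
observed   <- with(cells, ties[match_both == 1] / dyads[match_both == 1])

print(c(additive = p_additive, joint = p_joint, observed = observed))
```

The two fitted probabilities coincide only when the additive model happens to fit the doubly matching dyads exactly, that is, when there is no interaction between the two kinds of match on the log-odds scale.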