Computing the Everett-Valente Brokerage Score in TidyGraph

217 Views Asked by At

I want to compute the Everett-Valente Brokerage Score for each node in my directed network (Everett and Valente 2016). This score is based on betweenness centrality. Essentially, this controls for network size. The ability of a broker to control information/resource flows is moderated by network size and/or tie-redundancy. For an undirected graph, the Everett - Valente Brokerage Score is calculated thus:

  1. Compute node betweenness centrality.
  2. Double the computed betweenness centrality for each node and add (n - 1) to every non-pendant entry.
  3. Divide each non-zero score by the degree of the node.

I plan to use if_else statements to deal with non-pendant and zero scores e.g.

g <- g %>%
activate(nodes) %>%
mutate(betweenness = centrality_betweenness(),
       ev_brokerage = if_else(..if_else(..)..))

I do not know how to implement the ev_brokerage (the conditional statements). To extend this to the directed case, Everett and Valente (2016) provide the following rules:

For in-EV brokerage:

  1. Compute node betweenness centrality for v.
  2. If node betweenness centrality = 0 add j, where j = number of vertices that can reach v.
  3. Divide each non-zero sum by the in-degree of v.

For out-EV brokerage:

  1. Compute node betweenness centrality for v.
  2. If node betweenness centrality = 0 add k, where k = number of vertices that v can reach.
  3. Divide each non-zero sum by the out-degree of v.

EV brokerage of v = average of in-EV and out-EV.

If somebody can help me with the mutate() statement, I would be grateful. I would like to know how I can work out j and k in the directed case, and figure out non-pendant nodes in the undirected case.

2

There are 2 best solutions below

0
On BEST ANSWER

This would be a lot simpler to reason about (and generalize) if you just made it into a standalone function that calculated scores for an igraph object. Then it can be adapted to something tidygraph-friendly.

suppressPackageStartupMessages(library(tidygraph))
if_else <- dplyr::if_else
case_when <- dplyr::case_when
map2_dbl <- purrr::map2_dbl

It's fairly straightforward with undirected graphs as you don't need to nest any control flow.

create_notable("Zachary") %>% 
  mutate(pendant = centrality_degree() == 1,               # is a node a pendant? 
         btwn = centrality_betweenness()) %>%              # raw betweenness
  mutate(ev_step1 = if_else(pendant,                        # if it's a pendant...
                            btwn * 2,                          # double betweenness...
                            btwn * 2 + (graph_order() - 1)),   # else double it AND subtract n (nodes) - 1
         ev_brok = if_else(ev_step1 == 0,                   # if it's 0...
                           ev_step1,                        # leave it as is...
                           ev_step1 / centrality_degree())  # else divide it by raw degree
         ) %>% 
  select(ev_brok, btwn, pendant)

#> # A tbl_graph: 34 nodes and 78 edges
#> #
#> # An undirected simple graph with 1 component
#> #
#> # Node Data: 34 x 3 (active)
#>   ev_brok    btwn pendant
#>     <dbl>   <dbl> <lgl>  
#> 1   30.9  231.    FALSE  
#> 2   10.00  28.5   FALSE  
#> 3   18.5   75.9   FALSE  
#> 4    7.60   6.29  FALSE  
#> 5   11.2    0.333 FALSE  
#> 6   16.2   15.8   FALSE  
#> # ... with 28 more rows
#> #
#> # Edge Data: 78 x 2
#>    from    to
#>   <int> <int>
#> 1     1     2
#> 2     1     3
#> 3     1     4
#> # ... with 75 more rows

Here's an example directed graph...

(g <- matrix(c(1, 2,
              1, 3,
              3, 4, 
              4, 1,
              2, 5,
              5, 6,   # 6 is pendant with in-tie
              7, 2,   # 7 is pendant with out-ie
              4, 8,   # 8 is pendant with in-tie
              9, 10, 
              10, 11,
              11, 12, # 12 is a pendant with in-tie
              11, 13,
              9, 13),
            ncol = 2, byrow = TRUE) %>% 
  igraph::graph_from_edgelist()) %>% plot()

Rather than nesting ifelse()s inside each other, you can wrap them with dplyr::case_when() (but it still should go inside a proper function that can be tested and verified).

(
res <- g %>%
  as_tbl_graph() %>% 
  mutate(btwn = centrality_betweenness(),
         in_reach = local_size(order = graph_order(), mode = "in") - 1, # reach being max. ego graph order - 1 for ego
         out_reach = local_size(order = graph_order(), mode = "out") - 1,
         in_deg = centrality_degree(mode = "in"),
         out_deg = centrality_degree(mode = "out")) %>% 
  mutate(ev_in = case_when(
    btwn == 0 ~ if_else(btwn + in_reach == 0,       # if btwn is 0 and if btwn + in_reach is 0
                       btwn + in_reach,             # then btwn + in_reach (0)
                       (btwn + in_reach) / in_deg), # else add btwn and in_reach, then divide by in_deg
    btwn != 0 ~ btwn / in_deg
    )) %>% 
  mutate(ev_out = case_when(
    btwn == 0 ~ if_else(btwn + out_reach == 0, 
                        btwn + out_reach, 
                        (btwn + out_reach) / out_deg),
    btwn != 0 ~ btwn / out_deg
    )) %>% 
    mutate(ev_brok = map2_dbl(ev_in, ev_out, ~ mean(c(.x, .y)))) %>% 
  select(ev_brok, starts_with("ev_"), btwn, everything())
)
#> # A tbl_graph: 13 nodes and 13 edges
#> #
#> # A directed simple graph with 2 components
#> #
#> # Node Data: 13 x 8 (active)
#>   ev_brok ev_in ev_out  btwn in_reach out_reach in_deg out_deg
#>     <dbl> <dbl>  <dbl> <dbl>    <dbl>     <dbl>  <dbl>   <dbl>
#> 1    5.25     7    3.5     7        2         6      1       2
#> 2    6        4    8       8        4         2      2       1
#> 3    2        2    2       2        2         6      1       1
#> 4    4.5      6    3       6        2         6      1       2
#> 5    5        5    5       5        5         1      1       1
#> 6    3        6    0       0        6         0      1       0
#> # ... with 7 more rows
#> #
#> # Edge Data: 13 x 2
#>    from    to
#>   <int> <int>
#> 1     1     2
#> 2     1     3
#> 3     3     4
#> # ... with 10 more rows

Here's the full table to check the math:

res %>% as_tibble()

#> # A tibble: 13 x 8
#>    ev_brok ev_in ev_out  btwn in_reach out_reach in_deg out_deg
#>      <dbl> <dbl>  <dbl> <dbl>    <dbl>     <dbl>  <dbl>   <dbl>
#>  1    5.25   7      3.5     7        2         6      1       2
#>  2    6      4      8       8        4         2      2       1
#>  3    2      2      2       2        2         6      1       1
#>  4    4.5    6      3       6        2         6      1       2
#>  5    5      5      5       5        5         1      1       1
#>  6    3      6      0       0        6         0      1       0
#>  7    1.5    0      3       0        0         3      0       1
#>  8    1.5    3      0       0        3         0      1       0
#>  9    1      0      2       0        0         4      0       2
#> 10    2      2      2       2        1         3      1       1
#> 11    2.25   3      1.5     3        2         2      1       2
#> 12    1.5    3      0       0        3         0      1       0
#> 13    0.75   1.5    0       0        3         0      2       0
0
On

After checking against the campnet example presented in Everett and Valente (2016), the EV brokerage score for directed networks can calculated as thus:

g <- g %>%
  activate(nodes) %>%
  # compute in-degree, out-degree, and betweenness centrality 
  mutate(betweenness = centrality_betweenness(),
         in_degree = centrality_degree(mode = "in"),
         out_degree = centrality_degree(mode = "out"),
         in_reach = local_size(order = graph_order(), mode = "in") - 1,
         out_reach = local_size(order = graph_order(), mode = "out") - 1) %>%
  # compute everett-valente brokerage score
  mutate(ev_in = if_else(betweenness != 0, betweenness + in_reach, betweenness),
         ev_in = if_else(ev_in != 0, ev_in / in_degree, ev_in),
         ev_out = if_else(betweenness != 0, betweenness + out_reach, betweenness),
         ev_out = if_else(ev_out != 0, ev_out / out_degree, ev_out),
         ev_brokerage = (ev_in + ev_out) / 2) 

Using Granovetter's (1973) hypothetical undirected network presented in Everett and Valente (2016), the EV brokerage score can be calculated as thus:

edgelist <- data.frame(from = c(1,1,1,2,2,2,3,3,3,3,4,4,4,4,5,5,5,6,6,6,7,7,8,8,8,8,9,
                                     9,10,10,10,11,11,11,11,11,12,12,12,13,13,13,13,14,14,
                                     14,14,15,15,15,16,16,17,17,17,18,18,18,18,19,19,20,20,
                                     20,20,20,21,21,22,22,22,23,23,23,24,24,24,25,25,25,25),
                            to = c(2,3,24,1,3,4,1,2,4,5,2,3,5,6,3,4,6,5,5,7,6,8,9,10,11,
                                   14,8,10,9,8,11,10,8,12,14,13,11,14,13,11,12,14,15,8,11,
                                   12,13,13,16,17,15,17,15,16,18,17,19,20,21,18,20,19,18,
                                   21,25,22,18,20,20,25,23,24,25,22,1,25,23,24,23,22,20))

g <- igraph::graph_from_edgelist(as.matrix(edgelist), directed = F) %>% simplify()

g <- as_tbl_graph(g) %>%
  activate(nodes) %>%
  # compute brokerage
  mutate(betweenness = centrality_betweenness(),
        degree = centrality_degree(),
        ev_condition = if_else(betweenness != 0, betweenness * 2 + graph_order() - 1, betweenness),
     ev_brokerage = if_else(ev_condition != 0, ev_condition / degree, ev_condition))

data <- g %>% as.tibble()

I have not normalised the EV brokerage score as per Everett and Valente (2016).