I am working on getting my PageRank values from igraph in R to match those I get from Gephi. I have followed this example: https://www.briggsby.com/personalized-pagerank and my igraph values match the weighted values this example has. But Gephi produces a different value for weighted PageRank and I'm unsure why. When I run this as an unweighted PageRank, I get the same results between igraph and Gephi.
The network I'm importing is deliberately simple, so that the math is easy to check by hand:
| Source | Target | Weight |
|---|---|---|
| A | B | 1.0 |
| B | C | 1.0 |
| C | B | 1.0 |
| C | A | 0.5 |
| A | C | 1.0 |
| C | D | 0.1 |
| D | A | 0.5 |
The code I'm using is as follows:
```r
library(igraph)
# TestPageRank is the imported edge list (columns Source, Target, Weight)
mydf <- data.frame(from = TestPageRank$Source, to = TestPageRank$Target)
mygraph <- graph_from_data_frame(mydf, directed = TRUE)
# Weighted PageRank and degree for every node
pr_df <- data.frame(
  users = V(mygraph)$name,
  page_rank = page_rank(mygraph, directed = TRUE, damping = 0.85,
                        weights = TestPageRank$Weight)$vector,
  degree = degree(mygraph)
)
```
The PageRanks I'm returning are as follows:
| Node | igraph Weighted PageRank | Gephi Weighted PageRank |
|---|---|---|
| A | 0.1960 | 0.2373 |
| B | 0.3373 | 0.2761 |
| C | 0.4075 | 0.3732 |
| D | 0.0591 | 0.1133 |
In this example, the ranking is at least the same, but when I apply this to my larger networks with thousands of nodes, the node ranking by PageRank is very different. Any thoughts on why this might be? Or how I can modify my R code to match the Gephi PageRank values?
Here's the updated code with import:
```r
library(igraph)

df <- structure(list(Source = c("A", "B", "C", "C", "A", "C", "D"),
                     Target = c("B", "C", "B", "A", "C", "D", "A"),
                     Weight = c(1, 1, 1, 0.5, 1, 0.1, 0.5)),
                class = "data.frame", row.names = c(NA, -7L))
g <- graph_from_data_frame(df)
page_rank(g, weights = E(g)$Weight, directed = TRUE, damping = 0.85)$vector
degree(g)
```
And the output from the above:
```
         A          B          C          D
0.19602465 0.33730560 0.40752024 0.05914951
```
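As a sanity check that does not depend on igraph, the weighted PageRank of this small network can also be computed with a plain power iteration in base R. This is only a minimal sketch under a few assumptions: the variable names (`edges`, `W`, `P`, `pr`) are made up for illustration, every node has at least one outgoing edge (true here, so dangling nodes are not handled), and teleportation is uniform over all nodes.

```r
# Edge list from the table above (hypothetical variable names)
edges <- data.frame(
  Source = c("A", "B", "C", "C", "A", "C", "D"),
  Target = c("B", "C", "B", "A", "C", "D", "A"),
  Weight = c(1, 1, 1, 0.5, 1, 0.1, 0.5),
  stringsAsFactors = FALSE
)

nodes <- sort(unique(c(edges$Source, edges$Target)))
n <- length(nodes)

# Weighted adjacency matrix: W[i, j] is the weight of the edge i -> j
W <- matrix(0, n, n, dimnames = list(nodes, nodes))
W[cbind(edges$Source, edges$Target)] <- edges$Weight

# Transition probabilities: each row divided by the node's total out-weight
# (assumes no node has zero out-weight)
P <- W / rowSums(W)

damping <- 0.85
pr <- rep(1 / n, n)

# Power iteration: repeat the PageRank update until the values stop changing
for (iter in seq_len(1000)) {
  pr_new <- (1 - damping) / n + damping * as.vector(pr %*% P)
  delta <- max(abs(pr_new - pr))
  pr <- pr_new
  if (delta < 1e-12) break
}
names(pr) <- nodes

round(pr, 4)
# A 0.1960, B 0.3373, C 0.4075, D 0.0591
```

These values agree with the igraph column in the table above, so igraph appears to follow the usual weighted definition, in which a node passes its rank to its out-neighbours in proportion to edge weight.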
I am not able to reproduce your results with igraph. Please provide a minimal reproducible example, with copyable code. You will find guidance here.
Here is your datafile as copyable CSV:
We get this after using `read.csv`:
Using the ARPACK method, which is an entirely distinct algorithm, we get the same:
These numbers differ from what you quote, but I cannot tell why without a reproducible example.
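For reference, a comparison of this kind can be sketched as follows. This is a reconstruction under assumptions, not necessarily the exact code used for the results above: the CSV text is rebuilt from the table in the question, the variable names (`csv_text`, `edges`, `pr_prpack`, `pr_arpack`) are hypothetical, and it assumes an igraph version in which `page_rank()` accepts `algo = "prpack"` and `algo = "arpack"` and both solvers honour the `weights` argument.

```r
library(igraph)

# Edge list from the question, written as inline CSV
csv_text <- "Source,Target,Weight
A,B,1.0
B,C,1.0
C,B,1.0
C,A,0.5
A,C,1.0
C,D,0.1
D,A,0.5"

edges <- read.csv(text = csv_text, stringsAsFactors = FALSE)
g <- graph_from_data_frame(edges, directed = TRUE)

# Same graph and weights, two different solvers
pr_prpack <- page_rank(g, algo = "prpack", damping = 0.85,
                       weights = E(g)$Weight)$vector
pr_arpack <- page_rank(g, algo = "arpack", damping = 0.85,
                       weights = E(g)$Weight)$vector

# TRUE when both solvers agree within a small numerical tolerance
isTRUE(all.equal(pr_prpack, pr_arpack, tolerance = 1e-6))
```

The weights are passed explicitly because the column is named `Weight`; igraph only picks up an edge attribute automatically when it is called `weight`.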
I should note that I worked on igraph's PageRank code and I believe that it is exceedingly unlikely that it would give incorrect results.