I'm very new to R and am struggling to perform a differential expression analysis on a published dataset. I cannot use a package like DESeq2 or edgeR; it must be done by hand. My dataset covers 2 different biotypes of aphids (1 vs 4) feeding on 2 different subsets of soybeans (R vs S), with 3 replicates of each treatment combination (12 samples in total). I believe glm.nb would be a good fit because my data appear to be overdispersed. I don't need to identify specific genes; I just want to compare the effect of biotype on expression between R and S. Below are my code and a snippet of the data layout. My results look very different from every glm.nb example I've seen online, and I'm concerned I've made a mistake early on (maybe in how I organized the data).
library(ggplot2)
library(dplyr)
library(purrr)
library(MASS)
library(pscl)
library(sjPlot)
library(lmtest)
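# Read the count table: first column is the gene ID, the rest are samples
Aphid <- read.csv("Aphid.csv")
Aphid$Genes <- as.factor(Aphid$Genes)
head(Aphid)
str(Aphid)
# Convert every column except Genes to integer counts
Aphid[, -1] <- lapply(Aphid[, -1], as.integer)
head(Aphid)
str(Aphid)
# Compare the mean and variance of one sample to gauge overdispersion
summary(Aphid$X1S1.Exp)
var(Aphid$X1S1.Exp)
# Negative binomial GLM of one sample's counts on gene identity,
# with the iteration limit raised to 1000
m1 <- glm.nb(Aphid$X1S1.Exp ~ Aphid$Genes, control = glm.control(maxit = 1000))
summary(m1)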
Results from m1: https://i.stack.imgur.com/ipbRw.png
Layout in Excel (rows continue to 1000): https://i.stack.imgur.com/kgqo6.png
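Since the layout is wide (one row per gene, one column per sample), here is a minimal sketch of how the 2 x 2 design could be encoded in long format so that biotype and host enter the model as predictors. It runs on simulated counts, not on Aphid.csv. Only the column name X1S1.Exp appears above; the other eleven names and the "biotype, host, replicate" reading of that name are assumptions. The sketch also pools all genes as if they were replicate observations and skips any library-size adjustment, so it only illustrates the data organization, not a finished analysis.

```r
library(MASS)  # provides glm.nb

set.seed(1)

# Hypothetical sample names: biotype (1 or 4), host (R or S), replicate (1-3).
# Only "X1S1.Exp" is taken from the question; the pattern is assumed.
samples <- as.vector(outer(c("X1R", "X1S", "X4R", "X4S"), 1:3, paste0))
samples <- paste0(samples, ".Exp")

# Simulated wide table standing in for Aphid.csv (made-up counts, 50 genes).
n_genes <- 50
wide <- data.frame(Genes = paste0("gene", seq_len(n_genes)))
for (s in samples) {
  wide[[s]] <- rnbinom(n_genes, mu = 200, size = 2)
}

# Wide to long: one row per gene-sample pair.
long <- data.frame(
  Genes  = rep(wide$Genes, times = length(samples)),
  sample = rep(samples, each = n_genes),
  count  = unlist(wide[samples], use.names = FALSE)
)

# Recover the design factors from the sample name.
pattern <- "^X([14])([RS])([1-3])\\.Exp$"
long$biotype <- factor(sub(pattern, "\\1", long$sample))
long$host    <- factor(sub(pattern, "\\2", long$sample))

# Negative binomial GLM with a biotype-by-host interaction.
fit <- glm.nb(count ~ biotype * host, data = long)
summary(fit)
```

In this parameterization the biotype4:hostS coefficient is the term that asks whether the biotype effect differs between R and S.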
I've also tried running the same model without the maxit setting, and I get these warnings:
m1 <- glm.nb(Aphid$X1S1.Exp ~ Aphid$Genes)
There were 29 warnings (use warnings() to see them)
> warnings()
Warning messages:
1: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
2: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
3: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
4: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
5: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
6: glm.fit: algorithm did not converge
7: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
8: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
9: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
10: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
11: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
12: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
13: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
14: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
15: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
16: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
17: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
18: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
19: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
20: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
21: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
22: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
23: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
24: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
25: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
26: glm.fit: algorithm did not converge
27: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
28: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
29: In glm.nb(Aphid$X1S1.Exp ~ Aphid$Genes) : alternation limit reached
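One thing that may be worth checking alongside these warnings is how many coefficients the formula asks for compared with how many observations there are. The sketch below uses a toy data frame (made-up counts, hypothetical gene names) and assumes, as the Excel layout suggests, that each gene appears in exactly one row. It builds the design matrix for a formula of the form count ~ Genes and compares its dimensions.

```r
# Toy data: one row per gene, mirroring the one-row-per-gene layout (made-up counts).
set.seed(1)
toy <- data.frame(
  Genes = factor(paste0("gene", 1:20)),
  count = rnbinom(20, mu = 200, size = 2)
)

# Design matrix implied by count ~ Genes:
# an intercept plus one dummy column for every gene after the first.
X <- model.matrix(~ Genes, data = toy)

n_obs    <- nrow(toy)        # observations available
n_coef   <- ncol(X)          # mean parameters requested
resid_df <- n_obs - n_coef   # what is left over for estimating theta

cat("observations:", n_obs, " coefficients:", n_coef,
    " residual df:", resid_df, "\n")
```

With one row per gene, the number of coefficients equals the number of observations, so no residual degrees of freedom remain. If the real data have the same structure, that would be consistent with the theta and alternation limits being reached regardless of maxit.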