I'm trying to run the lm() regression function on the demographic table below, which is stored in a data frame:
X HL.both NHL.B.both NHL.W.both HL.F NHL.B.F NHL.W.F HL.M NHL.B.M NHL.W.M HL.none NHL.B.none
1 n 12 14 409 16 20 569 58 53 1576 50 66
2 Age (yrs) (mean (SD)) 69.06 (2.75) 69.57 (3.52) 69.67 (3.97) 71.72 (4.48) 72.52 (5.67) 70.75 (4.57) 70.61 (4.27) 69.45 (4.12) 70.63 (4.32) 72.82 (4.85) 71.62 (5.28)
3 Sex = Female (%) 7 (58.3) 8 (57.1) 233 (57.0) 9 (56.2) 15 (75.0) 297 (52.2) 36 (62.1) 42 (79.2) 1062 (67.4) 32 (64.0) 43 (65.2)
4 Education (yrs) (mean (SD)) 16.50 (2.15) 17.57 (2.50) 16.74 (2.86) 15.25 (2.74) 15.75 (2.38) 17.03 (2.71) 16.03 (3.00) 16.08 (2.61) 16.55 (2.75) 15.52 (3.25) 16.23 (3.07)
5 APOEgen (%)
6 e2+ 1 (8.3) 1 (7.1) 44 (10.8) 0 (0.0) 6 (30.0) 57 (10.0) 3 (5.2) 8 (15.1) 129 (8.2) 4 (8.0) 16 (24.2)
7 e2e4 1 (8.3) 2 (14.3) 6 (1.5) 1 (6.2) 1 (5.0) 17 (3.0) 2 (3.4) 2 (3.8) 38 (2.4) 2 (4.0) 3 (4.5)
8 e3e3 8 (66.7) 8 (57.1) 183 (44.7) 8 (50.0) 9 (45.0) 288 (50.6) 35 (60.3) 24 (45.3) 821 (52.1) 36 (72.0) 29 (43.9)
9 e4+ 2 (16.7) 3 (21.4) 176 (43.0) 7 (43.8) 4 (20.0) 207 (36.4) 18 (31.0) 19 (35.8) 588 (37.3) 8 (16.0) 18 (27.3)
10 PET SUVr (mean (SD)) 1.04 (0.14) 1.06 (0.10) 1.12 (0.20) 1.11 (0.17) 1.03 (0.07) 1.09 (0.18) 1.09 (0.19) 1.06 (0.15) 1.10 (0.19) 1.10 (0.20) 1.03 (0.14)
11 Amyloid eligibility = positive (%) 3 (25.0) 4 (28.6) 152 (37.2) 5 (31.2) 1 (5.0) 157 (27.6) 13 (22.4) 12 (22.6) 496 (31.5) 17 (34.0) 14 (21.2)
12 MMSE (mean (SD)) 28.50 (1.51) 29.21 (1.12) 28.98 (1.14) 28.06 (1.98) 28.65 (1.35) 28.85 (1.17) 28.53 (1.27) 28.58 (1.49) 28.93 (1.14) 28.16 (1.36) 28.47 (1.19)
13 PACC (mean (SD)) -0.53 (2.48) -0.14 (2.18) 0.45 (2.36) -1.74 (2.99) -1.22 (3.09) 0.11 (2.52) -0.56 (2.56) -0.61 (2.66) 0.31 (2.50) -1.33 (2.38) -1.37 (2.29)
NHL.W.none p test
1 1309 NA
2 72.76 (4.96) <0.001 NA
3 714 (54.5) <0.001 NA
4 16.59 (2.95) <0.001 NA
5 <0.001 NA
6 178 (13.6) NA
7 36 (2.8) NA
8 784 (59.9) NA
9 311 (23.8) NA
10 1.09 (0.20) 0.007 NA
11 371 (28.3) 0.005 NA
12 28.75 (1.22) <0.001 NA
13 -0.22 (2.51) <0.001 NA
But every time I try to run the lm() regression function on it, I get the error:
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
NA/NaN/Inf in 'y'
In addition: Warning message:
In storage.mode(v) <- "double" : NAs introduced by coercion
>
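A minimal sketch of how this error can arise, assuming the table cells are stored as character strings such as "69.06 (2.75)" rather than as numbers. The two vectors below are made-up stand-ins that reuse the first three values of the age and PET SUVr rows; they are not the real data frame.

```r
# Hypothetical stand-ins for two rows of the table (first three groups only).
age <- c("69.06 (2.75)", "69.57 (3.52)", "69.67 (3.97)")
ab  <- c("1.04 (0.14)", "1.06 (0.10)", "1.12 (0.20)")

# Each cell is text of the form "mean (SD)", so coercing it to a number
# gives NA (with the "NAs introduced by coercion" warning).
age_num <- suppressWarnings(as.numeric(age))
print(age_num)  # NA NA NA

# lm() applies the same coercion to a character response, so the response
# becomes all NA and the fit fails. try() is used here only to capture the error.
fit <- suppressWarnings(try(lm(age ~ ab), silent = TRUE))
print(inherits(fit, "try-error"))
```

If this is what is happening, the problem is the "mean (SD)" text in each cell, not the orientation of the table.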
I've tried transposing the table and calling the variables that way, but it still doesn't work:
variable e2. e2e4 e3e3 e4. age Amyloid.eligibility...positive.... Education..yrs...mean..SD.. MMSE..mean..SD.. n PACC..mean..SD..
1 HL.both 1 (8.3) 1 (8.3) 8 (66.7) 2 (16.7) 69.06 (2.75) 3 (25.0) 16.50 (2.15) 28.50 (1.51) 12 -0.53 (2.48)
2 NHL.B.both 1 (7.1) 2 (14.3) 8 (57.1) 3 (21.4) 69.57 (3.52) 4 (28.6) 17.57 (2.50) 29.21 (1.12) 14 -0.14 (2.18)
3 NHL.W.both 44 (10.8) 6 (1.5) 183 (44.7) 176 (43.0) 69.67 (3.97) 152 (37.2) 16.74 (2.86) 28.98 (1.14) 409 0.45 (2.36)
4 HL.F 0 (0.0) 1 (6.2) 8 (50.0) 7 (43.8) 71.72 (4.48) 5 (31.2) 15.25 (2.74) 28.06 (1.98) 16 -1.74 (2.99)
5 NHL.B.F 6 (30.0) 1 (5.0) 9 (45.0) 4 (20.0) 72.52 (5.67) 1 (5.0) 15.75 (2.38) 28.65 (1.35) 20 -1.22 (3.09)
6 NHL.W.F 57 (10.0) 17 (3.0) 288 (50.6) 207 (36.4) 70.75 (4.57) 157 (27.6) 17.03 (2.71) 28.85 (1.17) 569 0.11 (2.52)
7 HL.M 3 (5.2) 2 (3.4) 35 (60.3) 18 (31.0) 70.61 (4.27) 13 (22.4) 16.03 (3.00) 28.53 (1.27) 58 -0.56 (2.56)
8 NHL.B.M 8 (15.1) 2 (3.8) 24 (45.3) 19 (35.8) 69.45 (4.12) 12 (22.6) 16.08 (2.61) 28.58 (1.49) 53 -0.61 (2.66)
9 NHL.W.M 129 (8.2) 38 (2.4) 821 (52.1) 588 (37.3) 70.63 (4.32) 496 (31.5) 16.55 (2.75) 28.93 (1.14) 1576 0.31 (2.50)
10 HL.none 4 (8.0) 2 (4.0) 36 (72.0) 8 (16.0) 72.82 (4.85) 17 (34.0) 15.52 (3.25) 28.16 (1.36) 50 -1.33 (2.38)
11 NHL.B.none 16 (24.2) 3 (4.5) 29 (43.9) 18 (27.3) 71.62 (5.28) 14 (21.2) 16.23 (3.07) 28.47 (1.19) 66 -1.37 (2.29)
12 NHL.W.none 178 (13.6) 36 (2.8) 784 (59.9) 311 (23.8) 72.76 (4.96) 371 (28.3) 16.59 (2.95) 28.75 (1.22) 1309 -0.22 (2.51)
13 p <0.001 0.01 <0.001 <0.001 NA <0.001
PET.SUVr..mean..SD.. Sex...Female....
1 1.04 (0.14) 7 (58.3)
2 1.06 (0.10) 8 (57.1)
3 1.12 (0.20) 233 (57.0)
4 1.11 (0.17) 9 (56.2)
5 1.03 (0.07) 15 (75.0)
6 1.09 (0.18) 297 (52.2)
7 1.09 (0.19) 36 (62.1)
8 1.06 (0.15) 42 (79.2)
9 1.10 (0.19) 1062 (67.4)
10 1.10 (0.20) 32 (64.0)
11 1.03 (0.14) 43 (65.2)
12 1.09 (0.20) 714 (54.5)
13 0.01 <0.001
But that still returns the same error. I've also tried:
# Attempt 1: transpose the demog table so the variables I want to compare run across the x axis (columns) and the race categories run down the y axis (rows)
require(data.table)
# transpose the demog table so it is readable by the lm function
ERxFH_df_t <- dcast(melt(ERxFH_df, id.vars = "X"), variable ~ X)
colnames(ERxFH_df_t)
#cols.num <- c("HL.both","NHL.B.both","NHL.W.both","HL.F","NHL.B.F","NHL.W.F","HL.M","NHL.B.M N","HL.W.M","HL.none","NHL.B.none","NHL.W.none","p" )
#ERxFH_df[cols.num] <- sapply(ERxFH_df[cols.num],as.numeric)
#sapply(ERxFH_df, class)
#ERxFH_df %>% mutate_if(is.character,as.numeric) #ruined table :(
#rename columns:
names(ERxFH_df_t)[names(ERxFH_df_t) == "Age (yrs) (mean (SD))" ] <- "age"
names(ERxFH_df_t)[names(ERxFH_df_t) == "APOEgen (%)"] <- "APOEgen"
names(ERxFH_df_t)[names(ERxFH_df_t) == "PET.SUVr..mean..SD.."] <- "AB"
ERxFH_df_t
colnames(ERxFH_df_t)
ERxFH_df_t = ERxFH_df_t[-14,]
ERxFH_df_t
write.csv(ERxFH_df_t, "/Users/phe/Library/CloudStorage/OneDrive-UCLAITServices/UCLA/Deters lab/Family history and cog decline/ERxFH_df_t.csv", row.names=FALSE)
sapply(ERxFH_df_t, class)
#change to numbers in excel
ERxFH_df_t <- read.csv("/Users/phe/Library/CloudStorage/OneDrive-UCLAITServices/UCLA/Deters lab/Family history and cog decline/ERxFH_df_t_nopara.csv")
#ERxFH_df_t %>% mutate(across(where(is.character), as.numeric))
#^^ changed all my values to NA
ERxFH_df_t
#removing APOEgen column since it's empty
ERxFH_df_t <- subset(ERxFH_df_t, select = -APOEgen)
ERxFH_df_t <- na.omit(ERxFH_df_t)
ERxFH_df_t.lm <- lm(formula = age ~ AB,data = ERxFH_df_t)
# still doesn't work
ERxFH_df_t
And I'm still met with the same error:
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
NA/NaN/Inf in 'y'
In addition: Warning message:
In storage.mode(v) <- "double" : NAs introduced by coercion
I'm not sure what to try next!
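One direction that might be worth trying is sketched below, assuming the cells are character strings of the form "mean (SD)" or "n (%)". The helper `first_number()` is hypothetical (it is not from any package), and the small `demo` data frame is a made-up stand-in that reuses a few values from the transposed table, with the column names `age` and `AB` taken from the renaming step in the script.

```r
# Hypothetical helper: keep only the leading number of strings such as
# "69.06 (2.75)" or "-0.53 (2.48)" and convert it to numeric.
# Strings without a leading number (e.g. "<0.001") become NA with a warning.
first_number <- function(x) {
  as.numeric(sub("^[[:space:]]*(-?[0-9.]+).*$", "\\1", x))
}

# Made-up stand-in for a few rows of the transposed table.
demo <- data.frame(
  variable = c("HL.both", "NHL.B.both", "NHL.W.both", "HL.F"),
  age      = c("69.06 (2.75)", "69.57 (3.52)", "69.67 (3.97)", "71.72 (4.48)"),
  AB       = c("1.04 (0.14)", "1.06 (0.10)", "1.12 (0.20)", "1.11 (0.17)"),
  stringsAsFactors = FALSE
)

# Extract the group means into new numeric columns.
demo$age_mean <- first_number(demo$age)
demo$AB_mean  <- first_number(demo$AB)

# With numeric columns, lm() fits without the coercion error.
fit <- lm(age_mean ~ AB_mean, data = demo)
print(coef(fit))
```

Note that this only regresses the group means on each other (one point per group), so it is not a substitute for fitting the model on the participant-level data the summary table was built from.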