I need help running the lm() regression on a TableOne demographics table


I'm trying to run the lm() regression function on the demographic table below, which was built with information from a data frame:

X      HL.both   NHL.B.both   NHL.W.both         HL.F      NHL.B.F      NHL.W.F         HL.M      NHL.B.M      NHL.W.M      HL.none   NHL.B.none
1                                   n           12           14          409           16           20          569           58           53         1576           50           66
2               Age (yrs) (mean (SD)) 69.06 (2.75) 69.57 (3.52) 69.67 (3.97) 71.72 (4.48) 72.52 (5.67) 70.75 (4.57) 70.61 (4.27) 69.45 (4.12) 70.63 (4.32) 72.82 (4.85) 71.62 (5.28)
3                    Sex = Female (%)     7 (58.3)     8 (57.1)   233 (57.0)     9 (56.2)    15 (75.0)   297 (52.2)    36 (62.1)    42 (79.2)  1062 (67.4)    32 (64.0)    43 (65.2)
4         Education (yrs) (mean (SD)) 16.50 (2.15) 17.57 (2.50) 16.74 (2.86) 15.25 (2.74) 15.75 (2.38) 17.03 (2.71) 16.03 (3.00) 16.08 (2.61) 16.55 (2.75) 15.52 (3.25) 16.23 (3.07)
5                         APOEgen (%)                                                                                                                                               
6                                 e2+      1 (8.3)      1 (7.1)    44 (10.8)      0 (0.0)     6 (30.0)    57 (10.0)      3 (5.2)     8 (15.1)    129 (8.2)      4 (8.0)    16 (24.2)
7                                e2e4      1 (8.3)     2 (14.3)      6 (1.5)      1 (6.2)      1 (5.0)     17 (3.0)      2 (3.4)      2 (3.8)     38 (2.4)      2 (4.0)      3 (4.5)
8                                e3e3     8 (66.7)     8 (57.1)   183 (44.7)     8 (50.0)     9 (45.0)   288 (50.6)    35 (60.3)    24 (45.3)   821 (52.1)    36 (72.0)    29 (43.9)
9                                 e4+     2 (16.7)     3 (21.4)   176 (43.0)     7 (43.8)     4 (20.0)   207 (36.4)    18 (31.0)    19 (35.8)   588 (37.3)     8 (16.0)    18 (27.3)
10               PET SUVr (mean (SD))  1.04 (0.14)  1.06 (0.10)  1.12 (0.20)  1.11 (0.17)  1.03 (0.07)  1.09 (0.18)  1.09 (0.19)  1.06 (0.15)  1.10 (0.19)  1.10 (0.20)  1.03 (0.14)
11 Amyloid eligibility = positive (%)     3 (25.0)     4 (28.6)   152 (37.2)     5 (31.2)      1 (5.0)   157 (27.6)    13 (22.4)    12 (22.6)   496 (31.5)    17 (34.0)    14 (21.2)
12                   MMSE (mean (SD)) 28.50 (1.51) 29.21 (1.12) 28.98 (1.14) 28.06 (1.98) 28.65 (1.35) 28.85 (1.17) 28.53 (1.27) 28.58 (1.49) 28.93 (1.14) 28.16 (1.36) 28.47 (1.19)
13                   PACC (mean (SD)) -0.53 (2.48) -0.14 (2.18)  0.45 (2.36) -1.74 (2.99) -1.22 (3.09)  0.11 (2.52) -0.56 (2.56) -0.61 (2.66)  0.31 (2.50) -1.33 (2.38) -1.37 (2.29)
     NHL.W.none      p test
1          1309          NA
2  72.76 (4.96) <0.001   NA
3    714 (54.5) <0.001   NA
4  16.59 (2.95) <0.001   NA
5               <0.001   NA
6    178 (13.6)          NA
7      36 (2.8)          NA
8    784 (59.9)          NA
9    311 (23.8)          NA
10  1.09 (0.20)  0.007   NA
11   371 (28.3)  0.005   NA
12 28.75 (1.22) <0.001   NA
13 -0.22 (2.51) <0.001   NA

But every time I try to run the lm() regression function on it, I get this error:

Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 
  NA/NaN/Inf in 'y'
In addition: Warning message:
In storage.mode(v) <- "double" : NAs introduced by coercion
> 
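
A minimal sketch of what seems to trigger this message, assuming the table cells are stored as character strings such as "69.06 (2.75)" (the variable names `cell` and `converted` are only illustrative). When lm() coerces such a column to numeric, every value becomes NA, which would match both the warning and the error above:

```r
# One cell copied from the Age row of the table: a mean followed by "(SD)"
cell <- "69.06 (2.75)"

# as.numeric() cannot parse the whole string, so it returns NA and warns
# "NAs introduced by coercion"; suppressWarnings() only hides the warning here
converted <- suppressWarnings(as.numeric(cell))

# TRUE: the value is lost during coercion
is.na(converted)

# The original string is not NA, so na.omit() on the table would not drop it
is.na(cell)
```

If this is what is happening, na.omit() cannot help, because the strings themselves are not missing; they only turn into NA at the moment of conversion.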

I've tried transposing the table and calling the variables that way, but it still doesn't work.

     variable        e2.     e2e4       e3e3        e4.          age Amyloid.eligibility...positive.... Education..yrs...mean..SD.. MMSE..mean..SD..    n PACC..mean..SD..
1     HL.both    1 (8.3)  1 (8.3)   8 (66.7)   2 (16.7) 69.06 (2.75)                           3 (25.0)                16.50 (2.15)     28.50 (1.51)   12     -0.53 (2.48)
2  NHL.B.both    1 (7.1) 2 (14.3)   8 (57.1)   3 (21.4) 69.57 (3.52)                           4 (28.6)                17.57 (2.50)     29.21 (1.12)   14     -0.14 (2.18)
3  NHL.W.both  44 (10.8)  6 (1.5) 183 (44.7) 176 (43.0) 69.67 (3.97)                         152 (37.2)                16.74 (2.86)     28.98 (1.14)  409      0.45 (2.36)
4        HL.F    0 (0.0)  1 (6.2)   8 (50.0)   7 (43.8) 71.72 (4.48)                           5 (31.2)                15.25 (2.74)     28.06 (1.98)   16     -1.74 (2.99)
5     NHL.B.F   6 (30.0)  1 (5.0)   9 (45.0)   4 (20.0) 72.52 (5.67)                            1 (5.0)                15.75 (2.38)     28.65 (1.35)   20     -1.22 (3.09)
6     NHL.W.F  57 (10.0) 17 (3.0) 288 (50.6) 207 (36.4) 70.75 (4.57)                         157 (27.6)                17.03 (2.71)     28.85 (1.17)  569      0.11 (2.52)
7        HL.M    3 (5.2)  2 (3.4)  35 (60.3)  18 (31.0) 70.61 (4.27)                          13 (22.4)                16.03 (3.00)     28.53 (1.27)   58     -0.56 (2.56)
8     NHL.B.M   8 (15.1)  2 (3.8)  24 (45.3)  19 (35.8) 69.45 (4.12)                          12 (22.6)                16.08 (2.61)     28.58 (1.49)   53     -0.61 (2.66)
9     NHL.W.M  129 (8.2) 38 (2.4) 821 (52.1) 588 (37.3) 70.63 (4.32)                         496 (31.5)                16.55 (2.75)     28.93 (1.14) 1576      0.31 (2.50)
10    HL.none    4 (8.0)  2 (4.0)  36 (72.0)   8 (16.0) 72.82 (4.85)                          17 (34.0)                15.52 (3.25)     28.16 (1.36)   50     -1.33 (2.38)
11 NHL.B.none  16 (24.2)  3 (4.5)  29 (43.9)  18 (27.3) 71.62 (5.28)                          14 (21.2)                16.23 (3.07)     28.47 (1.19)   66     -1.37 (2.29)
12 NHL.W.none 178 (13.6) 36 (2.8) 784 (59.9) 311 (23.8) 72.76 (4.96)                         371 (28.3)                16.59 (2.95)     28.75 (1.22) 1309     -0.22 (2.51)
13          p                                                 <0.001                               0.01                      <0.001           <0.001   NA           <0.001
   PET.SUVr..mean..SD.. Sex...Female....
1           1.04 (0.14)         7 (58.3)
2           1.06 (0.10)         8 (57.1)
3           1.12 (0.20)       233 (57.0)
4           1.11 (0.17)         9 (56.2)
5           1.03 (0.07)        15 (75.0)
6           1.09 (0.18)       297 (52.2)
7           1.09 (0.19)        36 (62.1)
8           1.06 (0.15)        42 (79.2)
9           1.10 (0.19)      1062 (67.4)
10          1.10 (0.20)        32 (64.0)
11          1.03 (0.14)        43 (65.2)
12          1.09 (0.20)       714 (54.5)
13                 0.01           <0.001

But that still returns the same error. I've also tried:

#Attempt 1: transpose the demog table so the variables I want to compare run across the x axis and the race categories run down the y axis
require(data.table)

#transpose the demog table to be readable by the lm function
ERxFH_df_t <- dcast(melt(ERxFH_df, id.vars = "X"), variable ~ X)

colnames(ERxFH_df_t)

#cols.num <- c("HL.both","NHL.B.both","NHL.W.both","HL.F","NHL.B.F","NHL.W.F","HL.M","NHL.B.M N","HL.W.M","HL.none","NHL.B.none","NHL.W.none","p" )

#ERxFH_df[cols.num] <- sapply(ERxFH_df[cols.num],as.numeric)

#sapply(ERxFH_df, class)
#ERxFH_df %>% mutate_if(is.character,as.numeric) #ruined table :(

#rename columns: 
names(ERxFH_df_t)[names(ERxFH_df_t) == "Age (yrs) (mean (SD))" ] <- "age"
names(ERxFH_df_t)[names(ERxFH_df_t) == "APOEgen (%)"] <- "APOEgen"

names(ERxFH_df_t)[names(ERxFH_df_t) == "PET.SUVr..mean..SD.."] <- "AB"
ERxFH_df_t

colnames(ERxFH_df_t)

ERxFH_df_t = ERxFH_df_t[-14,]

ERxFH_df_t

write.csv(ERxFH_df_t, "/Users/phe/Library/CloudStorage/OneDrive-UCLAITServices/UCLA/Deters lab/Family history and cog decline/ERxFH_df_t.csv", row.names=FALSE)

sapply(ERxFH_df_t, class)

#change to numbers in excel
ERxFH_df_t <- read.csv("/Users/phe/Library/CloudStorage/OneDrive-UCLAITServices/UCLA/Deters lab/Family history and cog decline/ERxFH_df_t_nopara.csv")

#ERxFH_df_t %>% mutate(across(where(is.character), as.numeric))
#^^ changed all my values to NA

ERxFH_df_t

#removing APOEgen column since it's empty
ERxFH_df_t <- subset(ERxFH_df_t, select = -APOEgen)

ERxFH_df_t <- na.omit(ERxFH_df_t)

ERxFH_df_t.lm <- lm(formula = age ~ AB,data = ERxFH_df_t)

#still doesn't work
ERxFH_df_t

And I am still met with the same error:

Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 
  NA/NaN/Inf in 'y'
In addition: Warning message:
In storage.mode(v) <- "double" : NAs introduced by coercion
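
One possible way around the coercion problem, sketched under the assumption that only the leading number of each "mean (SD)" cell is wanted. The helper `leading_number` and the small data frame `demo` are hypothetical; the three rows are copied from the transposed table above, with the PET SUVr column renamed to AB as in the code:

```r
# Hypothetical helper: keep only the number at the start of a cell such as
# "69.06 (2.75)" or "-0.53 (2.48)" and convert it to numeric
leading_number <- function(x) {
  as.numeric(sub("^[[:space:]]*(-?[0-9.]+).*$", "\\1", x))
}

# Three rows copied from the transposed table (age and PET SUVr columns)
demo <- data.frame(
  variable = c("HL.both", "NHL.B.both", "NHL.W.both"),
  age = c("69.06 (2.75)", "69.57 (3.52)", "69.67 (3.97)"),
  AB  = c("1.04 (0.14)", "1.06 (0.10)", "1.12 (0.20)"),
  stringsAsFactors = FALSE
)

# Replace the character columns with their numeric means
demo$age <- leading_number(demo$age)
demo$AB  <- leading_number(demo$AB)

# Both columns are now numeric, so lm() no longer hits the coercion error
fit <- lm(age ~ AB, data = demo)
coef(fit)
```

Cells such as "<0.001" do not start with a number, so the helper would still return NA for them; the p row would need to be dropped before converting.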

I'm not sure what to try next!
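
Since the table was built from a data frame, another option might be to fit the model on the per-participant data directly rather than on the twelve group summaries. The sketch below only shows the shape of such a call: the data frame `participants`, its columns `age` and `suvr`, and the simulated values are all placeholders, not the real study data.

```r
# Placeholder data standing in for the original per-participant data frame;
# the column names and values are made up for illustration only
set.seed(1)
participants <- data.frame(
  age  = rnorm(50, mean = 70, sd = 4),
  suvr = rnorm(50, mean = 1.1, sd = 0.2)
)

# Both columns are already numeric, so lm() needs no coercion
fit_raw <- lm(age ~ suvr, data = participants)

# Coefficient table: estimate, standard error, t value, p value
summary(fit_raw)$coefficients
```

Fitting on the raw rows keeps the within-group variation that the "mean (SD)" summary strings collapse into a single cell.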
