DirichletReg predict 'contrasts can be applied only to factors with 2 or more levels'

130 Views Asked by At

I have run a Dirichlet regression model using the DirichletReg package in R. It works perfectly. All variables are continuous apart from the ones with factor().

Dirichlet <- DirichReg(Y ~ DD_GS + pGS + Frst_dy 
                   + pp_grwt + emply_r
                   + Age + Off_frm + High_dc + Owned + irrigbl +
                   + Altitud + AWC 
                   + Water_cost + Elec_cost + Lbr_perH
                   + factor(Erdblty) + factor(Soil) + factor(Slope) + factor(consortium) + factor(RBD)
                   | DD_GS + pGS, # Precision model
                   base = 4,
                   model = "alternative",
                   data = Italy_data)

Now, I want to make predictions for Y (compositional variable) for varying levels of DD_GS, all other variables kept constant.

I create a newdata dataframe with 200 rows. The first column is created as follows: DD_GS = seq(min(Italy_data$DD_GS), max(Italy_data$DD_GS), length.out = 200). For all other columns I use marginaleffects::datagrid to set all continuous variables to their mean and all factor variables to their mode.

Next, I try to make the predictions as follows:

pred <- predict(object = Dirichlet, newdata = newdata, mu = TRUE, phi = TRUE)

This gives the following error: contrasts can be applied only to factors with 2 or more levels

I do not know how to solve this issue. Yes, the factor variables in newdata are set at a constant value, but marginaleffects::datagrid does keep all the levels.

lapply(newdata[,c('consortium', 'RBD', 'Soil', 'Slope', 'Erdblty')], levels)

$consortium
[1] "Individual" "Consortium"

$RBD
[1] "ITA" "ITB" "ITC" "ITD" "ITE" "ITF" "ITG" "ITH"

$Soil
[1] "Coarse"             "Medium"             "Medium fine - Fine"

$Slope
[1] "Level"            "Sloping"          "Moderately steep" "Steep"           

$Erdblty
[1] "Weak - Moderate" "Strong"          "Very strong" 

Reproducible example with just 1 factor variable (50 randomly drawn observations from the dataset):

sample <- structure(list(surface = c(0.674494219653179, 0.204576976421637, 
0.011614401858304, 0.527835380443452, 0, 0.789875296601754, 0.980739599383667, 
0.88386524822695, 0, 0.591486005383658, 0, 0.060767161586851, 
0, 0.034715743520991, 0.054904214002066, 0.183772856749058, 0.275319567354966, 
0, 0.012977759682427, 0.009820029082394, 0.090993387918861, 0.002690582959641, 
0.203759210309442, 0.008999593566742, 0, 0.095082409465206, 0.538424085044226, 
0.721045337556919, 0.023917495302637, 0, 0.032506505366928, 0.653077008065407, 
0.033836040216551, 0.763726555239977, 0.216644521107359, 0.504201680672269, 
0, 0.507080383173678, 0.202140877377425, 0.174181593576282, 0.141584158415842, 
0.891552301409176, 0.853267019288966, 0.051957898492552, 0.589030200258398, 
0, 0.040265127603932, 0.052724077328647, 0.172470523718124, 0.003320216428923
), sprnklr = c(0.007225433526012, 0.324895977808599, 0, 0.014834901898229, 
0.833333333333333, 0.08480458256243, 0.015408320493066, 0, 0.694227769110764, 
0.166575009406929, 0.333333333333333, 0.439222248109631, 0.555918901242642, 
0.35756152150067, 0.112594004101997, 0.384360929890655, 0.041297935103245, 
0.367088607594937, 0.188915466435951, 0.250870268602975, 0.195489724304701, 
0.008388288050646, 0.532723687664443, 0.157579980258956, 0.308746355685131, 
0.777982606014671, 0.26138115718575, 0.028113244902, 0.433318555112209, 
0.000864005529635, 0.181987639802837, 0.123080322616285, 0.149748646558391, 
0.129078701098124, 0.124059829580778, 0, 0.298507462686567, 0.035401915868388, 
0.358310236069961, 0.46757257566399, 0.034653465346535, 0, 0, 
0.00999018820801, 0.265483688630491, 0, 0.605250698890793, 0.738137082601054, 
0.071839868384974, 0.181013280865716), drip = c(0, 0.150138696255201, 
0, 0.015951507417451, 0, 0.003256086215727, 0, 0, 0.007020280811232, 
0.000434165967177, 0, 0.074005040984475, 0, 0.391822986493524, 
0.296114739552271, 0.001378296425618, 0.009832841691249, 0, 0.359611176141279, 
0.369623376705128, 0.024502881800716, 0.015985228171986, 0.000781388307257, 
0.252801486384486, 0, 0.009030998291433, 0.003083664692039, 0.046129479311028, 
0.285812907720565, 0.99706238119924, 0.303725573598219, 0.007733952049497, 
0.35092807424594, 0.03011661845739, 0.014079699633074, 0, 0.37910447761194, 
0.031028738025823, 0.185415272866291, 0, 0.024257425742574, 0, 
0, 0.232138078672732, 0.12640100129199, 0, 0.001127243213996, 
0.121265377855888, 0.488620784206197, 0.098130841121495), none = c(0.318280346820809, 
0.320388349514563, 0.988385598141696, 0.441378210240868, 0.166666666666667, 
0.122064034620089, 0.003852080123267, 0.11613475177305, 0.298751950078003, 
0.241504819242236, 0.666666666666667, 0.426005549319044, 0.444081098757358, 
0.215899748484815, 0.536387042343666, 0.430487916934669, 0.673549655850541, 
0.632911392405063, 0.438495597740343, 0.369686325609503, 0.689014005975721, 
0.972935900817726, 0.262735713718859, 0.580618939789816, 0.691253644314869, 
0.117903986228691, 0.197111093077984, 0.204711938230053, 0.256951041864589, 
0.002073613271125, 0.481780281232016, 0.21610871726881, 0.465487238979118, 
0.077078125204508, 0.645215949678788, 0.495798319327731, 0.322388059701493, 
0.426488962932112, 0.254133613686323, 0.358245830759728, 0.799504950495049, 
0.108447698590824, 0.146732980711034, 0.705913834626706, 0.019085109819121, 
1, 0.35335693029128, 0.087873462214411, 0.267068823690705, 0.717535661583866
), DD_GS = c(2968.00560026169, 3751.41458040873, 3217.52741529526, 
3848.91491320928, 1927.85096128071, 3550.94684754189, 2241.87228460535, 
3614.32649774272, 3199.51859460672, 3717.77324947516, 3194.75085554506, 
3400.11977087351, 3685.30887317156, 3857.30516098449, 3838.13691473007, 
3617.18991976579, 3400.34358790716, 3227.46888205538, 3646.42691558202, 
3724.26569058204, 3761.16810912534, 3376.88059224486, 3699.23691791693, 
3500.38599236724, 3155.38254563411, 3397.96065988968, 3368.09592208862, 
3647.57725006739, 3743.63691547712, 2113.24415100271, 3873.03191526731, 
3894.91424705188, 3645.85869934792, 3500.25358777046, 3581.39257631048, 
2494.00055927553, 3486.85059129, 3354.72711553542, 3646.48491806984, 
3302.99859150251, 3970.50791212718, 3552.09031788937, 3664.00229212624, 
3681.69588200819, 3691.2199163119, 3222.3492624402, 3640.57125096321, 
3348.4870373125, 3828.54791437785, 3835.5786033004), pGS = c(43.6233339269956, 
9.68033344385525, 22.5857469112471, 13.5783334638303, 60.6025284856629, 
42.590896895604, 50.9563339928786, 49.6678102401355, 71.4286677598953, 
32.3690003472691, 39.0659949408812, 57.0566765463567, 31.6217083760117, 
12.7534094122541, 0.187164324840153, 33.8233339053889, 42.6433339354893, 
41.9003483838743, 33.5853337883949, 29.9532291762421, 31.133683975243, 
20.0586668558419, 30.4943336471915, 39.494217910894, 70.2377748857651, 
48.4968166630066, 14.5713335371266, 0.271368456060452, 0.140866820180555, 
61.2742631397292, 5.08498906209963, 24.4090002759049, 33.5962737182986, 
46.6176672200362, 43.5942706626686, 43.7305006815205, 35.6380004849285, 
31.780778191319, 23.450667004846, 12.7074745386894, 12.80066678791, 
49.8169722690753, 38.5050368231085, 17.2678973036089, 36.3513338531057, 
39.17766720665, 41.8666672809049, 37.1276613497028, 23.3853335704034, 
24.0797391735994), Frst_dy = c(26.6000003814697, 0, 6.79459825149864, 
0, 79.6177463668713, 13.4090086318351, 55.6333351135254, 10.141758614844, 
13.8000001907349, 16.533332824707, 21.3532939622569, 15.9774780660062, 
1.03116612094844, 0.066666670143604, 0.633333325386047, 3.83333325386047, 
5.40000009536743, 7.85379497166757, 6.5, 8.73668674040123, 9.76949150279417, 
2.93333339691162, 14.4333333969116, 3.07706770592166, 15.6457143987928, 
15.9840791901546, 0, 0, 1.39999997615814, 66.9353894035124, 0.066666670143604, 
0.100000001490116, 6.51215547757098, 10.1666669845581, 6.93792162018749, 
28.081513757962, 14.6333332061768, 2.35816992030424, 0.333333343267441, 
0, 0, 14.0890475341252, 10.765676791125, 1.31416432580597, 12.0333337783813, 
4.26666688919067, 8.96666622161865, 2.64444446981999, 0.266666680574417, 
0.300410520927659), Water_cost = c(15.159401069334, 741.410199949408, 
44.1128234504789, 118.930819755236, 40.1378496553326, 81.3629847112177, 
44.6247913746641, 99.7595198373655, 84.1518125597496, 30.935342741554, 
66.9849957382298, 87.730091879798, 19.0133607399794, 145.190317902846, 
97.9966376583065, 80.3855972338419, 54.5588243299056, 88.2626317764064, 
68.8370470888661, 80.4858848493403, 83.337178585515, 39.6387620714554, 
61.9561804469798, 57.5884030146462, 99.7595198373655, 90.3971655588301, 
312.292977421483, 118.930819755236, 78.8873515924867, 40.1378496553326, 
150.611937832493, 182.331002331002, 68.8370470888661, 56.5844483283989, 
20.6611570247934, 106.7277485873, 71.5580909267444, 65.9408793085246, 
8.51949657224839, 50.6480210613444, 161.455177691785, 99.7595198373655, 
99.7595198373655, 66.5806691753691, 106.925680868725, 104.087107914717, 
79.2321702039264, 637.336309637199, 48.5471887646224, 141.591513157763
), Elec_cost = c(106.061028571932, 131.124625576094, 33.2004715158184, 
372.193455204937, 66.7087170302092, 56.1880979039094, 181.539225505817, 
44.9926630023908, 45.9525840907786, 52.2417065857706, 28.5531626926004, 
164.733603034497, 141.443987667009, 27.8223295934657, 40.2276127492689, 
38.3459776418391, 87.0149099218594, 70.5518595167352, 38.7674416490254, 
18.0717630944567, 14.6394168344347, 33.8585599858388, 88.298699776267, 
338.107103429976, 45.9525840907786, 249.896520428349, 261.813409620699, 
372.193455204937, 40.1408129589757, 66.7087170302092, 148.084183744674, 
158.358096019086, 38.7674416490254, 137.990718176415, 100.423095446722, 
135.352365498922, 19.7934772260351, 38.4372684192437, 54.2890041484468, 
78.1181718747752, 310.668498168498, 45.4726235465847, 44.032741914003, 
37.3721306025641, 111.975554282281, 159.207453641412, 187.622988446114, 
1021.01467122031, 130.445945070256, 124.191827969335), Lbr_perH = c(10.2649006622517, 
9.68422826613348, 7.49776574142812, 9.80270833333333, 9.90606747235191, 
8.91010470190489, 8.69495033112583, 7.85583333333333, 8.9633523967651, 
11.7950932281596, 10.4189310055178, 23.1649597418913, 18.3407142857143, 
8.19963132105989, 7.04565523952884, 12.575913015873, 14.0511215636216, 
12.0529668316921, 13.7956759147937, 13.4574814663301, 13.2162502694075, 
7.27222554111555, 10.1877225297846, 12.5220661525974, 7.85583333333333, 
11.5520171131272, 9.97147334354611, 9.80270833333333, 7.4916706257168, 
9.90606747235191, 7.62995494769574, 7.5070324838287, 13.7956759147937, 
9.38291089566135, 14.2318405831827, 7.46717449310773, 12.8114981772859, 
7.36119418364753, 6.3861854884546, 6.83288389193814, 7.76438356164384, 
7.85583333333333, 7.85583333333333, 7.04617029016591, 11.0823792379777, 
9.63892876811744, 11.764777284628, 7.75850340136054, 7.67706558187183, 
11.3576476190476), RBD = structure(c(2L, 8L, 6L, 8L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 1L, 3L, 6L, 6L, 3L, 3L, 2L, 3L, 3L, 3L, 6L, 
1L, 3L, 2L, 1L, 8L, 8L, 6L, 1L, 6L, 6L, 3L, 2L, 1L, 5L, 3L, 6L, 
6L, 6L, 6L, 2L, 2L, 6L, 1L, 5L, 1L, 3L, 6L, 5L), .Label = c("ITA", 
"ITB", "ITC", "ITD", "ITE", "ITF", "ITG", "ITH"), class = "factor")), row.names = c(34L, 
6375L, 5506L, 6231L, 4425L, 4010L, 4342L, 4005L, 184L, 3709L, 
3758L, 5825L, 4862L, 5852L, 5272L, 6002L, 212L, 3454L, 4248L, 
2549L, 2560L, 5514L, 3694L, 2488L, 269L, 4832L, 6310L, 6195L, 
5695L, 1406L, 3256L, 5499L, 1376L, 4336L, 4502L, 3564L, 4301L, 
2578L, 1275L, 3762L, 1935L, 1747L, 3686L, 4439L, 4334L, 4965L, 
4593L, 2730L, 5463L, 2452L), class = "data.frame")

sample$Y <- DR_data(sample[,c("surface", "sprnklr", "drip", "none")], trafo = TRUE, base = 4)

Dirichlet <- DirichReg(Y ~ DD_GS + pGS + Frst_dy 
                       + Water_cost + Elec_cost + Lbr_perH
                       + factor(RBD)
                       | DD_GS + pGS, 
                       base = 1,
                       model = "alternative",
                       verbosity = 4,
                       data = sample)

X <- data.frame(DD_GS = seq(min(sample$DD_GS), max(sample$DD_GS), length.out = 200))
varnames <- c('pGS', 'Frst_dy', "Water_cost", "Elec_cost", "Lbr_perH", "RBD")
X <- cbind(X, marginaleffects::datagrid(Italy_data[,varnames], newdata = Italy_data[,varnames], grid.type = 'typical'))

pred <- predict(object = Dirichlet, newdata = X, mu = TRUE, alpha = TRUE, phi = TRUE)
0

There are 0 best solutions below