I have a dataset that is a cross-join between N individuals and M books, with N = 1,000 and M = 50. That is, person 1 has a row matched with book 1, then book 2, then book 3, and so on, and the same holds for all 1,000 people, giving 50,000 rows. The dataset contains characteristics of both the individuals and the books.
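To make the layout concrete, here is a toy sketch of the same structure with 3 people and 4 books instead of 1,000 and 50. All column names (person_id, age, book_id, pages, author_age, rank) are made up for illustration and are not the names in my real data.

```r
# Toy version of the layout: 3 people x 4 books (hypothetical column names).
people <- data.frame(person_id = 1:3, age = c(25, 40, 31))
books  <- data.frame(book_id = 1:4,
                     pages = c(120, 300, 450, 210),
                     author_age = c(35, 52, 47, 61))

# merge() with by = NULL returns the Cartesian product (cross-join).
long <- merge(people, books, by = NULL)
long <- long[order(long$person_id, long$book_id), ]

# rank would hold 1..5 for the books a person ranked and NA otherwise.
long$rank <- NA_integer_

head(long)
```

Each person appears once per book, so the toy table has 3 x 4 = 12 rows, just as the real one has 1,000 x 50.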
Each person has ranked their 5 favorite books, with rank 1 being the most preferred, rank 2 the second most preferred, and so on. I want to estimate aggregate group preferences for each characteristic of a book based on these rankings. So, if a person's utility for a given book is the sum of a vector of coefficients "b" times a vector of characteristics for that book (e.g. length, genre), and people rank first the book that gives them the highest utility and rank second the book that gives them the second-highest utility, I want to estimate the b's via logit MLE.
So, we want to find the b's that maximize the log-likelihood function

$$\ell(b) = \sum_{n=1}^{N} \sum_{m=1}^{5} \log \frac{\exp(b' X_{j_m(n)})}{\sum_{j=1}^{50} \exp(b' X_j)}$$

where N = 1000, j indexes the books from 1 to 50, m indexes the rank slots from 1 to 5, and j_m(n) is the book that individual n ranked in slot m (first, second, third, fourth, or fifth). Note that b' X_{j_m(n)} is the inner product of the b's and that book's characteristics, e.g. b_{pages} X_{pages, j_m(n)} + b_{author age} X_{author age, j_m(n)} + ..., where the characteristics are those of the book ranked in slot m by person n.
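For reference, here is a minimal sketch of how the log-likelihood described above could be coded by hand and maximized with optim() on simulated data. Everything in it is an assumption for illustration: the toy sizes, the two characteristics, the "true" coefficients, and the helper name neg_loglik. The drop_ranked argument is also my addition: with FALSE the denominator sums over all books at every rank, as in my description; with TRUE each book leaves the choice set once it has been ranked, which is how the rank-ordered ("exploded") logit is usually written.

```r
set.seed(1)
n_people <- 200; n_books <- 10; n_ranked <- 3   # toy sizes, not 1000 / 50 / 5

# Book characteristics (books x K); hypothetical, standardized values.
X <- cbind(pages = rnorm(n_books), author_age = rnorm(n_books))
b_true <- c(1, -0.5)                            # made-up coefficients

# Simulate rankings: utility = X b + Gumbel noise, keep each person's top books.
# ranks[i, m] is the index of the book person i placed in slot m.
ranks <- t(sapply(seq_len(n_people), function(i) {
  u <- drop(X %*% b_true) - log(-log(runif(n_books)))
  order(u, decreasing = TRUE)[seq_len(n_ranked)]
}))

# Negative log-likelihood (optim() minimizes by default).
neg_loglik <- function(b, X, ranks, drop_ranked = FALSE) {
  v <- drop(X %*% b)                            # systematic utility per book
  ll <- 0
  for (i in seq_len(nrow(ranks))) {
    available <- rep(TRUE, nrow(X))             # books still in the choice set
    for (m in seq_len(ncol(ranks))) {
      j <- ranks[i, m]
      ll <- ll + v[j] - log(sum(exp(v[available])))
      if (drop_ranked) available[j] <- FALSE    # remove the book just ranked
    }
  }
  -ll
}

fit <- optim(c(0, 0), neg_loglik, X = X, ranks = ranks,
             drop_ranked = TRUE, method = "BFGS")
fit$par   # estimates of b under the simulated setup
```

Individual characteristics do not appear here because anything constant across a person's books cancels out of the logit ratio; they would only matter if interacted with book characteristics.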
I tried the package 'pmr' and its rol() command, but I am confused by the documented usage, rol(dataset, covariate), and by the package's example, which is as follows:
library(pmr)
## create an artificial dataset
X1 <- c(1,1,2,2,3,3)
X2 <- c(2,3,1,3,1,2)
X3 <- c(3,2,3,1,2,1)
X4 <- c(6,5,4,3,2,1)
test <- data.frame(X1,X2,X3)
## fit the Luce model
rol(test, X4)
This returns
Coefficients:
Beta0item1 Beta1item0 Beta1item1 Beta1item2
-6.9393820 -1.6710604 2.0350267 0.4255481
But I am confused about how X4 is passed to the command when it is not included in the actual dataset. Further, why does supplying X4 result in the above coefficients? What would Beta1item1 be in this case, or Beta0item1, and so on? And what would I need to input to get the results needed for my problem?