I have a data frame with 10k rows and 500 columns. For each column, I want a count of each unique value in that column. E.g.
  Fruit Vegetable    Meat
1 Apple    Carrot   Steak
2 Apple    Potato Chicken
3  Pear      Peas    Duck
Would produce:
Fruit;Apple;2;Pear;1
Vegetable;Carrot;1;Potato;1;Peas;1
Meat;Steak;1;Chicken;1;Duck;1
The Hmisc describe function produces this kind of analysis, but the output is so badly formatted as to be useless.
Thanks.
You will also see a list of three NULLs which would not be sent to a text file. Writing tables and matrices to files is not a strong point of R. There is a write.matrix function in package::MASS. My initial effort with writeLines failed because it has no 'append' option and I wasn't able to cobble together a connection call that would do the append.
(The other gotcha in R is that processing a list (and by inheritance a dataframe) with 'apply/lapply/sapply' does not pass the names of the list elements (and colnames for dataframes) to the function, so "write" functions would not have the names internally for writing to a file. That is why I worked with names(df) rather than just df.)
As a further note, there are probably JSON-writing functions out there and they might be more reliable. I'll take a look and report back.
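The point about iterating over names(df) rather than df can be sketched in base R as follows. This is only an illustration under assumptions, not the code from this answer: it uses the three-row data frame from the question, the helper count_line and the path out_file are hypothetical names, and it builds every line first so that a single writeLines call is enough and no append step is needed.

```r
# Toy data frame from the question (character columns, not factors).
df <- data.frame(
  Fruit     = c("Apple", "Apple", "Pear"),
  Vegetable = c("Carrot", "Potato", "Peas"),
  Meat      = c("Steak", "Chicken", "Duck"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: build one "Column;value;count;value;count" line.
# It receives the column NAME, so the name is available for the output.
count_line <- function(col_name, data) {
  vals <- data[[col_name]]
  # Fixing the levels to unique(vals) keeps values in order of first
  # appearance; a plain table(vals) would sort them alphabetically.
  counts <- table(factor(vals, levels = unique(vals)))
  # rbind() interleaves value and count once flattened by c().
  paste(c(col_name, rbind(names(counts), as.vector(counts))), collapse = ";")
}

# Iterate over the column names, not over the data frame itself.
lines_out <- vapply(names(df), count_line, character(1), data = df)

# out_file is a hypothetical path; all lines are written in one call.
out_file <- tempfile(fileext = ".txt")
writeLines(lines_out, out_file)
```

For a 500-column data frame the same call applies unchanged; only the construction of df differs.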
There is the RJSONIO package:
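One way this might look is sketched below, assuming RJSONIO is installed. The data frame is the toy example from the question, and the variable names counts and json are illustrative. Each column's table is first converted to a plain named integer vector, on the assumption that toJSON writes named vectors as JSON objects; the call is guarded so the counting step still runs where the package is missing.

```r
# Toy data frame from the question.
df <- data.frame(
  Fruit     = c("Apple", "Apple", "Pear"),
  Vegetable = c("Carrot", "Potato", "Peas"),
  Meat      = c("Steak", "Chicken", "Duck"),
  stringsAsFactors = FALSE
)

# One named integer vector of counts per column.
# lapply() over a data frame keeps the column names on the result list.
counts <- lapply(df, function(col) {
  tab <- table(col)                      # values sorted alphabetically
  setNames(as.vector(tab), names(tab))   # drop the table class, keep names
})

# Serialise only if RJSONIO is available (assumption: it is installed).
if (requireNamespace("RJSONIO", quietly = TRUE)) {
  json <- RJSONIO::toJSON(counts)
  cat(json, "\n")
}
```

Unlike the semicolon-delimited lines, the JSON form keeps each value paired with its count, so it can be read back without relying on position.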