I am trying to implement message archiving in Riak. The schema looks something like this:
{
id = <<>> :: binary() | '_',
username_s = <<"">> :: binary() | '_',
timestamp_i = 0 :: integer(),
peer_s = <<"">> :: binary(),
bare_peer_s = <<"">> :: binary(),
packet = #xmlel{} :: xmlel() | '_',
nick_s = <<"">> :: binary(),
type_s = chat :: chat | groupchat
}
id and packet don't need to be indexed, but they need to be returned with every query.
Should I create a custom schema and store them in Solr as non-indexed fields?
Should I do an application-level join, taking the search results and then querying each key individually?
Or is MapReduce somehow an option?
Or something else entirely?
Thank you.
I am going to answer this in case someone else needs an answer to this question. I have moved on to another project, and no solution had been picked by the time I left.
In my evaluation, Yokozuna, the plugin that integrates Riak with Solr, performed terribly. The option I evaluated was "create a custom schema and store them in Solr as non-indexed fields". Indexing something like a paragraph of text takes a lot of CPU and time in Yokozuna, so avoid it for data that is updated as frequently as text messages; it may be acceptable for data that changes slowly, such as products. Even without indexing the paragraph, Yokozuna performed terribly compared to something like MySQL or Cassandra. There also seems to be no support for Riak, perhaps because Basho has been closed since January 2017. So I concluded that MySQL is more than enough for the current load, and that Cassandra is probably the next best option if MySQL turns out not to be enough.
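
For anyone who wants to try the custom schema option, here is a minimal sketch of the relevant field definitions. It is not the exact schema I used. It assumes the packet is serialized to a string before it is stored, that a `string` field type is defined as in Solr's stock schemas, and that the rest of the schema (including the `_yz_*` fields Yokozuna requires) is copied from Yokozuna's default schema.

```xml
<!-- Fragment of a custom Solr schema; goes inside the <fields> section. -->
<!-- Assumption: "string" is a fieldType defined elsewhere in the schema. -->

<!-- Stored but not indexed: returned with search results, not searchable. -->
<field name="id"     type="string" indexed="false" stored="true"/>
<field name="packet" type="string" indexed="false" stored="true"/>

<!-- Indexed and stored: fields that queries actually filter on. -->
<dynamicField name="*_s" type="string" indexed="true" stored="true"/>

<!-- Yokozuna's required _yz_* fields must also be present (omitted here). -->
```

With `stored="true"`, Solr can return `id` and `packet` in the result documents, so no second lookup per key is needed, while `indexed="false"` avoids the cost of indexing them.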