I am building a web-based system for my organization using MongoDB. I have gone through the documentation provided by MongoDB and come to the following conclusions:
find: cannot pull data out of a sub-array.
group: does not work in a sharded environment.
aggregate: best for sub-arrays, but has performance issues when the data set is large.
mapReduce: too risky to write the map and reduce functions.
So, could someone help me with the best approach for working with sub-array documents in a production environment running a sharded cluster?
Example:
{"testdata":{"studdet":[{"id","name":"xxxx","marks",80}.....]}}
now my "studdet" is a huge collection of more than 1000, rows for each document,
So suppose my query is:
"Find all the "name" from "studdet" where marks is greater than 80"
It is definitely going to be an aggregate query, because "find" cannot do this and "group" will not work in a sharded environment. So if I go with aggregate, is it feasible here, and what will the performance impact be? I need to run this query most of the time.
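For reference, this is roughly the pipeline I expect I would have to write against my current structure (the collection name "students" is just a placeholder):

    db.students.aggregate([
        // flatten the studdet array so each entry becomes its own document
        { $unwind: "$testdata.studdet" },
        // keep only the entries with marks greater than 80
        { $match: { "testdata.studdet.marks": { $gt: 80 } } },
        // return just the name field
        { $project: { _id: 0, name: "$testdata.studdet.name" } }
    ])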
Please have a look at: http://docs.mongodb.org/manual/core/data-modeling/ and http://docs.mongodb.org/manual/tutorial/model-embedded-one-to-many-relationships-between-documents/#data-modeling-example-one-to-many
These documents describe the decisions involved in creating a good document schema in MongoDB. That is one of the hardest things to do in MongoDB, and one of the most important, because it will affect your performance, among other things.

In your case, a student collection where each document holds an array of grades looks to be the best bet:

    { _id: ..., ...., grades: [ { type: "test", grade: 80 }, .... ] }

In general, and given your sample data set, the aggregation framework is the best choice. It is faster than map-reduce in most cases (certainly in execution speed; it is C++ for the aggregation framework versus JavaScript for map-reduce).
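With that flatter schema, your example query ("all names with marks greater than 80") maps onto a short pipeline. A sketch, assuming each student document carries a name field and the collection is called students:

    db.students.aggregate([
        // narrow down to students with at least one qualifying grade first,
        // so an index on grades.grade can be used before the $unwind
        { $match: { "grades.grade": { $gt: 80 } } },
        // produce one document per grade entry
        { $unwind: "$grades" },
        // drop the grade entries that do not qualify
        { $match: { "grades.grade": { $gt: 80 } } },
        // return just the student names
        { $project: { _id: 0, name: 1 } }
    ])

Putting a $match first is the main lever for keeping the pipeline fast as the data grows, since it can use an index and reduces the number of documents that get unwound.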
If your data's working set becomes so large that you have to shard, then aggregation, and everything else, will be slower. It will not, however, be slower than putting everything on a single machine that is page-faulting constantly. Generally, sharding becomes the correct way to go once the working set is larger than the RAM available on a single modern machine, so that you can keep everything in RAM across the shards. (At that point a commercial support contract for MongoDB is going to cost less than the extra hardware, and it includes extensive help with schema design.)
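For completeness, turning sharding on for a collection is only a couple of shell commands; the database name, collection name, and shard key below are placeholders, and choosing the shard key is the part that deserves real thought:

    // run against a mongos; "school.students" is a hypothetical namespace
    sh.enableSharding("school")
    sh.shardCollection("school.students", { _id: "hashed" })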
If you need anything else please don’t hesitate to ask.
Best, Charlie