I am using Django 1.6.8 and MongoEngine 0.8.2.
I have 2 classes, ServiceDocument and OptionDocument. ServiceDocument keeps a list of OptionDocuments. There are millions of ServiceDocuments (2.5 million +).
I want to select every ServiceDocument which has more than two OptionDocuments.
I would expect the following to work, but it returns 0:
ServiceDocument.objects.filter(options__size__gt=2).count()
Querying for an exact size does work:
>>> ServiceDocument.objects.filter(options__size=1).count()
6582
>>> ServiceDocument.objects.filter(options__size=2).count()
2734321
>>> ServiceDocument.objects.filter(options__size=3).count()
25165
>>> ServiceDocument.objects.all().count()
2769768
Lastly, if there were fewer ServiceDocuments, or if I could get an iterator working, I could simply loop through them myself. However, the process segfaults once memory fills up after a few seconds (I am guessing that any operation on .all() tries to load every document into memory).
For the iterator, I tried the following without success:
iter(ServiceDocument.objects.all())
I think you need a workaround here, because MongoEngine does not support this query. You can add another field, say 'options_length', and store the length of the options list in it. Then you can query with 'options_length__gt=2'. The additional cost is that you need to override your model's save method so that the length is updated on every save. You also need to backfill the new field on the existing records.
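Another option that may be worth trying before changing the schema: MongoDB's `$size` operator only matches an exact length, but a list has more than n elements exactly when its element at zero-based index n exists. Below is a minimal sketch of that idea. The helper name `longer_than` is hypothetical (it is not part of MongoEngine), and passing the result through MongoEngine's `__raw__` keyword is shown only as a comment because it assumes a live connection and the `ServiceDocument` model from the question.

```python
def longer_than(field, n):
    """Build a raw MongoDB filter for documents whose list `field`
    has more than `n` elements.

    Hypothetical helper for illustration; not a MongoEngine function.
    """
    # Index n is zero-based, so "field.n" exists only when the list
    # holds at least n + 1 elements, i.e. more than n.
    return {"%s.%d" % (field, n): {"$exists": True}}


raw_filter = longer_than("options", 2)
print(raw_filter)

# Assumed usage with the model from the question (requires a connection):
# ServiceDocument.objects(__raw__=raw_filter).count()
```

Without a suitable index this likely still scans the whole collection, so on 2.5 million+ documents the stored 'options_length' field with an index may well be the faster choice.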
You can also read this question