How to use set/intersection with big result sets from MongoDB


I have a function, photos-with-keyword-starting, that fetches the photos for a given keyword from a MongoDB instance using monger, and another, photos-with-keywords-starting, that finds the photos common to several keywords using set/intersection.

(defn photos-with-keywords-starting [stems]
  (apply set/intersection
         (map set
              (map photos-with-keyword-starting stems))))

Previously I thought this worked fine, but since adding more records the intersection doesn't work as expected -- it misses lots of records that have both keywords.

I notice that calls to the function photos-with-keyword-starting always return a maximum of 256 results:

=> (count (photos-with-keyword-starting "lisa"))
256

Here's the code of that function:

(defn photos-with-keyword-starting [stem]
  (with-db (q/find {:keywords {$regex (str "^" stem)}})
    (q/sort {:datetime 1})))

So because a find against MongoDB doesn't return all records when there are more than 256, the intersection is computed over truncated sets, and I don't get the right subsets when specifying more than one keyword.
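To illustrate the effect, here is a minimal sketch using plain in-memory data instead of MongoDB. The names photos-by-keyword, fetch-all, fetch-capped and intersect-with, the sample ids, and the cap of 3 (standing in for the 256 observed above) are all made up for the example.

```clojure
(require '[clojure.set :as set])

;; Hypothetical stand-in for the database: keyword -> photo ids, sorted by date.
(def photos-by-keyword
  {"lisa"  [1 2 3 4 5 6]
   "beach" [5 6 7 8]})

;; Hypothetical cap, playing the role of the 256 results seen above.
(def cap 3)

(defn fetch-all [kw]
  (get photos-by-keyword kw))

(defn fetch-capped [kw]
  ;; Only the first `cap` results come back, like a truncated query.
  (take cap (fetch-all kw)))

(defn intersect-with [fetch kws]
  ;; Same shape as photos-with-keywords-starting: one set per keyword, then intersect.
  (apply set/intersection (map (comp set fetch) kws)))

(def full-result   (intersect-with fetch-all    ["lisa" "beach"]))
(def capped-result (intersect-with fetch-capped ["lisa" "beach"]))
```

With the full lists the intersection contains photos 5 and 6; with the capped lists it is empty, because the shared photos fall outside the first 3 results for "lisa".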

How do I increase this limit?


There is 1 best solution below


You could simply convert the :datetime value returned by your function photos-with-keyword-starting to, for instance, a string, if you can live with losing the date type.
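A minimal sketch of that conversion, assuming each photo is a map with a :datetime key. The helper datetime->str and the sample records are hypothetical, and java.util.Date is only a stand-in for whatever date type your driver returns.

```clojure
(require '[clojure.set :as set])

;; Hypothetical helper: replace the :datetime value with its string form.
(defn datetime->str [photo]
  (update photo :datetime str))

;; Made-up sample records standing in for two query results.
(def lisa-photos
  [{:id 1 :datetime (java.util.Date. 0)}
   {:id 2 :datetime (java.util.Date. 1000)}])

(def beach-photos
  [{:id 2 :datetime (java.util.Date. 1000)}
   {:id 3 :datetime (java.util.Date. 2000)}])

;; Normalise every record before building the sets that get intersected.
(def both
  (set/intersection (set (map datetime->str lisa-photos))
                    (set (map datetime->str beach-photos))))
```

Here both holds the single record with :id 2, and its :datetime is now a string rather than a date object.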

Alternatively you could remove logical duplicates from your output, for instance like this:

(->> -your-result-
     (group-by #(update % :datetime str))
     (map (comp first val)))
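As a worked example of the snippet above, here it is wrapped in a function and applied to made-up data. The name dedupe-by-datetime-str and the sample records are hypothetical; the second record carries the date already as a string, to mimic two records that mean the same thing but do not compare equal.

```clojure
;; Hypothetical wrapper around the group-by snippet above.
(defn dedupe-by-datetime-str [photos]
  (->> photos
       ;; Records whose :datetime prints the same end up in the same group.
       (group-by #(update % :datetime str))
       ;; Keep the first original record of each group.
       (map (comp first val))))

(def d (java.util.Date. 0))

;; Made-up sample: the first two records differ only in the type of :datetime.
(def sample
  [{:id 1 :datetime d}
   {:id 1 :datetime (str d)}
   {:id 2 :datetime d}])

(def deduped (dedupe-by-datetime-str sample))
```

Plain distinct would keep all three records, since a date and its string form are not equal, while deduped keeps one record per :id.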