Equivalent of SQL "limit" clause in Datomic

3.1k Views Asked by At

Title sort of says it all, but say I have a simple query as follows:

(q '[:find ?c ?n :where [?c :my-thing/its-attribute ?n]]
   (d/db conn))

against a schema like

[{:db/id (d/tempid :db.part/db)
  :db/ident :my-thing/its-attribute
  :db/valueType :db.type/string
  :db/doc "My thing's attribute"
  :db/cardinality :db.cardinality/one
  :db.install/_attribute :db.part/db}]

If the query matches everything (say, 100M entries), the returned results will be large. If I just want a few of them, what's the best way to do that?

6

There are 6 best solutions below

2
On

Have you tried using get-some ?

From: http://docs.datomic.com/query.html

get-some

The get-some function takes a database, entity, and one or more cardinality-one attributes, returning a tuple of the entity id and value for the first attribute possessed by the entity.

[(get-some $ ?person :person/customer-id :person/email) ?identifier]

-- Edit to respond to comment --

You could also try making a query that selects entities below a certain number.

user> (defn example-take-query [n]
        (into '[:find ?e :where [?e :age ?a]]
              [[`(~'> ~n ~'?e)]]))
#'user/example-take-query
user> (example-take-query 3)
[:find ?e :where [?e :age ?a] [(> 3 ?e)]]
user> (example-take-query 10)
[:find ?e :where [?e :age ?a] [(> 10 ?e)]]
1
On

The equivalent of a SQL "limit" clause can be achieved with the key :limit when using a query-map.

 (d/q {:query '[:find ?c ?n :where [?c :my-thing/its-attribute ?n]] 
       :offset 1
       :limit 10
       :args [(d/db conn)]})

You can read more about this in the datomic client api documentation:

https://docs.datomic.com/client-api/datomic.client.api.html

or about the query-map syntax:

https://docs.datomic.com/on-prem/query.html#timeout

1
On

I had a similar need a while back put together a clojure MySql method of LIMIT for a collection:

(defmacro limit 
"Pagination mimicking the MySql LIMIT" 
([coll start-from quantity]
    `(take ~quantity (drop ~start-from ~coll)))
([coll quantity]
    `(limit ~coll 0 ~quantity)))

Simple example usage on the repl:

user=>  (defmacro limit 
  #_=>     "Pagination mimicking the MySql LIMIT" 
  #_=>     ([coll start-from quantity]
  #_=>     `(take ~quantity (drop ~start-from ~coll)))
  #_=>     ([coll quantity]
  #_=>         `(limit ~coll 0 ~quantity)))
#'user/limit
user=> (def hundred (take 100 (iterate inc 0))) ;; define a collection
#'user/hundred
user=> (limit hundred 25) ;; get the first 25 from the collection
(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24)
user=> (limit hundred 25 25) ;; get the next 25
(25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49)

Not sure it exactly answers your question but it might be of some use.

1
On

Two random names using rand (duplicates tolerated) and sample (only distinct)

(d/q '[:find [(rand 2 ?name) (sample 2 ?name)]
       :where [_ :artist/name ?name]]
     db)

This example came from the day-of-datomic github repo.

0
On

This answer is sort of a compilation of that of @adamneilson and the comments on the original question. I was trying to accomplish the same thing as the OP, but wasn't quite able to find my answer here, so hopefully this'll help someone.

My use case was to pull 100k records with pagination. It was absolutely not feasible to simply use take/drop, as it took a really long time (tens of seconds).

My workaround was to first fetch the entity ids required, do take/drop on that collection, then map over them with entity. Here's my final code:

(defn eid->entity
  [eid]
  (into {} (d/touch (d/entity (d/db (get-conn)) eid))))

(defn find-eids
  [attr value limit offset]
  (let [query '[:find ?eid
                :in $ ?attr ?value
                :where [?eid ?attr ?value]]
        db (d/db (get-conn))
        result (drop offset (sort (d/q query db attr value)))]
    (map first (take limit result))))

(map eid->entity (find-eids :attr-name "value" 10 10)

This feels super wrong to my SQL-trained brain, but I think it's the datomic way. And it's not terribly slow - about 500ms for 100k records, which is good enough for me.

1
On

For the simple cases of "all attribute and value pairs, sorted", using take with seek-datoms is your best option. The following example uses the mbrainz sample database:

(def conn (d/connect "datomic:sql://mbrainz-1968-1973?jdbc:postgresql://localhost:5432/datomic?user=datomic&password=datomic"))

(->> (d/seek-datoms (d/db conn) :avet :artist/sortName "Bea")
     (take 20))

and returns:

(#datom[17592186050196 81 "Beach Boys, The" 13194139539089 true] #datom[17592186047857 81 "Beatles, The" 13194139536749 true] #datom[17592186048553 81 "Beau Brummels, The" 13194139537425 true] #datom[17592186049043 81 "Beaver & Krause" 13194139537919 true] #datom[17592186046205 81 "Beaver, Paul" 13194139535085 true] #datom[17592186046692 81 "Beck, Bogert & Appice" 13194139535579 true] #datom[17592186046886 81 "Beck, Jeff" 13194139535761 true] #datom[17592186047111 81 "Beck, Jeff Group" 13194139535995 true] #datom[17592186046486 81 "Bedford, David" 13194139535371 true] #datom[17592186046992 81 "Bee Gees" 13194139535865 true] #datom[17592186045876 81 "Beethoven, Ludwig van" 13194139534747 true] #datom[17592186048427 81 "Beggars Opera" 13194139537321 true] #datom[17592186047091 81 "Beginning of the End, The" 13194139535969 true] #datom[17592186045945 81 "Belafonte, Harry" 13194139534825 true] #datom[17592186047485 81 "Bell, Archie & Drells, The" 13194139536359 true] #datom[17592186045915 81 "Bell, Carey" 13194139534799 true] #datom[17592186046324 81 "Bell, Vinnie" 13194139535215 true] #datom[17592186047164 81 "Bell, William" 13194139536047 true] #datom[17592186047652 81 "Belle, Marie-Paule" 13194139536541 true] #datom[17592186046496 81 "Bellou, Sotiria" 13194139535371 true])

Of course, you can map a fn that prettifies the output or adds more attributes, etc. such as in this example:

(let [db (d/db conn)]
  (->> (d/seek-datoms db :avet :artist/sortName "Bea")
       (take 20)
       (map #(merge {:artist/name (:v %)
                     :artist/type (-> (d/pull db [{:artist/type [:db/ident]}] (:e %))
                                      :artist/type
                                      :db/ident)}))))

Which returns:

({:artist/name "Beach Boys, The" :artist/type :artist.type/group} {:artist/name "Beatles, The" :artist/type :artist.type/group} {:artist/name "Beau Brummels, The" :artist/type :artist.type/group} {:artist/name "Beaver & Krause" :artist/type :artist.type/group} {:artist/name "Beaver, Paul" :artist/type :artist.type/person} {:artist/name "Beck, Bogert & Appice" :artist/type :artist.type/group} {:artist/name "Beck, Jeff" :artist/type :artist.type/person} {:artist/name "Beck, Jeff Group" :artist/type :artist.type/group} {:artist/name "Bedford, David" :artist/type :artist.type/person} {:artist/name "Bee Gees" :artist/type :artist.type/group} {:artist/name "Beethoven, Ludwig van" :artist/type :artist.type/person} {:artist/name "Beggars Opera" :artist/type :artist.type/group} {:artist/name "Beginning of the End, The" :artist/type :artist.type/group} {:artist/name "Belafonte, Harry" :artist/type :artist.type/person} {:artist/name "Bell, Archie & Drells, The" :artist/type :artist.type/group} {:artist/name "Bell, Carey" :artist/type :artist.type/person} {:artist/name "Bell, Vinnie" :artist/type :artist.type/person} {:artist/name "Bell, William" :artist/type :artist.type/person} {:artist/name "Belle, Marie-Paule" :artist/type :artist.type/person} {:artist/name "Bellou, Sotiria" :artist/type :artist.type/person})

Note: to use seek-datoms or datoms with :avet, the attribute in question must be indexed.