Clojure Domain Modeling: Spec vs. Protocols


This question became really long; I welcome comments suggesting better forums for this question.

I am modelling the swarming behavior of birds. To help me organize my thoughts, I created three protocols representing the main domain concepts I saw: Boid, Flock (collection of boids), and Vector.

As I thought more about it, I realized that I was creating new types to represent Boid and Flock when those could be very cleanly modeled using spec'd maps: A boid is a simple map of position and velocity (both vectors), and a flock is a collection of boid maps. This is clean, concise, and simple, and it eliminates my custom types in favor of all the power of maps and clojure.spec.

(s/def ::position ::v/vector)
(s/def ::velocity ::v/vector)
(s/def ::boid (s/keys :req [::position
                            ::velocity]))
(s/def ::boids (s/coll-of ::boid))
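A minimal sketch of how these specs could be exercised at the REPL. It assumes a placeholder ::vector spec (any?) standing in for ::v/vector, which is not defined at this point, and the [x y] pairs in the sample data are purely illustrative:

```clojure
(require '[clojure.spec.alpha :as s])

;; Assumption: placeholder for ::v/vector, which is not defined yet.
(s/def ::vector any?)

(s/def ::position ::vector)
(s/def ::velocity ::vector)
;; :req lists the namespaced keys that must be present in the map.
(s/def ::boid (s/keys :req [::position ::velocity]))
(s/def ::boids (s/coll-of ::boid))

;; Hypothetical sample data; the [x y] pairs are illustrative only.
(def sample-boid {::position [0 0] ::velocity [1 -2]})

(s/valid? ::boid sample-boid)         ;; => true
(s/valid? ::boids [sample-boid])      ;; => true
(s/valid? ::boid {::position [0 0]})  ;; => false, ::velocity is missing
```

Because a boid is just a map, ordinary map functions such as assoc and update keep working on it.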

But while boids are easily represented as a pair of vectors (and a flock could be represented as a collection of boids), I am stumped how to model vectors. I don't know if I want to represent my vectors using Cartesian or polar coordinates, so I want a representation that allows me to abstract that detail away. I want a basic algebra of vector functions regardless of how I store the vector components under the hood.

(defprotocol Vector
  "A representation of a simple vector. Up/down vector? Who cares!"
  (magnitude [vector] "Returns the magnitude of the vector")

  (angle [vector] "Returns the angle of the vector (in radians? from what
  zero?).")

  (x [vector] "Returns the x component of the vector, assuming 'x' means
  something useful.")

  (y [vector] "Returns the y component of the vector, assuming 'y' means
  something useful.")

  (add [vector other] "Returns a new vector that is the sum of vector and
  other.")

  (scale [vector scaler] "Returns a new vector that is a scaled version of
  vector."))

(s/def ::vector #(satisfies? Vector %))

Besides aesthetics of consistency, the biggest reason this discrepancy bothers me is generative testing: I haven't done it yet but I am excited to learn because it will let me test my higher-level functions once I've spec'd my lower-level primitives. Problem is, I don't know how to create a generator for the ::vector spec without coupling the abstract protocol/spec to a concrete record that defines the functionality. I mean, my generator needs to create a Vector instance, right? Either I proxy something right there in the generator, and so create an unnecessary Vector implementation just for testing, or I couple my nicely abstract protocol/spec to a concrete implementation.

Question: How can I model a vector -- an entity where the set of behaviors is more important than a specific data representation -- with a spec? Or, how can I create a test generator for my protocol-based spec without tying the spec to a concrete implementation?

Update #1: To explain it differently, I have created a layered data model where a particular layer is written only in terms of the layer beneath it. (Nothing novel here.)

Flock (functions dealing with collections of boids)
----------------------------------------------------
Boid (functions dealing with a single boid)
----------------------------------------------------
Vector

Because of this model, removing all the higher abstractions would turn my program into nothing but Vector manipulations. A desirable corollary of that fact: If I can figure out a generator for Vectors, I can test all my higher abstractions for free. So how do I spec Vector and create an appropriate test generator?

The obvious but inadequate answer: Create a spec ::vector that represents a map of a pair of coordinates, say (s/keys :req [::x ::y]). But why (x, y)? Some computations would be easier if I had access to (angle, magnitude). I could create ::vector to represent some pair of coordinates, but then those functions that want the other representation must know and care how a vector is stored internally, and so must know to reach for an external conversion function. (Yes, I could implement this using multi-spec/conform/multimethods, but reaching for those tools smells like an unnecessarily leaky abstraction; I don't want the higher abstractions to know or care that Vectors can be represented multiple ways.)

More fundamentally, a vector isn't (x, y) or (angle, magnitude); those are simply projections of the "real" vector, however you want to define that. (I'm talking domain modeling, not mathematical rigor.) So creating a spec representing a vector as a pair of coordinates is not only a poor abstraction in this case, but it also doesn't represent the domain entity.

A better option would be the protocol I defined above. All higher abstractions can be written in terms of the Vector protocol, giving me a clean abstraction layer. However, I can't create a good Vector test generator without coupling my abstraction to a concrete implementation. Maybe that is a trade-off I must make, but is there a better way to model this?
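A rough sketch of what that layering could look like. The Cartesian record, the trimmed-down protocol, and the step-boid function are all hypothetical names introduced only for illustration; the point is that the boid-level function touches vectors only through the protocol:

```clojure
;; Trimmed-down version of the Vector protocol, for illustration only.
(defprotocol Vector
  (x [v] "Returns the x component.")
  (y [v] "Returns the y component.")
  (add [v other] "Returns the sum of v and other.")
  (scale [v k] "Returns v scaled by the number k."))

;; Hypothetical Cartesian implementation. The fields are named vx/vy so
;; they do not shadow the protocol functions x and y inside the methods.
(defrecord Cartesian [vx vy]
  Vector
  (x [_] vx)
  (y [_] vy)
  (add [_ other] (Cartesian. (+ vx (x other)) (+ vy (y other))))
  (scale [_ k] (Cartesian. (* k vx) (* k vy))))

;; Boid layer: written only against the protocol functions add and scale.
(defn step-boid
  "Advances a boid by dt time units: position + velocity * dt."
  [boid dt]
  (update boid ::position add (scale (::velocity boid) dt)))

(def moved
  (step-boid {::position (->Cartesian 0 0)
              ::velocity (->Cartesian 2 -1)}
             3))

(x (::position moved))  ;; => 6
(y (::position moved))  ;; => -3
```

Swapping in a polar-backed record would not change step-boid, but its add would still have to read the other vector through x and y (or angle and magnitude), which is where the representations start to interact.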


There are 2 best solutions below

Best answer

While there are certainly many valid answers to this question, I'd suggest that you reconsider your goals.

By supporting both coordinate representations in the spec, you are stating that they are both supported at the same time. This will inevitably lead to complexity overhead such as runtime polymorphism. For example, binary operations in your Vector protocol, such as add, need to handle Cartesian/Cartesian, Cartesian/Polar, Polar/Cartesian, and Polar/Polar argument pairs. At this point the implementations are coupled and you don't get the intended benefit of "seamlessly" alternating between representations.

I'd settle for one representation and if necessary use an external conversion layer.
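As a minimal sketch of that suggestion, assume vectors are stored as plain maps with Cartesian :x and :y keys, and polar coordinates are only ever obtained through conversion functions. All names here (cartesian->polar, polar->cartesian, and the map keys) are assumptions made for illustration:

```clojure
;; Assumption: the single stored representation is a map {:x .. :y ..}.

(defn add
  "Returns the component-wise sum of two Cartesian vectors."
  [a b]
  {:x (+ (:x a) (:x b))
   :y (+ (:y a) (:y b))})

(defn scale
  "Returns v with both components multiplied by the number k."
  [v k]
  {:x (* k (:x v))
   :y (* k (:y v))})

;; External conversion layer, used only where polar form is handier.
(defn cartesian->polar
  "Returns {:magnitude r :angle theta}, with theta in radians measured
  counterclockwise from the positive x axis."
  [{:keys [x y]}]
  {:magnitude (Math/hypot (double x) (double y))
   :angle     (Math/atan2 (double y) (double x))})

(defn polar->cartesian
  "Inverse of cartesian->polar, up to floating-point rounding."
  [{:keys [magnitude angle]}]
  {:x (* magnitude (Math/cos (double angle)))
   :y (* magnitude (Math/sin (double angle)))})

(add {:x 1 :y 2} {:x 3 :y 4})                ;; => {:x 4, :y 6}
(:magnitude (cartesian->polar {:x 3 :y 4}))  ;; => 5.0
```

With this layout, a spec for vectors is an ordinary map spec, so a generator comes for free; the cost is that callers who want angle or magnitude must call the conversion explicitly.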

Another answer

From our discussion in the comments, it seems like you would prefer polymorphism using a protocol. I think I understand what you want to do and will try to respond to it.

So suppose you have your vector interface:

(defprotocol AbstractVector

  ;; method declarations go here...

  )

When declaring the AbstractVector protocol, we don't need to know about any specific implementations of that protocol. Along with this protocol, we will also implement a place to collect the specs:

(defonce concrete-spec-registry (atom #{}))

(defn register-concrete-vector-spec [sp]
  (swap! concrete-spec-registry conj sp))

Now we can implement this protocol for various classes:

(extend-type clojure.lang.ISeq
  AbstractVector

  ;; method implementations go here...

  )

(extend-type clojure.lang.IPersistentVector
  AbstractVector

  ;; method implementations go here...

  )

but we also need to provide a spec that can be used to generate samples for these implementations:

(spec/def ::concrete-vector-implementation (spec/cat :x number?
                                                     :y number?))
(register-concrete-vector-spec ::concrete-vector-implementation)

Let's define a spec for our abstract vector, by first writing a function that tests if something is an abstract-vector:

(defn abstract-vector? [x]
  (satisfies? AbstractVector x))

;; (assert (abstract-vector? []))
;; (assert (not (abstract-vector? {})))

Or, perhaps more accurately, implement it like this:

(defn abstract-vector? [x]
  ;; True if x conforms to at least one registered concrete spec.
  (some #(spec/valid? % x)
        (deref concrete-spec-registry)))

And here is the spec, along with a generator:

(spec/def ::vector (spec/with-gen (spec/spec abstract-vector?)
                     #(gen/one-of (mapv spec/gen (deref concrete-spec-registry)))))

In the above code, we dereference the atom holding the concrete specs and then build a generator on top of those specs that will generate using one of them. This way, we don't need to know which concrete vector implementations exist, as long as their sources have been loaded and the register-concrete-vector-spec function has been used to register their specs.

Now we can generate samples:

(gen/generate (spec/gen ::vector))
;; => (-879 0.011494353413581848)
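To show how the pieces might fit together in a generative check, here is a self-contained sketch. It assumes org.clojure/test.check is on the classpath (clojure.spec.gen.alpha loads it lazily), uses a cut-down protocol with hypothetical x, y, and add methods, and registers a hypothetical ::xy-pair spec with bounded integers so that addition cannot overflow:

```clojure
(require '[clojure.spec.alpha :as spec]
         '[clojure.spec.gen.alpha :as gen])

;; Cut-down protocol; these method names are assumptions for the sketch.
(defprotocol AbstractVector
  (x [v])
  (y [v])
  (add [v other]))

(defonce concrete-spec-registry (atom #{}))

(defn register-concrete-vector-spec [sp]
  (swap! concrete-spec-registry conj sp))

;; Hypothetical concrete implementation: [x y] persistent vectors.
(extend-type clojure.lang.IPersistentVector
  AbstractVector
  (x [v] (nth v 0))
  (y [v] (nth v 1))
  (add [v other] [(+ (x v) (x other)) (+ (y v) (y other))]))

;; Spec describing (and generating) samples for that implementation.
(spec/def ::coord (spec/int-in -1000 1000))
(spec/def ::xy-pair (spec/tuple ::coord ::coord))
(register-concrete-vector-spec ::xy-pair)

;; Abstract spec: the generator reads the registry only when invoked,
;; so implementations registered later are picked up as well.
(spec/def ::vector
  (spec/with-gen #(satisfies? AbstractVector %)
    #(gen/one-of (mapv spec/gen @concrete-spec-registry))))

;; A simple property over generated samples: add is commutative.
(def samples (gen/sample (spec/gen ::vector) 50))

(def add-commutative?
  (every? (fn [[a b]] (= (add a b) (add b a)))
          (partition 2 samples)))
```

Higher-level functions can be checked the same way, since they only ever see values that satisfy ::vector. Note that with several registered implementations, a generated pair may mix them, so add would need to cope with mixed arguments.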