I'm confused about how to choose between data family
and type family
. The wiki page on TypeFamilies goes into a lot of detail. Occasionally it informally refers to Haskell's data family
as a "type family" in prose, but of course there is also type family
in Haskell.
There is a simple example that shows where two versions of code are shown, differing only on whether a data family
or a type family
is being declared:
-- BAD: f is too ambiguous, due to non-injectivity
-- type family F a
-- OK
data family F a
f :: F a -> F a
f = undefined
g :: F Int -> F Int
g x = f x
type
and data
here have the same meaning, but the type family
version fails to type-check, while the data family
version is fine, because data family
"creates new types and are therefore injective" (says the wiki page).
My takeaway from all of this is "try data family
for simple cases, and, if it isn't powerful enough, then try type family
". Which is fine, but I'd like to understand it better. Is there a Venn diagram or a decision tree I can follow to distinguish when to use which?
I don't think any decision tree or Venn diagram will exist because the applications for type and data families are pretty wide.
Generally you've already highlighted the key design differences and I would agree with your takeaway to first see if you can get away with
data family
.For me the key point is that each instance of a
data family
creates a new type, which does substantially limit the power because you can't do what is often the most natural thing and make an existing type be the instance.For example the
GMapKey
example on the Haskell wiki page on "indexed types" is a reasonably natural fit for data families:The key type of the map
k
is the argument to the data family and the actual map type is the result of the data family (GMap k
) . As a user of aGMapKey
instance you're probably quite happy for theGMap k
type to be abstract to you and just manipulate it via the generic map operations in the type class.In contrast the
Collects
example on the same wiki page is sort of the opposite:The argument type is the collection and the result type is the element of the collection. In general a user is going to want to operate on those elements directly using the normal operations on that type. For example the collection might be
IntSet
and the element might beInt
. Having theInt
be wrapped up in some other type would be quite inconvenient.Note - these two examples are with type classes and therefore don't need the
family
keyword as declaring a type inside a type class implies it must be a family. Exactly the same considerations apply as for standalone families though, it's just a question of how the abstraction is organised.