(This is a follow-up to this answer, trying to get the q more precise.)
Use Case Constructors to build/access a Set
datatype. Being a set, the invariant is 'no duplicates'. To implement that I need an Eq
constraint on the element type. (More realistically, the set might be implemented as a BST or hash-index, which'll need a more restrictive constraint; using Eq
here to keep it simple.)
- I want to disallow building even an empty set with an unacceptable type.
- " it's now considered bad practice to require a constraint for an operation (data type construction or destruction) that does not need the constraint. Instead, the constraints should be moved closer to the "usage site".", to quote that answer.
- OK so there's no need to get the constraint 'built into' the data structure. And operations that access/deconstruct (like
show
ing or counting elements) won't necessarily needEq
.
Then consider two (or rather five) possible designs:
(I'm aware some of the constraints could be achieved via deriving
, esp Foldable
to get elem
. But I'll hand-code here, so I can see what minimal constraints GHC wants.)
Option 1: No constraints on datatype
data NoCSet a where -- no constraints on the datatype
NilSet_ :: NoCSet a
ConsSet_ :: a -> NoCSet a -> NoCSet a
Option 1a. use PatternSynonym as 'smart constructor'
pattern NilSet :: (Eq a) => () => NoCSet a
pattern NilSet = NilSet_
pattern ConsSet x xs <- ConsSet_ x xs where
ConsSet x xs | not (elemS x xs) = ConsSet_ x xs
elemS x NilSet_ = False
elemS x (ConsSet_ y ys) | x == y = True -- infers (Eq a) for elemS
| otherwise = elemS x ys
GHC infers the constraint
elemS :: Eq t => t -> NoCSet t -> Bool
. But it doesn't infer a constraint forConsSet
that uses it. Rather, it rejects that definition:* No instance for (Eq a) arising from a use of `elemS' Possible fix: add (Eq a) to the "required" context of the signature for pattern synonym `ConsSet'
Ok I'll do that, with an explicitly empty 'Provided' constraint:
pattern ConsSet :: (Eq a) => () => ConsType a -- Req => Prov'd => type; Prov'd is empty, so omittable
Consequently inferred type
(\(ConsSet x xs) -> x) :: Eq a => NoCSet a -> a
, so the constraint 'escapes' from the destructor (also fromelemS
), whether or not I need it at the "usage site".
Option 1b. Pattern synonym wrapping a GADT constructor as 'smart constructor'
data CSet a where CSet :: Eq a => NoCSet a -> CSet a -- from comments to the earlier q
pattern NilSetC = CSet NilSet_ -- inferred Eq a provided
pattern ConsSetC x xs <- CSet (ConsSet_ x xs) where -- also inferred Eq a provided
ConsSetC x xs | not (elemS x xs) = CSet (ConsSet_ x xs)
GHC doesn't complain about the lack of signature, does infer
pattern ConsSetC :: () => Eq a => a -> NoCSet a -> CSet a
a Provided constraint, but empty Required.Inferred
(\(ConsSetC x xs) -> x) :: CSet p -> p
, so the constraint doesn't escape from a "usage site".But there's a bug: to Cons an element, I need to unwrap the
NoCSet
inside theCSet
in the tail then re-wrap aCSet
. And trying to do that with theConsSetC
pattern alone is ill typed. Instead:insertCSet x (CSet xs) = ConsSetC x xs -- ConsSetC on rhs, to check for duplicates
As 'smart constructors' go, that's dumb. What am I doing wrong?
Inferred
insertCSet :: a -> CSet a -> CSet a
, so again the constraint doesn't escape.
Option 1c. Pattern synonym wrapping a GADT constructor as 'smarter constructor'
Same setup as option 1b, except this monster as ViewPattern for the Cons pattern
pattern ConsSetC2 x xs <- ((\(CSet (ConsSet_ x' xs')) -> (x', CSet xs')) -> (x, xs)) where ConsSetC2 x (CSet xs) | not (elemS x xs) = CSet (ConsSet_ x xs)
GHC doesn't complain about the lack of signature, does infer
pattern ConsSetC2 :: a -> CSet a -> CSet a
with no constraint at all. I'm nervous. But it does correctly reject attempts to build a set with duplicates.Inferred
(\(ConsSetC2 x xs) -> x) :: CSet a -> a
, so the constraint that isn't there doesn't escape from a "usage site".Edit: ah, I can get a somewhat less monstrous
ViewPattern
expression to workpattern ConsSetC3 x xs <- (CSet (ConsSet_ x (CSet -> xs))) where ConsSetC3 x (CSet xs) | not (elemS x xs) = CSet (ConsSet_ x xs)
Curiously inferred
pattern ConsSetC3 :: () => Eq a => a -> CSet a -> CSet a
-- so the Provided constraint is visible, unlike withConsSetC2
, even though they're morally equivalent. It does reject attempts to build a set with duplicates.Inferred
(\(ConsSetC3 x xs) -> x) :: CSet p -> p
, so that constraint that is there doesn't excape from "usage sites".
Option 2: GADT constraints on datatype
data GADTSet a where
GADTNilSet :: Eq a => GADTSet a
GADTConsSet :: Eq a => a -> GADTSet a -> GADTSet a
elemG x GADTNilSet = False
elemG x (GADTConsSet y ys) | x == y = True -- no (Eq a) 'escapes'
| otherwise = elemG x ys
GHC infers no visible constraint elemG :: a -> GADTSet a -> Bool
; (\(GADTConsSet x xs) -> x) :: GADTSet p -> p
.
Option 2a. use PatternSynonym as 'smart constructor' for the GADT
pattern ConsSetG x xs <- GADTConsSet x xs where
ConsSetG x xs | not (elemG x xs) = GADTConsSet x xs -- does infer Provided (Eq a) for ConsSetG
- GHC doesn't complain about the lack of signature, does infer
pattern ConsSetG :: () => Eq a => a -> GADTSet a -> GADTSet a
a Provided constraint, but empty Required. - Inferred
(\(ConsSetG x xs) -> x) :: GADTSet p -> p
, so the constraint doesn't escape from a "usage site".
Option 2b. define an insert function
insertGADTSet x xs | not (elemG x xs) = GADTConsSet x xs -- (Eq a) inferred
- GHC infers
insertGADTSet :: Eq a => a -> GADTSet a -> GADTSet a
; so theEq
has escaped, even though it doesn't escape fromelemG
.
Questions
- With
insertGADTSet
, why does the constraint escape? It's only needed for theelemG
check, butelemG
's type doesn't expose the constraint. - With constructors
GADTConsSet, GADTNilSet
, there's a constraint wrapped 'all the way down' the data structure. Does that mean the data structure has a bigger memory footprint than withConsSet_, NilSet_
? - With constructors
GADTConsSet, GADTNilSet
, it's the same typea
'all the way down'. Is the sameEq a
dictionary repeated at each node? Or shared? - By comparison, pattern
ConsSetC
/constructorCSet
/Option 1b wraps only a single dictionary(?), so it'll have a smaller memory footprint than aGADTSet
structure(?) insertCSet
has a performance hit of unwrapping and wrappingCSet
s?ConsSetC2
in the build direction seems to work; there's a performance hit in unwrapping and wrappingCSet
s? But worse there's a unwrapping/wrapping performance hit in accessing/walking the nodes?
(I'm feeling there's no slam-dunk winner amongst these options for my use case.)
I believe that option 1b is your best bet. Of course, I may be biased, as I'm the one which suggested it on your other question.
To address the issue you've pointed out with your pattern synonym, let us imagine we don't have pattern synonyms. How might we deconstruct a Cons set?
One way is to write a method with this signature:
Which says that given:
CSet a
,Eq a
, ana
, and anotherCSet a
, and returns some arbitrary type,Eq a
and returns the same arbitrary type,we can produce a value of that arbitrary type. Since the type is arbitrary, we know that the values comes from calling the function, or from the given value of that type. The contract of this function is that it invokes the first function and return its result if and only if the set is
ConsSet_
, otherwise, if it isNilSet_
, it invokes the second function.If you squint a little, you can see this function is in a sense "equivalent" to pattern matching on
CSet
. You don't need pattern matching at all anymore; you can do everything with this function that you can do with pattern matching.It's implementation is quite trivial:
Consider now a different form of this function, which accomplishes all the same things, but is maybe a bit easier to use.
Note that the type
forall r . (a -> r) -> (b -> r) -> r
is isomorphic toEither a b
. Also note thatx0 -> y0 -> r
is isomorphic (or close enough) to(x0, y0) -> r
. And finally note thatC a => r
is isomorphic toDict (C a) -> r
where:If we exploit these isomorphisms, we can write
openSetC
differently as:Now the fun part: using
ViewPatterns
, we can use this function directly to easily write the pattern with the signature you want. It's easy only because we've set up the type ofopenSetC'
to match with the type of the pattern you want:As for the rest of your questions, I would strongly suggest splitting them up into different posts so they could all have focused answers.