To start off this whole thing I'm working with a pattern synonym defined as follows:
{-# Language PatternSynonyms #-}
pattern x := y <- x @ y
This allows me to run multiple pattern matches across a parameter at once. A regular as binding (@
) does not allow the left hand side to be a pattern but this does.
With this I make the following toy function
{-# Language ViewPatterns #-}
f ((_:_) := (head -> y)) =
[ y ]
f [] =
[]
It's not the best way to implement this I am sure, but it's a minimum working example for the behavior in question.
This has a function that takes a single parameter.
It matches the parameter against two patterns using the defined synonym.
The first pattern matches any non-empty list and makes no bindings.
The second runs the head function on the list and binds y
to the result.
So the question is can head
cause an error or will the other pattern prevent it?
>>> f []
[]
The other pattern prevents it! Alright so if I do them in the other order then it should break right?
f' ((head -> y) := (_:_)) =
[ y ]
f' [] =
[]
>>> f' []
[]
Nope! It still works. So now my question is: Is the second pattern doing anything at all? Maybe view patterns has some sort of smart behavior where it calls the function and fails the pattern if an error occurs ...
f'' (head -> y) =
[ y ]
f'' [] =
[]
>>> f'' []
[*** Exception: Prelude.head: empty list
No ... it doesn't. This fails. Somehow (_:_)
blocks the error no matter what side it's on. Maybe ghc prefers to match destructuring patterns before view patterns? To test this I can replace the pattern (_:_)
with (reverse -> _:_)
. This way it has to run a function before it can get to the destructuring.
But having tested it, the new pattern doesn't change the behavior. This hypothesis can be ruled out.
So maybe it's laziness? x
can't be evaluated if the list is empty so it sits in thunk and the error never occurs. It seems to somewhat be the case. If I replace (head -> x)
with (undefined -> x)
we have no change in behavior.
However if I replace it with (undefined -> "yo")
:
f ((undefined -> "yo") := (x:_)) = [x]
f [] = []
>>> f []
*** Exception: Prelude.undefined
The undefined
does get evaluated. Which seems to indicate that the pattern is forcing evaluation to compare with "yo"
. And if I now switch the order:
f ((x:_) := (undefined -> "yo")) = [x]
f [] = []
>>> f []
[]
It isn't evaluated. It seems that now we are short circuiting the pattern match.
So the laziness hypothesis seems to make sense? It's still very opaque to me and I would love to have someone with more experience as to the internals of ghc confirm this hypothesis.
So my question is now what is going on? Is it laziness? How does it work?
A big thanks to discord user lexi. They helped a lot in the diagnosis thus far.
You are indeed observing the effect of laziness.
Let's start with a much more basic example:
Laziness makes
f undefined
return42
. This is because the variable patternx
does not require the argument to be evaluated, soundefined
is never demanded.By comparison, if we used
then
f undefined
does crash, since the pattern()
requires the argument to be evaluated until it exposes the()
constructor (which, in this case, means fully evaluated).Similarly,
will cause
f undefined
to return42
, whilewill cause
f undefined
to crash, after trying to evaluateundefined
so to expose the first list constructor (either:
or[]
).We also have that
makes
f undefined
crash: after all the pattern"yo"
means('y':'o':[])
so it will forceundefined
, trying to match it against the first:
. More in detail, all the following calls will crash:Here
anything
can beundefined
or any other string/char as needed.By comparison, all of the following calls will return
0
since the first pattern in the definition fails its match (without crashing!):Again,
anything
can beundefined
or any other string/char as needed.This is because the pattern matching of
"yo"
is done roughly like this:x
until WHNF (expose its first constructor)[]
, faily:ys
, evaluatey
until WHNFy
is another char than'y'
, faily
is'y'
, evaluateys
until WHNFz:zs
, evaluatez
until WHNFz
is another char than'o'
, failz
is'o'
, evaluatezs
until WHNF[]
, succeed!!h:hs
, failNote that in each "evaluate .. until WHNF" point we could crash (or get stuck in an infinite computation) beacuse of bottoms.
Essentially, pattern matching proceed left-to-right and stops, evaluating the input only as much as needed, and stopping as soon as the result (fail/success) is known. This will not necessarily force the full evaluation of the input. On failure, we do not even necessarily evaluate the input as deep as the pattern, if we discover an early failure point. This is indeed what happens when you write:
Now, view patterns follow the same principle. A pattern
undefined -> x
will not evaluateundefined
on the input sincex
does not need to know the result ofundefined
to succeed. Insteadundefined -> x:xs
,undefined -> []
, andundefined -> "yo"
do need to know the result, hence they will evaluate it as needed.About your examples:
Here,
head -> y
always succeeds. On its own, it could bindy
to a bottom value, but that's prevented by the leftmost_:_
pattern.Here,
head -> y
always succeeds. On its own, it will bindy
to a bottom value, and this actually happens if the input is[]
, but that will not force the input, so not crash is caused so far. After that, we try the leftmost_:_
pattern, which fails. Result: failure, but no crash.Again,
head -> y
always succeeds, and bindsy
to bottom (if the input is[]
). The pattern matching will succeed, and the result off''
is[ head [] ]
. We can take, e.g., the length of this list, but we can not print its contents without crashing.undefined -> "yo"
crashes, as explained above. Thex:_
pattern is never tried.Here we first match
x:_
and only when that succeeds we tryundefined -> "yo"
. Since we callf
with[]
, the view pattern is not tried, so it does not crash. Callingf "a"
would instead matchx:_
, try the view pattern and crash.