"string" Implementation in Text.Parser.Char

247 Views Asked by At

First, just some quick context. I'm going through the Haskell Programming From First Principles book, and ran into the following exercise.

Try writing a Parser that does what string does, but using char.

I couldn't figure it out, so I checked out the source for the implementation. I'm currently trying to wrap my head around it. Here it is:

class Parsing m => CharParsing m where
    -- etc.
    string :: CharParsing m => String -> m String
    string s = s <$ try (traverse_ char s) <?> show s

My questions are as follows, from most to least specific.

  1. Why is show necessary?

  2. Why is s <$ necessary? Doesn't traverse char s <?> s work the same? In other words, why do we throw away the results of the traversal?

  3. What is going on with the traversal? I get what a list traversal does, so I guess I'm confused about the Applicative/Monad instances for Parser. On a high level, I get that the traversal applies char, which has type CharParsing m => Char -> m Char, to every character in string s, and then collects all the results into something of type Parser [Char]. So the types make sense, but I have no idea what's going on in the background.

Thanks in advance!

1

There are 1 best solutions below

0
On

1) Why is show necessary?

Because showing a string (or a Text, etc.) escapes special characters, which makes sense for error messages:

GHCi> import Text.Parsec -- Simulating your scenario with Parsec.
GHCi> runParser ((\s -> s <$ try (traverse_ char s) <?> s) "foo\nbar") () "" "foo"
Left (line 1, column 4):
unexpected end of input
expecting foo
bar
GHCi> runParser ((\s -> s <$ try (traverse_ char s) <?> show s) "foo\nbar") () "" "foo"
Left (line 1, column 4):
unexpected end of input
expecting "foo\nbar"

2) Why is s <$ necessary? Doesn't traverse char s <?> s work the same? In other words, why do we throw away the results of the traversal?

The result of the parse is unnecessary because we know in advance that it would be s (if the parse were successful). traverse would needlessly reconstruct s from the results of parsing each individual character. In general, if the results are not needed it is a good idea to use traverse_ (which just combines the effects, discarding the results without trying to rebuild the data structure) rather than traverse, so that is likely why the function is written the way it is.

3) What is going on with the traversal?

traverse_ char s (traverse_, and not traverse, as explained above) is a parser. It tries to parse, in order, each character in s, while discarding the results, and it is built by sequencing parsers for each character in s. It may be helpful to remind that traverse_ is just a fold which uses (*>):

-- Slightly paraphrasing the definition in Data.Foldable:
traverse_ :: (Foldable t, Applicative f) => (a -> f b) -> t a -> f ()
traverse_ f = foldr (\x u -> f x *> u) (pure ())