Implementing a lexer using the Free Monad

179 Views Asked by At

I am thinking about a use case of the free monad which would be a simple lexing DSL. So far I came up with some primitive operations:

data LexF r where
  POP  :: (Char -> r) -> LexF r
  PEEK :: (Char -> r) -> LexF r
  FAIL :: LexF r
  ...

instance Functor LexF where
  ...

type Lex = Free LexF

The problem I encounter is that I would like to have a CHOICE primitive that would describe an operation of trying to execute one parser and in case of failure fallback to another. Something like CHOICE :: LexF r -> LexF r -> (r -> r) -> LexF r...

...and here the stairs begin. Since r is preset at contravariant position, it is impossible (is it?) to create a valid Functor instance for Op. I came up with some other idea, which was to generalize over the type of alternative lexers, so CHOICE :: LexF a -> LexF a -> (a -> r) -> LexF r – now it works as a Functor, though the problem is with thawing it into Free, as I would normally do it with liftF:

choice :: OpF a -> OpF a -> OpF a
choice c1 c2 = liftF $ CHOICE _ _ id  -- how to fill the holes :: Op a ?

I am really running out of any ideas. This of course generalizes to nearly all other combinators, I just find CHOICE a good minimal case. How to tackle it? I am okay to hear that this example is totally broken and it just won't work with Free as I would like to. But therefore, does it even make sense to write lexers/parsers in this manner?

1

There are 1 best solutions below

0
On BEST ANSWER

As a general rule when working with free monads, you don't want to introduce primitives for "monadic control". For example, a SEQUENCE primitive would be ill-advised, because the free monad itself provides sequencing. Likewise, a CHOICE primitive is ill-advised because this should be provided by a free MonadPlus.

Now, there is no free MonadPlus in modern versions of free because equivalent functionality is provided by a free monad transformer over a list base monad, namely FreeT f []. So, what you probably want is to define:

data LexF r where
  POP  :: (Char -> r) -> LexF r
  PEEK :: (Char -> r) -> LexF r
deriving instance Functor LexF
type Lex = FreeT LexF []

pop :: (Char -> a) -> Lex a
pop f = liftF $ POP f

peek :: (Char -> a) -> Lex a
peek f = liftF $ PEEK f

but no FAIL or CHOICE primitives.

If you were to define fail and choice combinators, they would be defined by means of the list base monad using transformer magic:

fail :: Lex a
fail = empty

choice :: Lex a -> Lex a -> Lex a
choice = (<|>)

though there's no actual reason to define these.

SPOILERS follow... Anyway, you can now write things like:

anyChar :: Lex Char
anyChar = pop id

char :: Char -> Lex Char
char c = do
  c' <- anyChar
  guard $ c == c'
  return c'

a_or_b :: Lex Char
a_or_b = char 'a' <|> char 'b'

With an interpreter for your monad primitives, in this case intrepreting them to the StateT String [] AKA String -> [(a,String)] monad:

type Parser = StateT String []
runLex :: Lex a -> Parser a
runLex = iterTM go
  where go :: LexF (Parser a) -> Parser a
        go (POP f) = StateT pop' >>= f
          where pop' (c:cs) = [(c,cs)]
                pop' _      = []
        go (PEEK f) = StateT peek' >>= f
          where peek' (c:cs) = [(c,c:cs)]
                peek' _      = []
parse :: Lex a -> String -> [(a, String)]
parse = runStateT . runLex

you can then:

main :: IO ()
main = do
  let test = parse a_or_b
  print $ test "abc"
  print $ test "bca"
  print $ test "cde"

The full example:

{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE GADTs #-}
{-# OPTIONS_GHC -Wall #-}

import Control.Monad.State
import Control.Applicative
import Control.Monad.Trans.Free

data LexF r where
  POP  :: (Char -> r) -> LexF r
  PEEK :: (Char -> r) -> LexF r
deriving instance Functor LexF
type Lex = FreeT LexF []

pop :: (Char -> a) -> Lex a
pop f = liftF $ POP f

peek :: (Char -> a) -> Lex a
peek f = liftF $ PEEK f

anyChar :: Lex Char
anyChar = pop id

char :: Char -> Lex Char
char c = do
  c' <- anyChar
  guard $ c == c'
  return c'

a_or_b :: Lex Char
a_or_b = char 'a' <|> char 'b'

type Parser = StateT String []
runLex :: Lex a -> Parser a
runLex = iterTM go
  where go :: LexF (Parser a) -> Parser a
        go (POP f) = StateT pop' >>= f
          where pop' (c:cs) = [(c,cs)]
                pop' _      = []
        go (PEEK f) = StateT peek' >>= f
          where peek' (c:cs) = [(c,c:cs)]
                peek' _      = []
parse :: Lex a -> String -> [(a, String)]
parse = runStateT . runLex

main :: IO ()
main = do
  let test = parse a_or_b
  print $ test "abc"
  print $ test "bca"
  print $ test "cde"