How can I interpolate values into a string based on a key token using Parsec (Haskell)?


I'm new to the world of parsing, and have a fairly simple-seeming problem:

I have a long string made up of Chunks of normal text and Keys that are encoded like <<key-label>>.

import Text.Parsec
import Text.Parsec.String (Parser)

data Merge a = Chunk a
             | Key a
  deriving (Show)

key :: Parser (Merge String)
key = Key <$> between (string "<<") (string ">>") (many1 letter)

chunk :: Parser (Merge String)
chunk = Chunk <$> many1 anyChar

prose = many1 $ key <|> chunk

ex = parseTest prose "hi <<x>> ! Do you like <<y>>?"

-- Returns: 
-- [Chunk "hi <<x>> ! Do you like <<y>>?"]

-- I'd like:
-- [Chunk "hi ", Key "x", Chunk " !", ...]

I'd like to replace those keys with values, but I can solve that myself once I can parse a string into my tokens, i.e. String -> [Merge String].

I've dived into the boundless depths of lexing and parsing, and while I hope to learn all of it eventually, is there any guidance on solving this problem now?

The code above is the simplest version of my attempts. I have also tried separate passes over the data, including separate lexing and parsing steps. I'd like to use Parsec rather than a more specialized interpolation library.

2 answers

Best answer

You can use notFollowedBy to say that a chunk may include a character only if a key does not start at that position. notFollowedBy doesn't consume input, so prose will still go on to parse the key as its own item.

chunk = Chunk <$> many1 (notFollowedBy key >> anyChar)

This will allow even things like aaa<<bbbbbb to be parsed as a chunk: the key parser reads the letters after <<, does not find a closing >>, and fails, so notFollowedBy key succeeds and the << becomes part of the chunk.

If you would rather have << always be the start of a key and fail if it isn't closed, disallow << from the chunk:

chunk = Chunk <$> many1 (notFollowedBy (string "<<") >> anyChar)
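To see how the pieces fit together, here is a minimal sketch of a complete module built around the second chunk definition. It rests on a few assumptions that are not in the answer above: key is wrapped in try so that a failed key attempt (for example a lone < at the start of the input) backtracks and lets chunk run, prose ends with eof so an unclosed << is reported as an error, and interpolate and mergeTemplate are hypothetical helper names that look keys up in a plain association list.

```haskell
import Data.Maybe (fromMaybe)
import Text.Parsec
import Text.Parsec.String (Parser)

data Merge a = Chunk a
             | Key a
  deriving (Show, Eq)

-- A key such as <<x>>; yields the label between the delimiters.
key :: Parser (Merge String)
key = Key <$> between (string "<<") (string ">>") (many1 letter)

-- A chunk is one or more characters, stopping before any "<<".
chunk :: Parser (Merge String)
chunk = Chunk <$> many1 (notFollowedBy (string "<<") >> anyChar)

-- Assumption: `try` lets a failed key attempt backtrack so that chunk
-- still gets a turn; `eof` turns an unclosed "<<" into a parse error.
prose :: Parser [Merge String]
prose = many (try key <|> chunk) <* eof

-- Hypothetical helper: replace each key with its value from a lookup
-- table. Unknown keys are written back out unchanged.
interpolate :: [(String, String)] -> [Merge String] -> String
interpolate env = concatMap render
  where
    render (Chunk s) = s
    render (Key k)   = fromMaybe ("<<" ++ k ++ ">>") (lookup k env)

-- Hypothetical helper: parse and interpolate in one step.
mergeTemplate :: [(String, String)] -> String -> Either ParseError String
mergeTemplate env = fmap (interpolate env) . parse prose ""
```

In GHCi, parseTest prose "hi <<x>> ! Do you like <<y>>?" should print [Chunk "hi ",Key "x",Chunk " ! Do you like ",Key "y",Chunk "?"], and mergeTemplate [("x","Alice"),("y","tea")] applied to the same string should give Right "hi Alice ! Do you like tea?".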

Another answer

replace-megaparsec is a library for doing search-and-replace with parsers. The search-and-replace function is streamEdit.

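import Data.Char (toUpper)
import Data.Void (Void)
import Replace.Megaparsec
import Text.Megaparsec
import Text.Megaparsec.Char

-- Parser for a key such as <<x>>; yields the label between the delimiters.
key :: Parsec Void String String
key = between (string "<<") (string ">>") (many letterChar)

-- Replacement text for each matched key.
editor :: String -> String
editor k = "Key " ++ fmap toUpper k

-- In GHCi:
-- > streamEdit key editor "hi <<x>> ! Do you like <<y>>?"
-- "hi Key X ! Do you like Key Y?"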

You can also get the intermediate separated strings with the sepCap parser combinator, which returns a structure equivalent to the [Merge] that you were trying to build.
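The equivalence can be made concrete with a small sketch. It assumes that running sepCap key over the example string yields a list of Either values, with Left holding the unmatched text and Right holding the parsed key label. The example list below is written out by hand to show that shape rather than produced by calling the library, and toMerge is a hypothetical helper name.

```haskell
data Merge a = Chunk a
             | Key a
  deriving (Show, Eq)

-- Hypothetical helper: Left is unmatched text, Right is a parsed key label.
toMerge :: Either String String -> Merge String
toMerge = either Chunk Key

-- Hand-written stand-in for the assumed result of `sepCap key` on
-- "hi <<x>> ! Do you like <<y>>?".
example :: [Either String String]
example = [Left "hi ", Right "x", Left " ! Do you like ", Right "y", Left "?"]

-- The same data expressed with the question's Merge type.
merged :: [Merge String]
merged = map toMerge example
```

Since Either already distinguishes the two cases, the conversion is a one-liner, and a later interpolation step can work on either representation.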