I am trying to parse MediaWiki text using Parsec. Some of the constructs in MediaWiki markup can only occur at the start of a line (such as the header markup ==header level 2==). In a regexp I would use an anchor (such as ^) to find the start of a line.
One attempt in GHCi is
Prelude Text.Parsec> parse (char '\n' *> string "==" *> many1 letter <* string "==") "" "\n==hej=="
Right "hej"
but this is not too good since it will fail on the first line of a file. I feel like this should be a solved problem...
What is the most idiomatic "Start of line" parsing in Parsec?
You can use getPosition and sourceColumn in order to find out the column number that the parser is currently looking at. The column number will be 1 if the current position is at the start of a line (such as at the start of input or after a \n or \r character).
There isn't a built-in combinator for this, but you can easily make it:
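A minimal sketch of such a combinator, assuming the name startOfLine (it is not part of Parsec) and using guard from Control.Monad to fail when the column is not 1:

```haskell
import Control.Monad (guard)
import Text.Parsec

-- Hypothetical helper: succeeds without consuming input only when
-- the current source position is in column 1, and fails otherwise.
startOfLine :: Monad m => ParsecT s u m ()
startOfLine = do
  pos <- getPosition
  guard (sourceColumn pos == 1) <?> "start of line"
```

Because it only inspects the position, it consumes no input whether it succeeds or fails, so it can safely be placed in front of any other parser.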
Now you can write your header parser as:
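One way this could look, as a self-contained sketch: the startOfLine helper is repeated here (specialised to String input) so the block compiles on its own, and the name header is an assumption chosen for illustration.

```haskell
import Control.Monad (guard)
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical helper: succeeds only in column 1, consumes nothing.
startOfLine :: Parser ()
startOfLine = do
  pos <- getPosition
  guard (sourceColumn pos == 1)

-- Same header parser as in the question, but anchored with
-- startOfLine instead of a leading char '\n'.
header :: Parser String
header = startOfLine *> string "==" *> many1 letter <* string "=="
```

In GHCi, with the definitions above loaded, this now also works on the first line of the input:

```haskell
parse header "" "==hej=="
-- Right "hej"
```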