I am struggling with Parsec to parse a small subset of the Google project wiki syntax, and convert it into HTML. My syntax is limited to text sequences and item lists. Here is an example of what I want to recognize:
Text that can contain any kind of characters,
except the string "\n *"
 * list item 1
 * list item 2
End of list 
My code so far is:
import Text.Blaze.Html5 (Html, toHtml)
import qualified Text.Blaze.Html5 as H
import Text.ParserCombinators.Parsec hiding (spaces)
parseList :: Parser Html
parseList = do
    items <- many1 parseItem
    return $ H.ul $ sequence_ items
parseItem :: Parser Html
parseItem = do
    string "\n *"
    item <- manyTill anyChar $
        (try $ lookAhead $ string "\n *") <|>
        (try $ string "\n\n")
    return $ H.li $ toHtml item
parseText :: Parser Html
parseText = do
    text <- manyTill anyChar $
        (try $ lookAhead $ string "\n *") <|>
        (eof >> (string ""))
    return $ toHtml text
parseAll :: Parser Html
parseAll = do
    l <- many (parseList <|> parseText)
    return $ H.html $ sequence_ l
When applying parseAll to any sequence of characters, I get the following error message: "*** Exception: Text.ParserCombinators.Parsec.Prim.many: combinator 'many' is applied to a parser that accepts an empty string."
I understand that it is because my parser parseText can read empty strings, but I can't see any other way. How can I recognize text delimited by a string? ("\n *" here).
I am also open to any remarks or suggestions concerning the way I am using Parsec. I can't help but see that my code is a bit ugly. Can I do all this in a simpler way? For example, there is code duplication (which is kind of painful) because the string "\n *" is used to recognize the end of a text sequence, the beginning of a list item, AND the end of a list item...
 
                        
I removed the HTML stuff because, for whatever reason, I couldn't get blaze-html to install on my machine. In principle it should be essentially the same thing. My version parses strings delimited by the string "\n *" and ended by the string "\n\n". I don't know if having a leading \n is what you want, but that is easy to fix.
Also, I don't know if the empty string is valid. You should change sepBy1 to sepBy if it is.
As for the error you were getting: you have string "" inside of many. Not only does this give the error you got, it doesn't make any sense! The parser string "" will always succeed without consuming anything, since the empty string is a prefix of all strings and "" ++ x == x. If you try to do this multiple times then you will never finish parsing.
Besides all that, your parseList should parse your language. It essentially does the same thing that sepBy does. I just think sepBy is cleaner :)
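For reference, here is a minimal sketch of how the sepBy1 idea and the "never accept the empty string inside many" rule could fit together. It is not the code from the answer above, just one possible reading of it. The names Block, Para, List, bullet, listEnd and textUntil are made up for this illustration, and it returns plain strings instead of Html. It assumes that a list always ends with a blank line ("\n\n"), as described above.

```haskell
import Text.ParserCombinators.Parsec

-- Hypothetical result type for this sketch (the question builds Html instead).
data Block = Para String | List [String]
  deriving (Eq, Show)

-- The marker that introduces a list item. It is defined once, so the
-- string "\n *" is no longer repeated in three places.
bullet :: Parser String
bullet = try (string "\n *")

-- Assumption: a blank line closes a list.
listEnd :: Parser String
listEnd = try (string "\n\n")

-- One or more characters, stopping before anything `stop` would match.
-- many1 guarantees that a successful run consumes input, so this parser
-- can safely be repeated with `many`.
textUntil :: Parser String -> Parser String
textUntil stop = many1 (notFollowedBy stop >> anyChar)

-- A leading bullet, items separated by further bullets, then a blank line.
parseList :: Parser Block
parseList = do
  _ <- bullet
  items <- textUntil (bullet <|> listEnd) `sepBy1` bullet
  _ <- listEnd
  return (List items)

-- Plain text runs up to the next bullet (or the end of input).
parseText :: Parser Block
parseText = fmap Para (textUntil bullet)

-- Neither alternative can succeed without consuming input, so `many`
-- terminates and does not raise the "accepts an empty string" exception.
parseAll :: Parser [Block]
parseAll = do
  blocks <- many (parseList <|> parseText)
  eof
  return blocks
```

With these definitions, parse parseAll "" "intro\n * a\n * b\n\nafter" gives Right [Para "intro",List [" a"," b"],Para "after"]. Note that the space after the asterisk stays in each item, and that a list which is not closed by a blank line is a parse error in this sketch. Both are easy to adjust if your syntax differs.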