I have written the following parsing code using attoparsec:
    data Test = Test {
        a :: Int,
        b :: Int
    } deriving (Show)

    testParser :: Parser Test
    testParser = do
        a <- decimal
        tab
        b <- decimal
        return $ Test a b

    tParser :: Parser [Test]
    tParser = many' $ testParser <* endOfLine
This works fine for small files. I execute it like this:
    main :: IO ()
    main = do
        text <- TL.readFile "./testFile"
        let (Right a) = parseOnly (manyTill anyChar endOfLine *> tParser) text
        print a
But when the file is larger than 70 MB, it consumes tons of memory. As a solution, I thought I would use attoparsec-conduit. After going through its API, I'm not sure how to make the two work together. My parser has the type Parser Test, but sinkParser actually accepts a parser of type Parser a b. How can I execute this parser in constant memory? (A pipes-based solution is also acceptable, but I'm not familiar with the pipes API.)
The first type parameter to Parser is just the data type of the input (either Text or ByteString). You can provide your testParser function as the argument to sinkParser and it will work fine. Here's a short example:
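The sketch below is one way to wire this up, assuming the conduit, conduit-extra and attoparsec packages; it reuses testParser from the question. sinkParser runs a parser once against the stream, while conduitParser (from the same Data.Conduit.Attoparsec module) applies it repeatedly and yields each result as it is parsed, so the whole file never has to be in memory at once:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Conduit (runConduitRes, sourceFile, decodeUtf8C, mapM_C, liftIO, (.|))
import Data.Attoparsec.Text (Parser, decimal, tab, endOfLine)
import Data.Conduit.Attoparsec (conduitParser)

data Test = Test
    { a :: Int
    , b :: Int
    } deriving (Show)

testParser :: Parser Test
testParser = do
    a <- decimal
    tab
    b <- decimal
    return $ Test a b

main :: IO ()
main = runConduitRes
     $ sourceFile "./testFile"                   -- stream the file in chunks
    .| decodeUtf8C                               -- ByteString -> Text
    .| conduitParser (testParser <* endOfLine)   -- parse one record at a time
    .| mapM_C (liftIO . print)                   -- consume each result as it arrives
```

Note that conduitParser yields a (PositionRange, Test) pair for each record, which is handy for error reporting; if you only need a single value from the head of the stream (e.g. to skip a header line), sinkParser is the right tool.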