I need to parse a file containing XML comments. Specifically, it's a C# file using the MS /// convention.
From this I'd need to pull out foobar, or /// foobar would be acceptable, too. (Note: this still doesn't work if you put the XML all on one line...)
testStr = """
///<summary>
/// foobar
///</summary>
"""
Here is what I have:
import pyparsing as pp
_eol = pp.Literal("\n").suppress()
_cPoundOpenXmlComment = pp.Suppress('///<summary>') + pp.SkipTo(_eol)
_cPoundCloseXmlComment = pp.Suppress('///</summary>') + pp.SkipTo(_eol)
_xmlCommentTxt = ~_cPoundCloseXmlComment + pp.SkipTo(_eol)
xmlComment = _cPoundOpenXmlComment + pp.OneOrMore(_xmlCommentTxt) + _cPoundCloseXmlComment
match = xmlComment.scanString(testStr)
and to output:
for item, start, stop in match:
    for entry in item:
        print(entry)
But I haven't had much success getting the grammar to work across multiple lines.
(Note: I tested the above sample in Python 3.2; it runs, but (per my issue) does not print any values.)
Thanks!
How about using nestedExpr:
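A minimal sketch of that idea, assuming the opener and closer are the literal strings ///<summary> and ///</summary>. The flatten and summary_words helpers are hypothetical names added here for illustration; they are not part of pyparsing. With multi-character delimiters, nestedExpr splits the content on whitespace, so the /// prefix should come back as its own token and is filtered out below.

```python
import pyparsing as pp

test_str = """
///<summary>
/// foobar
///</summary>
"""

# nestedExpr matches everything between an opener and a closer and
# returns the whitespace-separated content as a (possibly nested) list.
comment = pp.nestedExpr("///<summary>", "///</summary>")


def flatten(tokens):
    """Yield the string tokens from an arbitrarily nested list (hypothetical helper)."""
    for tok in tokens:
        if isinstance(tok, list):
            yield from flatten(tok)
        else:
            yield tok


def summary_words(text):
    """Return the text of each summary block, dropping the '///' prefixes (hypothetical helper)."""
    found = []
    for match in comment.searchString(text):
        words = [t for t in flatten(match.asList()) if t != "///"]
        found.append(" ".join(words))
    return found


print(summary_words(test_str))
```

Because the content is tokenized on whitespace rather than line by line, the same expression should also handle the case where the whole comment sits on one line.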