I'm trying to write a small parser using the Sprache parser combinator library. The parser should treat a line ending with a single \ as continuing on the next line, so that the line break counts as insignificant whitespace.
Question
How can I create a parser that can parse the values after the = sign that may contain a line-continuation character \?
For example
a = b\e,\
c,\
d
Should be parsed as (KeyValuePair (Key, 'a'), (Value, 'b\e, c, d')).
I'm new to this library and to parser combinators in general, so any pointers in the right direction are much appreciated.
What I have tried
Test
public class ConfigurationFileGrammerTest
{
    [Theory]
    [InlineData("x\\\n y", @"x y")]
    public void ValueIsAnyStringMayContinuedAccrossLinesWithLineContinuation(
        string input,
        string expectedKey)
    {
        var key = ConfigurationFileGrammer.Value.Parse(input);
        Assert.Equal(expectedKey, key);
    }
}
Production
Attempt one
public static readonly Parser<string> Value =
    from leading in Parse.WhiteSpace.Many()
    from rest in Parse.AnyChar.Except(Parse.Char('\\')).Many()
        .Or(Parse.String("\\\n")
            .Then(chs => Parse.Return(chs))).Or(Parse.AnyChar.Except(Parse.LineEnd).Many())
    select new string(rest.ToArray()).TrimEnd();
Test output
Xunit.Sdk.EqualException: Assert.Equal() Failure
↓ (pos 1)
Expected: x y
Actual: x\
↑ (pos 1)
Attempt two
public static readonly Parser<string> SingleLineValue =
    from leading in Parse.WhiteSpace.Many()
    from rest in Parse.AnyChar.Many().Where(chs =>
        chs.Count() < 2 || !(string.Join(string.Empty, chs.Reverse().Take(2)).Equals("\\\n")))
    select new string(rest.ToArray()).TrimEnd();
public static readonly Parser<string> ContinuedValueLines =
    from firsts in ContinuedValueLine.AtLeastOnce()
    from last in SingleLineValue
    select string.Join(" ", firsts) + " " + last;
public static readonly Parser<string> Value =
    SingleLineValue.Once().XOr(ContinuedValueLines.Once()).Select(s => string.Join(" ", s));
Test output
Xunit.Sdk.EqualException: Assert.Equal() Failure
↓ (pos 1)
Expected: x y
Actual: x\\n y
↑ (pos 1)
You must not include the line continuation in the output. That's the only issue in the last unit test. When you parse the continuation \\\n you must drop it from the result and return the empty string instead. Sorry, I don't know how to do that with Sprache in C#. Maybe with something like that:
I solved the problem using combinatorix, a Python parser combinator library. Its API uses functions instead of chained methods, but the idea is the same.
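For anyone who wants to stay in C#, the idea of consuming the continuation and returning an empty string might be sketched in Sprache roughly as follows. This is an untested sketch and not code from the question: the class name ContinuationSketch and the helper parsers Continuation and ValueChar are hypothetical names introduced here. It relies on Or backtracking when a backslash is not followed by a line end, and on static fields being initialized in the order they are written.

```csharp
using Sprache;

public static class ContinuationSketch
{
    // A backslash immediately followed by a line end. The matched text is
    // discarded and an empty string is returned in its place.
    private static readonly Parser<string> Continuation =
        Parse.Char('\\').Then(_ => Parse.LineEnd).Return(string.Empty);

    // Any single character that does not start a line end, converted to a
    // string so that both alternatives below have the same result type.
    private static readonly Parser<string> ValueChar =
        Parse.AnyChar.Except(Parse.LineEnd).Select(c => c.ToString());

    // Try the continuation first. A lone backslash (as in "b\e") makes
    // Continuation fail, so Or falls back to ValueChar and keeps it.
    public static readonly Parser<string> Value =
        from leading in Parse.WhiteSpace.Many()
        from parts in Continuation.Or(ValueChar).Many()
        select string.Concat(parts).TrimEnd();
}
```

With this sketch, ContinuationSketch.Value.Parse("x\\\n y") should give x y, because the space comes from the start of the second line. The a = b\e,\ example from the question would come out as b\e,c,d, without spaces after the commas. Returning " " instead of string.Empty from Continuation is one way to get a space there, but the unit test input would then produce two spaces.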
Here is the full code with comments: