When parsing quotes and escapes (cf. Why does Parslet (in Ruby) return an empty array when parsing an empty string literal?) I came across an oddity in Parslet:
(escape_char.absent? >> str('"')).absent? >> any
It seems that Parslet actually resolves the double negation and expects the escape character to be there.
require 'parslet'
require 'rspec'
require 'parslet/rig/rspec'
require 'parslet/convenience'

class Parser < Parslet::Parser
  root(:string)

  # A quoted string: opening quote, text, closing quote.
  rule :string do
    quote >> text >> quote
  end

  rule :text do
    (quote.absent? >> any).repeat
  end

  # A double quote that is not itself an escape character.
  rule :quote do
    escape_char.absent? >> str('"')
  end

  rule :escape_char do
    str('\\')
  end
end
describe Parser do
  it 'should parse text in quotes' do
    is_expected.to parse('"hello"')
  end

  it 'should parse text in quotes with escaped quote' do
    is_expected.to parse('"foo\"bar"')
  end

  it 'should parse text in quotes with trailing escaped quote' do
    is_expected.to parse('"text\""')
  end
end
I am not so much interested in how to solve this, as that is already described in the post linked above; I am merely curious to understand this behaviour. It seems counterintuitive at first, but I am sure there is a good reason behind it.
Parslet builds parsers by composing smaller parsers; that's the beauty of PEG.
"Absent" is a parser that wraps another parser. It attempts to match the input stream against the wrapped parser, without consuming any input. If the wrapped parser matches, Absent reports "no match". If the wrapped parser fails to match, Absent passes.
So, the parser you mentioned:
(escape_char.absent? >> str('"')).absent? >> any
will match a single character, but only when
(escape_char.absent? >> str('"'))
fails to match that same character.
(escape_char.absent? >> str('"'))
will only fail to match if the first character is an escape character or isn't a quote.
Testing this, it turns out to be true.