Parslet double negation


When parsing quotes and escapes (cf. Why does Parslet (in Ruby) return an empty array when parsing an empty string literal?) I came across an oddity in Parslet:

(escape_char.absent? >> str('"')).absent? >> any

It seems that Parslet actually resolves the double negation and expects the escape character to be there.

require 'parslet'
require 'rspec'
require 'parslet/rig/rspec'
require 'parslet/convenience'

class Parser < Parslet::Parser
  root(:string)

  # Named :string so it does not clash with the :quote rule defined below.
  rule :string do
    quote >> text >> quote
  end

  rule :text do
    (quote.absent? >> any).repeat
  end

  rule :quote do
    escape_char.absent? >> str('"')
  end

  rule :escape_char do
    str('\\')
  end
end

describe Parser do
  it 'should parse text in quotes' do
    is_expected.to parse('"hello"')
  end

  it 'should parse text in quotes with escaped quote' do
    is_expected.to parse('"foo\"bar"')
  end

  it 'should parse text in quotes with trailing escaped quote' do
    is_expected.to parse('"text\""')
  end
end

I am not so much interested in how to solve this, since that is already described in the post linked above; I am merely curious to understand this behaviour. It seems counterintuitive at first, but I am sure there is a good reason behind it.

1 Answer

Parslet builds parsers by composing smaller parsers; that's the beauty of PEG.

"Absent" is a parser that takes a parser. It attempts to match the input stream against the wrapped parser at the current position. If the wrapped parser matches, Absent reports "no match". If the wrapped parser fails to match, Absent passes. In either case it is a pure lookahead: it consumes no input.
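The lookahead behaviour described above can be modelled in a few lines of plain Ruby. This is a toy sketch, not Parslet's actual implementation: the helpers lit, absent and seq are hypothetical names, and a parser is assumed to be a lambda that takes the input and a position and returns the new position, or nil on failure.

```ruby
# Toy model of PEG parsers (NOT Parslet's real implementation).
# A parser is a lambda (input, pos) -> new position, or nil on failure.

# Match the literal string `s` at the current position.
def lit(s)
  ->(input, pos) { input[pos, s.length] == s ? pos + s.length : nil }
end

# Negative lookahead: succeed only if `parser` fails, and never advance.
def absent(parser)
  ->(input, pos) { parser.call(input, pos).nil? ? pos : nil }
end

# Sequence: run `a`, then run `b` from wherever `a` stopped.
def seq(a, b)
  lambda do |input, pos|
    mid = a.call(input, pos)
    mid && b.call(input, mid)
  end
end

not_backslash = absent(lit('\\'))

p not_backslash.call('"', 0)                   # => 0 (succeeded, position unchanged)
p seq(not_backslash, lit('"')).call('"', 0)    # => 1 (both parts looked at the quote)
p seq(not_backslash, lit('"')).call('\\"', 0)  # => nil (first character is a backslash)
```

Because absent returns the position it was given, whatever follows it in a sequence starts at the same character rather than the next one.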

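So the parser you mentioned:

(escape_char.absent? >> str('"')).absent? >> any

will match a single character, but only when (escape_char.absent? >> str('"')) fails to match at that same character.

(escape_char.absent? >> str('"')) will only fail to match if the first character is an escape character or isn't a quote. Because escape_char.absent? consumes nothing, it inspects the very character that str('"') then tries to match, not the character before it. An escape character is never a quote, so the inner parser fails exactly when the current character is not a quote.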

Testing this, it turns out to be true.

require 'parslet'
require 'rspec'
require 'parslet/rig/rspec'
require 'parslet/convenience'

class Parser < Parslet::Parser
  root(:x)

  rule :x do 
    (escape_char.absent? >> str('"')).absent? >> any 
  end

  rule :escape_char do
    str('\\')
  end
end

begin
  Parser.new.parse('a')  # passes
  Parser.new.parse('b')  # passes
  Parser.new.parse('\\') # passes (a single backslash)
  Parser.new.parse('"')  # << this one fails
  puts "pass"            # never reached, because the line above raises
rescue Parslet::ParseFailed => error
  puts error.cause.ascii_tree
end