Lua LPeg expression to not substitute between delimiters


I would like to understand how I could use LPeg to replace strings only when they are NOT between a certain start and end delimiter. Below is an example, where SKIPstart and SKIPstop mark the region in which text shouldn't be replaced.

rep
rep
SKIPstart
rep
rep
SKIPstop
rep
rep

to

new
new
SKIPstart
rep
rep
SKIPstop
new
new

Here is another example with multiple delimited regions:

rep
rep
SKIPstart
rep
rep
SKIPstop
rep
rep
SKIPstart
rep
rep
SKIPstop

to

new
new
SKIPstart
rep
rep
SKIPstop
new
new
SKIPstart
rep
rep
SKIPstop

And an example with nested delimiters:

rep
rep
SKIPstart
rep
SKIPstart
rep
SKIPstop
rep
SKIPstop
rep
rep

to

new
new
SKIPstart
rep
SKIPstart
rep
SKIPstop
rep
SKIPstop
new
new

There are 2 best solutions below

BEST ANSWER

Sorry, I don't know LPeg, but your task is easily solvable with plain Lua patterns.
IMO, LPeg and other external regex libraries are overkill in most cases; Lua patterns are surprisingly capable.

local s = [[
rep
rep
SKIPstart
rep
rep
SKIPstop
rep
rep
SKIPstart
rep
SKIPstart
rep
SKIPstop
rep
SKIPstop
rep
rep
]]
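-- Mark each region: \1 before every SKIPstart, \2 after every SKIPstop,
-- then wrap each balanced (possibly nested) \1...\2 block in \3...\3.
-- \3 is used as the wrapper because %z/%Z are deprecated since Lua 5.2.
s = s:gsub("SKIPstart", "\1%0")
     :gsub("SKIPstop", "%0\2")
     :gsub("%b\1\2", "\3%0\3")
     -- a = text outside a block (replace there), b = protected block (strip markers)
     :gsub("([^\3]*)\3?([^\3]*)\3?",
         function(a, b) return a:gsub("rep", "new")..b:gsub("[\1\2]", "") end)
print(s)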

Output:

new
new
SKIPstart
rep
rep
SKIPstop
new
new
SKIPstart
rep
SKIPstart
rep
SKIPstop
rep
SKIPstop
new
new
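
For reuse, the marker trick above could be wrapped in a function. This is only a sketch: `replace_outside` and `escape` are names made up here, it assumes the delimiters are balanced and the text contains no \1, \2 or \3 bytes, and it uses \3 as the wrapper byte, which keeps the pattern free of %z so it should run on Lua 5.1 through 5.4.

```lua
-- Hypothetical helper: escape every punctuation character so that a
-- plain string can be used as a Lua pattern.
local function escape(p)
  return (p:gsub("%p", "%%%0"))
end

-- Hypothetical helper: replace `from` with `to` everywhere except inside
-- balanced open...close regions (nesting allowed).
-- Assumes `text` contains no \1, \2 or \3 bytes.
local function replace_outside(text, open, close, from, to)
  local from_pat = escape(from)
  local to_repl = to:gsub("%%", "%%%%")      -- protect '%' in the replacement
  local result = text
    :gsub(escape(open), "\1%0")              -- \1 before each opening delimiter
    :gsub(escape(close), "%0\2")             -- \2 after each closing delimiter
    :gsub("%b\1\2", "\3%0\3")                -- wrap each outermost balanced block
    :gsub("([^\3]*)\3?([^\3]*)\3?", function(outside, protected)
      -- replace only in the outside part, strip markers from the block
      return outside:gsub(from_pat, to_repl) .. protected:gsub("[\1\2]", "")
    end)
  return result
end

print(replace_outside("rep\nSKIPstart\nrep\nSKIPstop\nrep\n",
                      "SKIPstart", "SKIPstop", "rep", "new"))
```

Escaping the delimiters and the search string means they are matched literally, so delimiters containing characters such as `-` or `.` would not be misread as pattern syntax.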
ANOTHER ANSWER

Egor Skriptunoff's answer is a neat way of playing tricks with standard Lua patterns to achieve your goal. I agree that when a straightforward approach works, there is no need to reach for LPeg or other external libraries.

Since you asked about LPeg, here is how you can do it with LPeg.

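-- LPeg's re module is normally loaded as 're'; fall back to 'lpeg.re'
-- for bundles that namespace it
local ok, re = pcall(require, 're')
if not ok then re = require('lpeg.re') end

local defs = {
  -- called with each run of text found outside a delimited block
  do_rep = function(p)
    return (p:gsub('rep', 'new'))  -- parentheses drop gsub's second result (the count)
  end
}

local pat = re.compile([=[--lpeg
  -- substitution capture: rewrite text outside blocks, copy blocks unchanged
  all <- {~ ( (!delimited . [^S]*)+ -> do_rep / delimited )* ~}
  -- a block may contain nested blocks
  delimited <- s (!s !e . / delimited)* e
  s <- 'SKIPstart'
  e <- 'SKIPstop'
]=], defs)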

local s = [[
rep
rep
SKIPstart
rep
rep
SKIPstop
rep
rep
SKIPstart
rep
SKIPstart
rep
SKIPstop
rep
SKIPstop
rep
rep
]]

s = pat:match(s)
print(s)
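
The same idea can also be written against the core lpeg API instead of the re syntax. The following is a rough sketch rather than a drop-in equivalent: the variable names are made up here, and it replaces "rep" directly with a string capture instead of calling a helper function.

```lua
local lpeg = require("lpeg")
local P, V, Cs = lpeg.P, lpeg.V, lpeg.Cs

local open, close = P("SKIPstart"), P("SKIPstop")

-- A balanced block: the opening delimiter, then any mix of nested blocks
-- and characters that do not start a delimiter, then the closing delimiter.
local delimited = P{
  "block",
  block = open * (V"block" + (1 - open - close))^0 * close,
}

-- Substitution capture over the whole input:
--   1. a complete block is copied through unchanged,
--   2. otherwise "rep" is replaced by "new",
--   3. otherwise a single character is copied.
local pat = Cs((delimited + P("rep") / "new" + 1)^0)

print(pat:match("rep\nSKIPstart\nrep\nSKIPstop\nrep\n"))
```

Because a complete block is tried first at every position, an unbalanced SKIPstart is treated as ordinary text and replacement simply continues after it.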