I have an XML file similar to this:
<Level1Node>
.
.
<Level2Node val="Retain"/>
.
.
</Level1Node>
<Level1Node>
.
.
<Level2Node val="Replace"/>
.
.
</Level1Node>
<Level1Node>
.
.
<Level2Node val="Retain"/>
.
.
</Level1Node>
I need to remove only the node below:
<Level1Node>
.
.
<Level2Node val="Replace"/>
.
.
</Level1Node>
To have it replaced in a non-greedy manner, I used the regex below:
perl -0 -pe 's|<Level1Node>.*?<Level2Node val="Replace"/>.*?</Level1Node>||gs' myxmlfile
But the non-greedy quantifier only shortens the match at the end of the pattern, not at the start. How can I make the match start at the last <Level1Node> before the Level2Node I want to remove?
You will need to use a negative lookahead to make sure you do not match closing Level1Node tags where you don't want to.
Details: in a lookahead-guarded group such as (?:(?!</Level1Node>).)*, the ?: is only there so that the parentheses are not interpreted as a capturing group.
If you plan to run this on a large file, you should probably check the cost of the negative lookahead; it might be high.
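The negative-lookahead approach described above can be sketched roughly as follows. This is a reconstruction under assumptions, not necessarily the exact command from the answer: sample.xml is a hypothetical stand-in for myxmlfile with the filler lines left out, and the pattern guards every character between the opening tag and the Level2Node with (?!</Level1Node>).

```shell
# Hypothetical sample file standing in for myxmlfile (filler lines omitted).
cat > sample.xml <<'EOF'
<Level1Node>
<Level2Node val="Retain"/>
</Level1Node>
<Level1Node>
<Level2Node val="Replace"/>
</Level1Node>
<Level1Node>
<Level2Node val="Retain"/>
</Level1Node>
EOF

# (?:(?!</Level1Node>).)*? consumes one character at a time, but only while
# the text ahead is not a closing </Level1Node> tag. A match can therefore
# not start in an earlier node and run across that node's closing tag.
# Single quotes keep the double quotes inside the pattern intact in the shell.
result=$(perl -0 -pe 's|<Level1Node>(?:(?!</Level1Node>).)*?<Level2Node val="Replace"/>.*?</Level1Node>||gs' sample.xml)

printf '%s\n' "$result"
```

With this sample, only the node containing val="Replace" should be removed; the line break that followed it stays behind, so an empty line is left where the node used to be.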