Regular Expressions - Select the Second Match

470 Views Asked by Commentator At 08 June 2017 at 22:13

I have a txt file with <i> and </i> tags between words that I would like to remove using EditPad.

For example, I'd like to keep the tags when the line looks like this:

<i>Phrases and words.</i>

And I'd like to remove the </i> and <i> tags inside the phrase when the line looks like this:

<i>Phrases</i>and<i> words.</i>
<i>Phrases</i>and <i>words.</i>

I was trying to do that using regex, but I couldn't do it.

Because the inner tags sit next to a space or a word character, I could find the lines that contain the extra pair of tags with

/ <i>|<\/i> /

but this way I can't simply replace every match with nothing; I have to edit each line I find by hand.

Is there any way to accomplish that?

* Edited *

Another example of lines found in the subtitle text:

<i>- find me on the chamber.</i>
- What? <i>Go. Go, go, go!</i>


There is 1 best solution below

AudioBubble On 08 June 2017 at 23:53 BEST ANSWER

Rule number one: you can't parse HTML with regex.

That being said, if you know each line follows a certain pattern, you can usually hack something together to work. ;)

If I've understood correctly, it looks like you can simply remove all <i> and </i> tags that aren't at the beginning or end of a line. In that case, one method you could try is the following regex:

(?<=.)\<\/?i\>(?=.)

This will match the tags, with a lookahead and a lookbehind to make sure that we aren't at the start or end of a line (by checking that another character exists before and after the tag). Note that characters matched inside a lookahead or lookbehind are not part of the match itself, so they won't be replaced when you search/replace.
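To see the lookaround pattern in action outside an editor, here is a minimal sketch in Python (my choice of language, not something the question uses). The function name strip_inner_tags is made up for illustration, and the backslashes before <, / and > are dropped because Python's re module doesn't need them.

```python
import re

# Same idea as the pattern above: an <i> or </i> tag that has at least
# one character before it and at least one character after it.
# "." does not match a newline, so tags at the start or end of a line are skipped.
INNER_TAG = re.compile(r"(?<=.)</?i>(?=.)")


def strip_inner_tags(line: str) -> str:
    """Remove <i> and </i> tags that are not at the very start or end of the line.

    Hypothetical helper for illustration only.
    """
    return INNER_TAG.sub("", line)


sample_lines = [
    "<i>Phrases and words.</i>",
    "<i>Phrases</i>and<i> words.</i>",
    "<i>Phrases</i>and <i>words.</i>",
]

for line in sample_lines:
    print(strip_inner_tags(line))
```

The first sample line is left alone, and the other two keep only their outer tags. One caveat: on a line like "- What? <i>Go. Go, go, go!</i>" the opening tag is in the middle of the line, so this pattern removes it and leaves the closing tag unpaired; lines like that probably need separate handling.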

Disclaimer: this works on regex101, but Notepad++ may differ in places from the PCRE regex style.

update to work with Editpad

EDIT: since this question is actually about how to do this in EditPad, below is a modified alternative:

Try searching for the regex: (.)\<\/?i\>(.). This will match (and capture) exactly one character before and one character after each <i> or </i> tag.

When replacing, use backreferences to replace the entire match with the two captured characters - a replacement string of \1\2 should work.
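The capture-group variant can be sketched the same way in Python; again the function name is hypothetical and the backslash escapes are dropped because Python's re module doesn't require them. The replacement string \1\2 puts back the two captured neighbours and drops the tag between them.

```python
import re

# Capture one character before and one character after the tag.
INNER_TAG_WITH_NEIGHBOURS = re.compile(r"(.)</?i>(.)")


def strip_inner_tags_with_groups(line: str) -> str:
    """Replace 'X<tag>Y' with 'XY' using backreferences.

    Hypothetical helper for illustration only.
    """
    return INNER_TAG_WITH_NEIGHBOURS.sub(r"\1\2", line)


for line in [
    "<i>Phrases and words.</i>",
    "<i>Phrases</i>and<i> words.</i>",
    "<i>Phrases</i>and <i>words.</i>",
]:
    print(strip_inner_tags_with_groups(line))
```

Because the neighbouring characters are consumed by the match here (unlike with lookarounds), two tags separated by a single character, such as x</i>y<i>z, are not both removed in one pass; running the replacement a second time takes care of that.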
