Regex: match every word after a key with a positive lookbehind and insert the key before each match

copyright: hololive hololive_english
character: mori_calliope takanashi_kiara takanashi_kiara_(phoenix)
artist: xu_chin-wen
species:
meta: web

I want to select every word after a key such as character: so that I can put that key (e.g. character:) in front of every selected word:

character:mori_calliope character:takanashi_kiara character:takanashi_kiara_(phoenix)

The closest I have got is

(?<=(\w*):\s*\S*\s.*)(?<=\s)(?=\S)

This works, but it breaks when a key has only a single entry (e.g. character: something) or when it is empty.

I would be really thankful if someone could help.

There is 1 solution below

You should install the PyPI regex module and use

regex.sub(r'(?<=(\w+):.*)(?<=\s)(?=\S)', r'\1:', text)
# or
# regex.sub(r'(?<=(\w+:).*)(?<=\s)(?=\S)', r'\1', text)
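The third-party module is needed because the standard-library re module rejects lookbehinds that can match strings of varying length. A minimal check, assuming a current CPython release (the variable name stdlib_accepts is only illustrative):

```python
import re

# The same pattern as above, compiled with the standard-library re module.
pattern = r'(?<=(\w+):.*)(?<=\s)(?=\S)'

try:
    re.compile(pattern)
    stdlib_accepts = True
except re.error as exc:
    # re only allows fixed-width lookbehinds, so compilation fails here.
    stdlib_accepts = False
    print("re rejected the pattern:", exc)

print(stdlib_accepts)
```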

See the regex demo. Details:

  • (?<=(\w+):.*) - a positive lookbehind that matches a location that is immediately preceded with any word (captured into Group 1) followed by a : char and then any zero or more chars other than line break chars, as many as possible
  • (?<=\s) - a positive lookbehind that matches a location that is immediately preceded with a whitespace char
  • (?=\S) - a positive lookahead that matches a location that is immediately followed with a non-whitespace char.
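To see what the two short lookarounds select on their own, here is a small sketch using only the standard-library re module (both are fixed-width, so re accepts them). The sample line and the | marker are just illustrative choices:

```python
import re

line = "character: mori_calliope takanashi_kiara"

# Zero-width positions that have whitespace on the left and
# a non-whitespace char on the right, i.e. the start of each value.
marked = re.sub(r'(?<=\s)(?=\S)', '|', line)
print(marked)  # character: |mori_calliope |takanashi_kiara
```

The remaining lookbehind, (?<=(\w+):.*), only looks back along the same line for the key and captures it for use in the replacement.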

See the Python demo:

import regex
text = "copyright: hololive hololive_english\ncharacter: mori_calliope takanashi_kiara takanashi_kiara_(phoenix)\nartist: xu_chin-wen\nspecies:\nmeta: web"
print( regex.sub(r'(?<=(\w+):.*)(?<=\s)(?=\S)', r'\1:', text) )

Output:

copyright: copyright:hololive copyright:hololive_english
character: character:mori_calliope character:takanashi_kiara character:takanashi_kiara_(phoenix)
artist: artist:xu_chin-wen
species:
meta: meta:web
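If installing a third-party module is not an option, a similar result can be sketched with plain string methods. This is an alternative approach, not the lookbehind technique above. The helper name prefix_values is hypothetical, and the sketch assumes that every line has the form key: value value ... and that the prefixed values should replace the original label (the format shown in the question) rather than follow it:

```python
import re

def prefix_values(text: str) -> str:
    """Prefix every value on a 'key: v1 v2 ...' line with 'key:'."""
    out = []
    for line in text.splitlines():
        # Split at the first colon only; values may contain other chars.
        key, sep, rest = line.partition(":")
        if not sep or not re.fullmatch(r"\w+", key):
            out.append(line)  # not a 'key:' line, keep it unchanged
            continue
        values = rest.split()
        # Lines with no values (e.g. 'species:') are kept unchanged.
        out.append(" ".join(f"{key}:{v}" for v in values) if values else line)
    return "\n".join(out)

print(prefix_values("character: mori_calliope takanashi_kiara\nspecies:\nmeta: web"))
```

Because each line is handled separately, a key with a single entry or with no entries needs no special regex handling.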