Replace every character of the word after a delimiter using Python re


I would like to replace every character of the word that follows - with *.

For example:

asd-wqe ffvrf    =>    asd-*** ffvrf

In TS regex this can be done with the pattern (?<=-\w*)\w and the replacement *. But Python's default regex engine requires lookbehinds to be of fixed width.
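To make the limitation concrete, here is a minimal check of which patterns the standard re module accepts. The helper name compiles is only illustrative and not part of any library.

```python
import re

def compiles(pattern):
    """Return True if the standard re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Variable-width lookbehind: rejected by re
print(compiles(r"(?<=-\w*)\w"))    # False
# Fixed-width lookbehind: accepted
print(compiles(r"(?<=-\w{2})\w"))  # True
```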

The best I can imagine is to use

(?:(?<=-)|(?<=-\w)|(?<=-\w{2}))\w

and repeat the lookbehind some predetermined large number of times, but that seems neither maintainable nor elegant.
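For illustration, the repeated-lookbehind idea can be generated programmatically rather than typed out. This is only a sketch of that workaround: build_pattern and max_len are hypothetical names, and any word longer than max_len is masked only partially, which is exactly the weakness described above.

```python
import re

def build_pattern(max_len):
    # One fixed-width lookbehind per offset: (?<=-), (?<=-\w), (?<=-\w\w), ...
    alternatives = "|".join("(?<=-" + r"\w" * n + ")" for n in range(max_len))
    return re.compile("(?:" + alternatives + r")\w")

pattern = build_pattern(8)
print(pattern.sub("*", "asd-wqe ffvrf"))   # asd-*** ffvrf
# Only the first 8 characters after - are covered by a lookbehind
print(pattern.sub("*", "a-verylongword"))  # a-********word
```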

Is it possible to do this with the default re module using a more elegant pattern?


P.S. I'm aware that alternative regex engines supporting variable-length lookbehinds exist, but I would like to stick with the default one for the moment if possible.


There are 2 best solutions below

1
The fourth bird On BEST ANSWER

I don't think you can do that with Python re, because you want to match a single character while asserting that to its left is - followed by optional word characters, and that assertion needs a variable-width lookbehind.

I would write it with a callback instead: match the whole word after -, then use the length of the match to build the replacement string of * characters.

import re

strings = [
    "asd-wqe ffvrf",
    "asd-ss sd",
    "a-word",
    "a-verylongword",
    "an-extremelyverylongword"
]
pattern = r"(?<=-)\w+"
for s in strings:
    print(re.sub(pattern, lambda x: len(x.group()) * "*", s))

Output

asd-*** ffvrf
asd-** sd
a-****
a-************
an-*********************



An alternative to a quantifier in a lookbehind assertion is the \G anchor combined with \K (neither of which is supported by Python re):

(?:-|\G(?!^))\K\w


2
Unmitigated On

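You can match all the word characters after - and pass a callback to re.sub that replaces the match with a string of asterisks of the same length.

import re

s = 'asd-wqe ffvrf'
# Each run of word characters directly after - becomes the same number of asterisks
res = re.sub(r'(?<=-)\w+', lambda m: '*' * len(m.group()), s)
print(res)  # asd-*** ffvrf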
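If the delimiter or the mask character may vary, the same callback approach could be wrapped in a small helper. This is a sketch only: mask_after and its parameters are hypothetical names, not something from the answers above. It relies on re.escape so that the delimiter is treated as a literal string, which keeps the lookbehind fixed-width.

```python
import re

def mask_after(text, delimiter="-", mask="*"):
    # Escaping makes the delimiter a fixed-width literal inside the lookbehind
    pattern = r"(?<=" + re.escape(delimiter) + r")\w+"
    # Replace each matched word with as many mask characters as it has characters
    return re.sub(pattern, lambda m: mask * len(m.group()), text)

print(mask_after("asd-wqe ffvrf"))               # asd-*** ffvrf
print(mask_after("key::value rest", "::", "#"))  # key::##### rest
```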