regex to replace console code with whitespaces

I'm writing some Python tests for a console application that uses console codes, and I'm having trouble handling the ESC [ row ; column H (cursor position) sequence gracefully.

I have the input string s = r'\x1b[12;5H\nSomething', and I'd like to turn it into '    Something' (four leading spaces). I'm trying to use the following regex:

re.sub(r'\x1b\[([0-9,A-Z]{1,2};([0-9]{1,2})H)', r'\2', s)

This, of course, produces 5Something.

What I want is something to the effect of

re.sub(r'\x1b\[([0-9,A-Z]{1,2};([0-9]{1,2})H)', ' '*(int(r'\2')-1), s)

That is, insert one fewer space than the number captured by the second group.

I'd also be very happy if there was a way to simply render in a string what I get when I use print(s):

    Something

I'm using Python 3.

Thanks a lot!!

Best answer

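Pass a function as the replacement argument to re.sub, so that the captured column number can be converted to an integer and turned into spaces:

import re

# Raw string: holds a literal backslash + 'x1b' and backslash + 'n', not real control characters
s = r'\x1b[12;5H\nSomething'
# '\\' matches one literal backslash; group 1 captures the column number
pattern = r'\\x1b\[[0-9A-Z]{1,2};([0-9]{1,2})H\\n'
# re.sub calls the lambda for each match; column 5 becomes 4 spaces
print(re.sub(pattern, lambda x: ' '*(int(x.group(1))-1), s))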
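The snippet above matches the literal backslashes of a raw string. If the captured console output contains a real ESC character and a real newline instead, a variant along these lines might be closer to what is needed. This is only a sketch: the non-raw input string, the optional newline, and the helper name pad_to_column are assumptions for illustration and are not taken from the question.

```python
import re

# Assumption: the string holds a real ESC character and a real newline,
# unlike the raw string in the question, which holds literal backslashes.
s = '\x1b[12;5H\nSomething'

# In a raw pattern, \x1b and \n are interpreted by the regex engine itself,
# so they match the actual control characters.
pattern = re.compile(r'\x1b\[[0-9]{1,2};([0-9]{1,2})H\n?')

def pad_to_column(match):
    # Columns are 1-based, so column 5 means 4 leading spaces.
    return ' ' * (int(match.group(1)) - 1)

result = pattern.sub(pad_to_column, s)
print(repr(result))
```

Swallowing the newline mirrors the pattern in the answer; whether that is right for a given terminal log depends on how the application emits its output.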

EXPLANATION

--------------------------------------------------------------------------------
  \\                       '\'
--------------------------------------------------------------------------------
  x1b                      'x1b'
--------------------------------------------------------------------------------
  \[                       '['
--------------------------------------------------------------------------------
  [0-9A-Z]{1,2}            any character of: '0' to '9', 'A' to 'Z'
                           (between 1 and 2 times (matching the most
                           amount possible))
--------------------------------------------------------------------------------
  ;                        ';'
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [0-9]{1,2}               any character of: '0' to '9' (between 1
                             and 2 times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  H                        'H'
--------------------------------------------------------------------------------
  \\                       '\'
--------------------------------------------------------------------------------
  n                        'n'
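The breakdown above can also be kept next to the code by writing the same pattern with re.VERBOSE, which ignores whitespace in the pattern and allows inline comments. The following is a sketch of that idea using the same input string as the question; the variable names are illustrative.

```python
import re

# Same pattern as in the answer, spread over several lines with comments.
pattern = re.compile(r"""
    \\x1b            # a literal backslash followed by x1b
    \[               # an opening square bracket
    [0-9A-Z]{1,2}    # row: one or two characters from 0-9 or A-Z
    ;                # a semicolon
    ([0-9]{1,2})     # group 1: column, one or two digits
    H                # the letter H
    \\n              # a literal backslash followed by n
""", re.VERBOSE)

s = r'\x1b[12;5H\nSomething'

# One fewer space than the captured column number.
result = pattern.sub(lambda m: ' ' * (int(m.group(1)) - 1), s)
print(repr(result))
```

Because the pattern is unchanged apart from layout, it should behave the same as the single-line version.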