I have a string that looks like this:
**** SOURCE#24 ****
[1] Source Location [Local/Remote] : Remote
Remote Host Name : PNQ
User Name : foo
[2] Source directory : HDPGWRF
[3] Directory poll interval : 30
[4] File name format : ACR_FILEFORMAT
[5] Delete files from source : y
**** SOURCE#25 ****
[1] Source Location [Local/Remote] : Remote
Remote Host Name : PNR
User Name : foo
[2] Source directory : HDPGWRF
[3] Directory poll interval : 30
[4] File name format : ACR_FILEFORMAT
[5] Delete files from source : y
**** SOURCE#26 ****
etc.....
I want a capture group that captures everything after the '[1]' up to the end of the line that starts with [5], selected by the Remote Host Name (e.g. PNR or PNQ). So only lines [1] through [5] of the block that contains the selected name.
I've been trying lookahead and lookbehind and just can't figure this out. It looks like lookbehind is greedy, so if I search for the PNR section, it won't stop at the first [1] but grabs everything up to the first [1] in the PNQ section.
This is the closest I've got to making it work, but it only works if I search for the PNQ section:
re.search('SOURCE#.*?\[1\](.*?PNQ.*?.*?HDPGWRF.*?)\*', buf, flags=re.DOTALL).group(1)
This after combing through stackoverflow all afternoon :(
You might use a pattern without the `re.DOTALL` flag but with the `re.MULTILINE` flag. The pattern matches:
- `\bSOURCE#` Match literally, starting with a word boundary
- `.*` Match the rest of the line
- `\s*^` Match optional whitespace chars until the start of a line
- `\[1]` Match `[1]`
- `(` Capture group 1
- `.*` Match the rest of the line
- `\s*^` Match optional whitespace chars until the start of a line
- `(?!\[\d+])` Negative lookahead, assert that the line does not start with `[digits]`
- `.*\bPN[QR]` Match `PNQ` or `PNR`, which sits at the end of the line in the sample
- `(?:\n(?!\[\d+]).*)*` Match all following lines that do not start with `[digits]`
- `(?:\n\[\d+].*)*` Match all following lines that do start with `[digits]`
- `)` Close group 1
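Assembled from the parts listed above, the pattern might look like the sketch below. It is a reconstruction from that breakdown, so treat it as an assumption rather than the exact original. The helper name `capture_section` is hypothetical, and the host name is substituted for `PN[QR]` so that only one block is selected. The sample text is copied from the question, with the trailing "etc....." left out.

```python
import re

# Sample input copied from the question (trailing "etc....." omitted).
buf = """\
**** SOURCE#24 ****
[1] Source Location [Local/Remote] : Remote
Remote Host Name : PNQ
User Name : foo
[2] Source directory : HDPGWRF
[3] Directory poll interval : 30
[4] File name format : ACR_FILEFORMAT
[5] Delete files from source : y
**** SOURCE#25 ****
[1] Source Location [Local/Remote] : Remote
Remote Host Name : PNR
User Name : foo
[2] Source directory : HDPGWRF
[3] Directory poll interval : 30
[4] File name format : ACR_FILEFORMAT
[5] Delete files from source : y
**** SOURCE#26 ****
"""


def capture_section(text, host):
    """Hypothetical helper: return group 1 for the block naming `host`, or None."""
    pattern = (
        r"\bSOURCE#.*\s*^\[1]"                      # header line, then a line starting with [1]
        r"(.*\s*^(?!\[\d+]).*\b" + re.escape(host)  # rest of [1], then the unnumbered host line
        + r"(?:\n(?!\[\d+]).*)*"                    # further unnumbered lines (User Name)
        r"(?:\n\[\d+].*)*)"                         # numbered lines [2] .. [5]
    )
    # No re.DOTALL: ".*" stays on one line, so a match cannot leak into another block.
    m = re.search(pattern, text, flags=re.MULTILINE)
    return m.group(1) if m else None


pnr = capture_section(buf, "PNR")
print(pnr)
```

Because `.*` cannot cross a newline here, the attempt that starts at SOURCE#24 fails on the PNQ line and the search moves on to SOURCE#25, instead of stretching across both blocks as the `re.DOTALL` version does. If host names can share a prefix, appending `$` after the name would pin it to the end of the line.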