Matching patterns for folder names in a path, excluding a chunk of the path from matching?

Assume an initial (Unix) path [segment] like /var/log. Underneath this path, there might be an entire tree of directories. A user provides a pattern for folder names using Unix shell-style wildcards, e.g. *var*. Folders following the pattern underneath the initial path [segment] shall be matched using a regular expression given a full path as input, i.e. the initial path segment must be excluded from matching.

How would I build a regular expression doing this?


I am working with Python, which offers the fnmatch module as part of its standard library. fnmatch provides a translate function, which translates patterns specified using Unix shell-style wildcards into regular expressions:

>>> import fnmatch
>>> fnmatch.translate('*var*')
'(?s:.*var.*)\\Z'

I would like to use this for constructing my regular expressions.

Matching input paths could look like this:

  • /var/log/foo/var/bar
  • /var/log/foo/avarb/bar
  • /var/log/var/

Not matching input paths could look like this:

  • /var/log
  • /var/log/foo/bar

The underlying issue is that I have to provide the regular expression to a third-party module, pyinotify, as input. I cannot work around this by just stripping the initial path segment and then matching against the remainder ...

Best answer

You should be able to use a negative lookbehind like so:

(?<!^\/)var
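To see what this lookbehind does in Python, here is a minimal sketch that runs it with re.search against the example paths from the question. The helper name has_var_below_root is made up for illustration, and the sketch assumes the consumer searches the whole path rather than anchoring at the start.

```python
import re

# The lookbehind from above: "var" that is NOT directly preceded by the
# leading "/" at the very start of the string.
LOOKBEHIND = re.compile(r"(?<!^\/)var")


def has_var_below_root(path):
    """Hypothetical helper: True if "var" occurs anywhere except right after the leading slash."""
    return LOOKBEHIND.search(path) is not None


# Example paths taken from the question.
for path in (
    "/var/log/foo/var/bar",
    "/var/log/foo/avarb/bar",
    "/var/log/var/",
    "/var/log",
    "/var/log/foo/bar",
):
    print(path, has_var_below_root(path))
```

Note that this only skips a "var" sitting immediately after the leading slash. It works for /var/log because the excluded segment happens to start with var; with a different initial path it would probably need to be adapted.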

Both positive and negative lookbehinds are really useful when writing regular expressions. Here is an interactive example so you can get a feel for how it works with visual feedback: https://regex101.com/r/52sZjw/1, and another example: https://regex101.com/r/F023eD/1/

I am not exactly sure how you can use this with fnmatch. It looks like you might end up building the regex strings yourself, that is, in cases where the user's input matches part of the path you want to exclude.
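One way to build the string yourself is sketched below. It does not use fnmatch.translate, because the translated * becomes .* and would also match across slashes. Instead it translates only * and ? by hand for a single folder name and prefixes the escaped initial path. The function names component_regex and build_path_regex are hypothetical, character classes like [abc] are not handled, and the sketch assumes the consumer applies the pattern from the start of the path (re.match style), which should be checked against the third-party module.

```python
import re


def component_regex(glob):
    """Translate a shell-style wildcard into a regex fragment for ONE folder name.

    Assumption: only "*" and "?" are supported; everything else is literal.
    """
    parts = []
    for ch in glob:
        if ch == "*":
            parts.append("[^/]*")   # any run of characters, but never a slash
        elif ch == "?":
            parts.append("[^/]")    # exactly one non-slash character
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def build_path_regex(root, glob):
    """Regex string matching paths below `root` that contain a folder matching `glob`."""
    root = root.rstrip("/")
    return (
        re.escape(root)             # the initial segment, matched literally
        + r"(?:/[^/]+)*/"           # zero or more intermediate folders
        + component_regex(glob)     # the folder name the user asked for
        + r"(?:/.*)?\Z"             # optional remainder of the path
    )


pattern = build_path_regex("/var/log", "*var*")
rx = re.compile(pattern)

for path in ("/var/log/foo/var/bar", "/var/log/var/", "/var/log", "/var/log/foo/bar"):
    print(path, rx.match(path) is not None)
```

Because the initial path is matched literally and a slash is required right after it, the var in /var/log itself can never satisfy the user's pattern; only folder names further down the tree can.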