Select-String: match a string only if it isn't preceded by * character


I have this line in a PowerShell script:

if (Select-String -Path "$holidays" -Pattern "(?<!\*)$datestr" -SimpleMatch -Quiet)

$holidays is a text file containing some dates:

2023-09-04
2023-11-23
*2023-11-24
2023-12-25

$datestr is today's date: 2023-11-23

Why doesn't the negative lookbehind pattern work in the line above?

I just want to exclude (not match) all dates that start with *.

I think something is wrong with the negative lookbehind, because the line:

if (Select-String -Path "$holidays" -Pattern $datestr -SimpleMatch -Quiet)

works perfectly: the result is true.

With the first line, the result is false.

There are 2 best solutions below

Best answer

As explained in the comments, when -SimpleMatch is used the cmdlet doesn't use regex. From the documentation:

Indicates that the cmdlet uses a simple match rather than a regular expression match. In a simple match, Select-String searches the input for the text in the Pattern parameter. It doesn't interpret the value of the Pattern parameter as a regular expression statement.

In your case the cmdlet is looking for a literal (?<!\*)2023-11-23 instead of 2023-11-23 not preceded by *:

$datestr = '2023-11-23'
'2023-11-23' | Select-String -Pattern "(?<!\*)$datestr" -SimpleMatch -Quiet # No output
'2023-11-23' | Select-String -Pattern "(?<!\*)$datestr" -Quiet              # True

In summary, your condition should be:

if (Select-String -Path $holidays -Pattern "(?<!\*)$datestr" -Quiet) {
    # do stuff
}
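
To check this against the sample data from the question, here is a minimal sketch. It assumes a throwaway temp file (holidays-demo.txt, a made-up name) stands in for $holidays, and it wraps $datestr in [regex]::Escape() as a precaution; that isn't strictly needed for a yyyy-MM-dd date, which contains no regex metacharacters.

```powershell
# Hypothetical stand-in for the holidays file, using the dates from the question.
$holidays = Join-Path ([System.IO.Path]::GetTempPath()) 'holidays-demo.txt'
Set-Content -Path $holidays -Value '2023-09-04', '2023-11-23', '*2023-11-24', '2023-12-25'

$results = [ordered]@{}
foreach ($datestr in '2023-11-23', '2023-11-24') {
    # Regex match (no -SimpleMatch): the date must not be preceded by a literal *.
    $pattern = '(?<!\*)' + [regex]::Escape($datestr)
    # The [bool] cast turns "no output" into $false.
    $results[$datestr] = [bool] (Select-String -Path $holidays -Pattern $pattern -Quiet)
}
$results   # 2023-11-23 -> True, 2023-11-24 -> False

Remove-Item -Path $holidays
```

The starred date 2023-11-24 is rejected by the lookbehind, while the unstarred 2023-11-23 still matches.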

Let me complement Santiago Squarzon's helpful answer with some background information:

  • Select-String interprets its -Pattern argument(s) as regexes by default.

  • It is only if you want literal substring matching that you need -SimpleMatch.

  • In either case, matching is case-insensitive by default (as PowerShell generally is); use
    -CaseSensitive as needed.

  • -SimpleMatch is not to be confused with these other switches, which are unrelated to whether -Pattern arguments are interpreted as regexes or literal strings:

    • -Raw - available in PowerShell (Core) 7+ only - emits matching input strings (lines) as-is instead of the default behavior of reporting them wrapped in Microsoft.PowerShell.Commands.MatchInfo instances, which supplement a matching input string with metadata about the match.

    • -AllMatches (incompatible with -Raw and -SimpleMatch) requests that, on each input string (line), all matches for the pattern(s) be looked for instead of just the first.

      • Note: -Raw emits strings only, so by definition no information about which parts of each string matched can be reported; similarly, -SimpleMatch records no such information, even though with multiple -Pattern arguments it may be of interest; also, unfortunately, as of v7.4.0, combining -SimpleMatch with -AllMatches is quietly accepted, even though -AllMatches then has no effect - see GitHub issue #11091

      • The .Matches property of MatchInfo objects is a collection of System.Text.RegularExpressions.Match objects precisely in order to support -AllMatches. Without the latter, the .Matches collection only ever has one entry.

    • -List stops processing further input strings once a string with match(es) is found, and outputs a MatchInfo instance (or a [string] instance with -Raw) only for that input, i.e. at most one per input file - this can be an important performance improvement.

    • -Quiet acts like -List, except that instead of emitting a MatchInfo instance it emits $true, i.e. an abstract indicator that a match was found.

      • Note: While it stands to reason that $false would be emitted if no matches are found, this is not the case as of PowerShell (Core) 7.4.0; instead, there is no output; see GitHub issue #16681.
        That said, in an implied Boolean context, no output is tantamount to $false; see the conceptual about_Booleans help topic.
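
A few of the points above can be tried out directly. This is only a sketch: the input strings and variable names are made up for illustration, and it uses nothing beyond the switches already mentioned.

```powershell
$line = 'a1 b22 c333'

# Default: only the first match per input string is recorded in .Matches.
$first = $line | Select-String -Pattern '\d+'
# -AllMatches: every match on the line is recorded.
$all = $line | Select-String -Pattern '\d+' -AllMatches

$firstCount = $first.Matches.Count          # 1
$allValues  = $all.Matches.Value -join ','  # 1,22,333

# Matching is case-insensitive unless -CaseSensitive is passed.
$insensitive = [bool] ('ABC' | Select-String -Pattern 'abc' -Quiet)                 # True
$sensitive   = [bool] ('ABC' | Select-String -Pattern 'abc' -CaseSensitive -Quiet)  # False

# -SimpleMatch treats the pattern as a literal substring: '.' only matches a real dot.
$literalDot = [bool] ('abc' | Select-String -Pattern '.' -SimpleMatch -Quiet)  # False
$regexDot   = [bool] ('abc' | Select-String -Pattern '.' -Quiet)               # True

$firstCount, $allValues, $insensitive, $sensitive, $literalDot, $regexDot
```

The [bool] casts are there because of the -Quiet behavior noted above: casting guarantees a real $false rather than no output when nothing matches.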