Semgrep: A scalable way of catching all cases of multiline f-strings


I have some logs in my codebase that have multiline f-strings, such as:

...
logger.error(
    f'...'
    f'...'
    f'...'
    f'...'
    f'...'
    f'...'
)

Some calls have only two f'...' literals on separate lines, others have three, and so on.

I am currently duplicating patterns to catch such logs. For example:

...
patterns:
  - pattern-either:
      - pattern: |
          logger.$METHOD(
            f'...'
            f'...'
            f'...'
          )
      - pattern: |
          logger.$METHOD(
            f'...'
            f'...'
            f'...'
            f'...'
            f'...'
          )

This catches the calls with 3 and 5 f'...' literals on multiple lines. I have to write another pattern for those with 2, 4, and so on.

Is there a scalable way to capture all of these with fewer patterns? The current approach won't scale, as there might be log calls with 6, 7, 8, 9 or more f'...' lines.

Sirjon On BEST ANSWER

A good solution has been posted here and a live demo here.

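In short, instead of duplicating the f'...' lines, the rule binds a metavariable, $X, to the message argument. Python joins adjacent string literals into a single expression, so $X covers the whole message no matter how many lines it spans. If the message does not match the plain string pattern "...", the call is flagged as suspect. The full rule is: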

rules:
  - id: test
    patterns:
      - pattern-either:
          - pattern: logger.$METHOD(..., $X, ...)
          - pattern: logger.$METHOD(..., message=$X, ...)
          - pattern: logger.$METHOD(..., msg=$X, ...)
      - metavariable-pattern:
          metavariable: $X
          patterns:
            - pattern-not: |
                "..."
    message: Semgrep found a match $X
    languages:
      - python
    severity: WARNING
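
To see why one metavariable is enough, it can help to check how Python itself parses such a call. The sketch below uses only the standard library ast module. The sample log call and the helper names (SOURCE, first_call, count_positional_args, first_arg_kind) are made up for illustration, and it shows Python's parser rather than Semgrep's internals:

```python
import ast

# Hypothetical log call with three f-string literals on separate lines.
SOURCE = '''
logger.error(
    f'user {user} failed '
    f'after {n} attempts '
    f'on host {host}'
)
'''


def first_call(source: str) -> ast.Call:
    """Return the first call expression found in the parsed source."""
    tree = ast.parse(source)
    return next(node for node in ast.walk(tree) if isinstance(node, ast.Call))


def count_positional_args(source: str) -> int:
    """Count the positional arguments of the first call in the source."""
    return len(first_call(source).args)


def first_arg_kind(source: str) -> str:
    """Name the AST node type of the first positional argument."""
    return type(first_call(source).args[0]).__name__


if __name__ == "__main__":
    # Adjacent literals are concatenated, so the call has a single argument.
    print(count_positional_args(SOURCE))
    # An f-string argument is a JoinedStr node, a plain string is a Constant.
    print(first_arg_kind(SOURCE))
    print(first_arg_kind('logger.info("a" "b")'))
```

However many f'...' lines the call contains, the parser sees one positional argument, which is consistent with the rule above needing only a single $X.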

All credit to lagoAbal.