font-lock-add-keywords regexp for C++ operators

I have a regexp that finds and highlights operators in C++, but I can't for the life of me work out how to match operator[] with it. The usual trick of escaping the characters doesn't seem to work; it merely ends matching.

    (font-lock-add-keywords
     nil '(
       ;; operators
       ("[~^&\|!<>:=,.\\+*/%-]" . font-lock-keyword-face) ))

My second (incomplete) attempt, using regexp-builder and moving the escaped symbols to the end of the character set, got me the opening bracket:

    ("[~^=<>&/.,\[\|\*\+\-]" . font-lock-keyword-face)

but adding \] or moving \[ kills any matching. What am I missing?

1
legoscia (BEST ANSWER)

To match a literal ], put it right after the opening [:

    "[]~^=<>&/.,\[\|\*\+\-]"

Since an empty character set wouldn't make any sense in a regexp, a ] in that position is instead interpreted as a literal ].
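
A minimal sketch of the rule above, which can be evaluated in *scratch* or with M-:. The name my-cxx-operator-regexp is hypothetical and only used for illustration; the backslashes from the original pattern are left out because nothing inside a character set needs escaping.

```elisp
;; Sketch only: `my-cxx-operator-regexp' is a made-up name, not part of Emacs.
;; `]' comes first so that it is literal; `-' comes last so that it
;; is not read as a range.
(defconst my-cxx-operator-regexp "[]~^=<>&/.,[|*+-]"
  "Character set matching single-character C++ operators, including [ and ].")

;; `string-match-p' returns the index of the first match, or nil.
(string-match-p my-cxx-operator-regexp "a[0]")   ; matches the [ at index 1
(string-match-p my-cxx-operator-regexp "]")      ; matches at index 0
(string-match-p my-cxx-operator-regexp "abc")    ; nil, no operator present

;; Registering it for font-lock, in the same style as the question:
(font-lock-add-keywords
 'c++-mode
 `((,my-cxx-operator-regexp . font-lock-keyword-face)))
```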

3
AudioBubble

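Outside a character set, you need two consecutive backslashes in a string literal to escape a character for the regular expression, because the Emacs Lisp reader consumes the first one. Inside [...] no escaping is needed (or possible); the only requirement is that ] comes first, i.e.

    "[]~^&|!<>:=,.+\\[*/%-]"

A single backslash is interpreted by the Emacs Lisp reader when parsing the string literal, and denotes escape sequences, e.g. \n for a newline character. If the backslash doesn't start a known escape sequence, as in this case (\[ and \] are not special), the backslash is simply dropped. So \] puts a bare ] into the resulting string, which closes the character set early; the rest of the pattern is then read as ordinary regexp text, and the operators no longer match. Note that the \\ inside the brackets above does not escape anything: it adds a literal backslash to the set, which is harmless here.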
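
The reader behaviour can be checked directly. The snippet below is only an illustration with a deliberately shortened character set (~, ] and =), not the full operator pattern.

```elisp
;; The reader drops a backslash that does not start a known escape sequence.
(string= "\[" "[")                    ; t, both strings hold just [
(length "\\[")                        ; 2, a backslash followed by [

;; "[~\]=]" is read as the string [~]=] : the set [~] followed by a literal =]
(string-match-p "[~\]=]" "]")         ; nil, ] is not in the set
(string-match-p "[~\]=]" "~=]")       ; 0, only this odd sequence matches

;; With ] placed first, the set contains ], ~ and = as intended.
(string-match-p "[]~=]" "]")          ; 0
```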

1
jpkotta

I use regexp-opt for this, because it's much more readable. Relevant config:

    (setq operators-font-lock-spec
          (cons (regexp-opt '("+" "-" "*" "/" "%" "!"
                              "&" "^" "~" "|"
                              "=" "<" ">"
                              "." "," ";" ":" "?"))
                (list
                 0 ;; use whole match
                 'font-lock-builtin-face
                 'keep ;; OVERRIDE
                 )))

    (setq brackets-font-lock-spec
          (cons (regexp-opt '("(" ")" "[" "]" "{" "}"))
                (list
                 0 ;; use whole match
                 'font-lock-bracket-face
                 'keep ;; OVERRIDE
                 )))

    (font-lock-add-keywords
     'c++-mode
     (list
      operators-font-lock-spec
      brackets-font-lock-spec
      (cons c-types-regexp 'font-lock-type-face)))

brackets-font-lock-spec is separate because I use a different face for them.
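
A quick way to see that regexp-opt takes care of the bracket placement is to test its result directly. This is a sketch: my-bracket-regexp is a hypothetical name, and the exact string regexp-opt returns may differ between Emacs versions, so only the match results are shown. As far as I know, font-lock-bracket-face is only predefined from Emacs 29 onward; on older versions it would have to be defined with defface first.

```elisp
;; Sketch only: `my-bracket-regexp' is a made-up name for illustration.
(defconst my-bracket-regexp
  (regexp-opt '("(" ")" "[" "]" "{" "}"))
  "Regexp matching any single bracket character.")

(string-match-p my-bracket-regexp "v[i]")   ; matches the [ at index 1
(string-match-p my-bracket-regexp "]")      ; matches at index 0
(string-match-p my-bracket-regexp "abc")    ; nil

;; The order of the input strings does not matter to the caller.
(string-match-p (regexp-opt '("[" "]")) "]")   ; 0
(string-match-p (regexp-opt '("]" "[")) "]")   ; 0
```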