regexp font-lock highlighting
From: martin rudalics
Subject: regexp font-lock highlighting
Date: Mon, 30 May 2005 10:41:25 +0200
User-agent: Mozilla Thunderbird 1.0 (Windows/20041206)
The recent modification of `lisp-font-lock-keywords-2' to highlight
subexpressions of regexps has two minor bugs:
(1) If you write the regexp that matches the string "\\)" as
"\\\\\\\\)", the last three chars of that regexp are highlighted with
`font-lock-comment-face'.
(2) If the region enclosed by the arguments START and END of
`font-lock-fontify-keywords-region' contains one of "\\(", "\\|",
"\\)" within a comment, doc-string, or key definition, all
subsequent occurrences within a normal string are _not_ highlighted.
`font-lock-fontify-keywords-region' goes to START when it evaluates
your lambda, decides that the expression should not get highlighted
since it has the wrong face, and wrongly concludes that no such
expression exists up to END.
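Problem (2) can be reproduced outside font-lock with a minimal sketch.
The names `my-string-face-p', `my-naive-matcher', `my-looping-matcher'
and `my-demo', as well as the simplified regexp, are assumptions made
for illustration and are not the code currently in
`lisp-font-lock-keywords-2'. The naive matcher gives up after the first
hit with the wrong face, while the looping one keeps searching up to
BOUND:

```elisp
(defun my-string-face-p (pos)
  "Return non-nil if the `face' property at POS includes the string face."
  (let ((face (get-text-property pos 'face)))
    (if (listp face)
        (memq 'font-lock-string-face face)
      (eq face 'font-lock-string-face))))

;; Hypothetical simplification of the behavior described in (2):
;; search once and return nil as soon as the first hit has the wrong face.
(defun my-naive-matcher (bound)
  "Search once up to BOUND; fail if the first hit is not in a string."
  (and (re-search-forward "\\\\\\\\[(|)]" bound t)
       (my-string-face-p (1- (point)))
       t))

;; Keep searching until a hit inside a string is found or BOUND is reached.
(defun my-looping-matcher (bound)
  "Search up to BOUND until a hit inside a string is found."
  (catch 'found
    (while (re-search-forward "\\\\\\\\[(|)]" bound t)
      (when (my-string-face-p (1- (point)))
        (throw 'found t)))))

(defun my-demo (matcher)
  "Call MATCHER on a buffer holding a comment followed by a string."
  (with-temp-buffer
    ;; The faces are set by hand here; font-lock itself is not involved.
    (insert (propertize ";; \\\\( in a comment\n"
                        'face 'font-lock-comment-face)
            (propertize "\"\\\\(\"" 'face 'font-lock-string-face))
    (goto-char (point-min))
    (funcall matcher (point-max))))
```

With the comment placed before the string, `(my-demo #'my-naive-matcher)'
returns nil and `(my-demo #'my-looping-matcher)' returns t.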
The following lambda should avoid these problems:
((lambda (bound)
   (catch 'found
     (while (re-search-forward
             "\\(\\\\\\\\\\)\\(?:\\(\\\\\\\\\\)\\|\\([(|)]\\)\\(\\?:\\)?\\)" bound t)
       (unless (match-beginning 2)
         (let ((face (get-text-property (1- (point)) 'face)))
           (when (or (and (listp face)
                          (memq 'font-lock-string-face face))
                     (eq 'font-lock-string-face face))
             (throw 'found t)))))))
 ;; Should we introduce a lowlight face for this?
 ;; Ideally that would retain the color, dimmed.
 (1 'font-lock-comment-face prepend)
 (3 'bold prepend)
 (4 font-lock-type-face prepend t))
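To see how consuming pairs of escaped backslashes (group 2) avoids
problem (1), the regexp of the lambda above can be tried on plain buffer
text, leaving out the face check. This is only a sketch;
`my-regexp-construct-re' and `my-count-regexp-constructs' are
hypothetical names introduced for the example:

```elisp
;; Same regexp as in the lambda above, under a hypothetical name.
(defconst my-regexp-construct-re
  "\\(\\\\\\\\\\)\\(?:\\(\\\\\\\\\\)\\|\\([(|)]\\)\\(\\?:\\)?\\)"
  "Regexp matching escaped backslash pairs or regexp grouping constructs.")

(defun my-count-regexp-constructs (text)
  "Return the number of regexp grouping constructs found in TEXT.
TEXT is source text as it appears in a buffer.  Matches of group 2
\(an escaped backslash pair) are consumed but not counted."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((count 0))
      (while (re-search-forward my-regexp-construct-re nil t)
        (unless (match-beginning 2)
          (setq count (1+ count))))
      count)))
```

For the source text of the regexp in (1), eight backslashes followed by
")", the count is 0 because both matches are escaped backslash pairs.
For the source text "\\(foo\\|bar\\)" the count is 3.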
Moreover, I don't think that anything is "broken" in the following:
;; Underline innermost grouping, so that you can more easily see what
;; belongs together. 2005-05-12: Font-lock can go into an
;; unbreakable endless loop on this -- something's broken.
;;("[\\][\\][(]\\(?:\\?:\\)?\\(\\(?:[^\\\"]+\\|[\\]\\(?:[^\\]\\|[\\][^(]\\)\\)+?\\)[\\][\\][)]"
;;1 'underline prepend)
I believe that `font-lock-fontify-keywords-region' starts backtracking
and this can take hours in more complicated cases. Anyway, regexps are
not suited to handle this. If you are willing to pay for two additional
buffer-local variables such as
(defvar regexp-left-paren nil
  "Position of innermost unmatched \"\\\\(\".
The value of this variable is valid iff `regexp-left-paren-end' equals the upper
bound of the region `font-lock-fontify-keywords-region' currently
investigates.")
(make-variable-buffer-local 'regexp-left-paren)
(defvar regexp-left-paren-end 0
  "Buffer position indicating whether the value of `regexp-left-paren' is valid.
If the value of this variable equals the value of the upper bound of the region
investigated by `font-lock-fontify-keywords-region' the current value of
`regexp-left-paren' is valid.")
(make-variable-buffer-local 'regexp-left-paren-end)
the following modification of the above lambda expression should handle
this problem:
((lambda (bound)
   (catch 'found
     (while (re-search-forward
             "\\(\\\\\\\\\\)\\(?:\\(\\\\\\\\\\)\\|\\(\\((\\)\\|\\(|\\)\\|\\()\\)\\)\\)"
             bound t)
       (when (match-beginning 3)
         (let ((face (get-text-property (1- (point)) 'face))
               match-data-length)
           (when (or (and (listp face)
                          (memq 'font-lock-string-face face))
                     (eq 'font-lock-string-face face))
             (cond
              ((match-beginning 4) ; \\(
               (setq regexp-left-paren (match-end 4))
               (setq regexp-left-paren-end bound)
               (set-match-data
                (append (butlast (match-data) 2)
                        (list (point-min-marker) (point-min-marker)))))
              ((match-beginning 5) ; \\|
               (set-match-data
                (append (butlast (match-data) 4)
                        (list (point-min-marker) (point-min-marker)))))
              ((match-beginning 6) ; \\)
               (set-match-data
                (append (butlast (match-data) 6)
                        (if (= regexp-left-paren-end bound)
                            (list (copy-marker regexp-left-paren)
                                  (match-beginning 6))
                          (list (point-min-marker) (point-min-marker)))))
               (setq regexp-left-paren nil)
               (setq regexp-left-paren-end 0)))
             (throw 'found t)))))))
 ;; Should we introduce a lowlight face for this?
 ;; Ideally that would retain the color, dimmed.
 (1 'font-lock-comment-face prepend)
 (3 'bold prepend)
 (4 'underline prepend))
I have tried this on some elisp files on which the original solution
choked and did not encounter any problems. Note that I removed the
"\\(\\?:\\)?" since I find it distracting to put yet another face here.
If you believe that you _really_ need it you will have to reinsert it,
but in that case you have to modify match-data cropping as well. (I do
have to modify match-data since redisplay wants some valid buffer
positions for highlighting.)
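The `butlast' counts 2, 4 and 6 in the lambda above rely on
`match-data' omitting trailing subexpressions that did not match. A
small sketch makes this visible; `my-regexp-paren-re' and
`my-match-data-length' are hypothetical names used only here:

```elisp
;; Same regexp as in the second lambda above, under a hypothetical name.
(defconst my-regexp-paren-re
  "\\(\\\\\\\\\\)\\(?:\\(\\\\\\\\\\)\\|\\(\\((\\)\\|\\(|\\)\\|\\()\\)\\)\\)"
  "Regexp whose groups 4, 5 and 6 match \"(\", \"|\" and \")\" respectively.")

(defun my-match-data-length (text)
  "Return the length of the match data after the first match in TEXT.
Return nil if TEXT contains no match."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (when (re-search-forward my-regexp-paren-re nil t)
      ;; Each subexpression contributes two entries (start and end);
      ;; trailing subexpressions that did not match are left out.
      (length (match-data)))))
```

For the buffer text "\\(" the list has 10 entries (groups 0 to 4), for
"\\|" 12 entries and for "\\)" 14 entries. Dropping the last 2, 4 or 6
entries therefore always leaves groups 0 to 3, and the appended pair
becomes the new group 4.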
Finally, I would use three distinct font-lock faces for regexps:
- One face for highlighting the "\\"s which by default should inherit
from `font-lock-string-face' with a dimmed foreground - I'm using
Green4 for strings and PaleGreen3 for the "\\"s. Anyone who doesn't
like the highlighting could revert to `font-lock-string-face'.
- One face for highlighting the "(", "|" and ")" in these expressions.
I find `bold' good here but again would leave it to the user whether
she wants to turn this highlighting off. Moreover, such a face could
allow paren-highlighting to _never_ match a paren that has this face
against a paren that has another face. Consequently, paren-matching
could finally provide more trustworthy information within regular
expressions.
- One face for highlighting the innermost grouping. Basically,
`underline' is not bad here but appears a bit noisy in multiline
expressions or things like
(concat "\\("
some-string
"\\)")
I'm using a background which is slightly darker than the default
background and gives regular expressions a very distinctive
appearance. Anyway, users should be allowed to turn highlighting off
by using the default face.
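The three faces could be declared roughly as follows. This is a sketch
only: the face names are hypothetical, and "gray90" is an assumed value
for a background slightly darker than a white default background.

```elisp
;; Hypothetical face names; none of these exist in Emacs.
(defface my-regexp-backslash-face
  '((t (:inherit font-lock-string-face :foreground "PaleGreen3")))
  "Face for the backslashes that introduce a regexp grouping construct.")

(defface my-regexp-construct-face
  '((t (:inherit bold)))
  "Face for the paren or bar of a regexp grouping construct.")

(defface my-regexp-group-face
  ;; Assumed color; it should be adapted to the default background.
  '((t (:background "gray90")))
  "Face for the innermost regexp grouping.")
```

A user who dislikes the highlighting could customize the first face to
look like `font-lock-string-face' and remove the attributes of the
other two.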