bug-gnu-emacs

bug#67262: python-ts-mode cannot identify triple-quoted-strings


From: Dmitry Gutov
Subject: bug#67262: python-ts-mode cannot identify triple-quoted-strings
Date: Sun, 26 Nov 2023 04:04:07 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0

On 25/11/2023 16:42, JD Smith wrote:
> Bridging emacs syntax to treesitter in a robust way seems like it could be a
> subtle enterprise, so I’d prefer to leave that to one of the experts.  Right
> now the syntax-propertize-function in python-mode does one simple thing: ensure
> triple quotes are properly marked as strings.  Since the treesitter grammar
> doesn’t distinguish between different flavors of strings, something similar
> would still be needed, if we want to continue to treat various string flavors
> distinctly using syntax.
>
> Is moving all syntactification (beyond just font-lock) over to TS an explicit
> goal for all the *-ts-mode’s?

It would make sense - since this way we would only have one source of syntax-recognition bugs, rather than two (both the grammar and the definition in Elisp).

Attached is a patch you can try (that uses treesit for s-p-f).

Unfortunately, it's not quite perfect (nor is python-syntax-stringify, according to its FIXME inside): after certain modifications, the syntax-table property is not applied.
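
For readers without the attachment, the propertize logic visible in the backtrace further down can be written out roughly as follows. This is only a reading of those backtrace frames, not the patch itself: the names my/triple-quote-re and my/python-ts-syntax-propertize are hypothetical, the debugging `message' call is left out, and the node type names are simply the ones that appear in the trace.

```elisp
(require 'treesit)

(defconst my/triple-quote-re "\\(?:\"\"\"\\|'''\\)"
  "Regexp matching a triple-quote delimiter (hypothetical name).")

(defun my/python-ts-syntax-propertize (start end)
  "Mark triple-quote delimiters between START and END as string fences.
Hypothetical name; the logic follows the frames in the backtrace."
  (goto-char start)
  (while (and (< (point) end)
              (re-search-forward my/triple-quote-re end t))
    ;; Point is now just after the three quote characters.
    (let ((node (treesit-node-at (point))))
      (cond
       ;; Apparently the opening delimiter: the node at point is the
       ;; string body, so the first of the three quotes gets the
       ;; generic string fence syntax ("|").
       ((equal (treesit-node-type node) "string_content")
        (put-text-property (- (point) 3) (- (point) 2)
                           'syntax-table (string-to-syntax "|")))
       ;; Apparently the closing delimiter: the node is the quote token
       ;; itself, starting three characters back; fence the last quote.
       ((and (equal (treesit-node-type node) "\"")
             (= (treesit-node-start node) (- (point) 3)))
        (put-text-property (1- (point)) (point)
                           'syntax-table (string-to-syntax "|")))))))
```

A mode would install such a function with (setq-local syntax-propertize-function #'my/python-ts-syntax-propertize).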

I've done some print-debugging in python--treesit-parser-after-change, and it looks like the problem is this: in certain cases (e.g. when electric-pair-post-self-insert-function fires) the parser notifier fires only after syntax-propertize has been called -- and it fires inside of it. Meaning it's too late to flush the syntax-propertize cache at that point.

The reason for it is, overall, the fact that we trigger the parser's after-change notifiers lazily: only after some other feature has to initialize the parser, calling treesit_ensure_parsed from treesit-parser-root-node.
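
To make the timing issue concrete, here is a minimal sketch of a notifier of the kind described above, reconstructed from what the backtrace shows (flush the syntax-ppss cache from the earliest reparsed position). The names my/ranges-earliest and my/flush-syntax-after-reparse are hypothetical, and the registration line is an assumption about how a mode would wire it up.

```elisp
(require 'treesit)
(require 'syntax)

(defun my/ranges-earliest (ranges)
  "Return the smallest start position in RANGES, a list of (BEG . END)."
  (apply #'min (mapcar #'car ranges)))

(defun my/flush-syntax-after-reparse (ranges parser)
  "Parser notifier: flush the syntax cache from the earliest changed position.
RANGES and PARSER are the two arguments every notifier receives."
  (when ranges
    (with-current-buffer (treesit-parser-buffer parser)
      (syntax-ppss-flush-cache (my/ranges-earliest ranges)))))

;; Assumed wiring, e.g. in the mode's setup code:
;; (treesit-parser-add-notifier (treesit-parser-create 'python)
;;                              #'my/flush-syntax-after-reparse)
```

Registering the notifier does not make it run on every edit. It runs when something next asks the parser for its tree, and if the first such caller is the syntax-propertize function itself, the flush arrives while syntax-propertize is already in progress.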

I think bug#66732 might also be a variation of this problem.

As for what to do about this one -- probably something involving syntax-propertize-extend-region-functions, adding an entry which would initialize the parser, but not call syntax-ppss-flush-cache directly (or at least not just that). It would signal the earlier position to extend to through some dynamic variable. This is getting tricky enough to move from the individual major modes into treesit.el proper, I think.
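
A rough sketch of that idea, under several assumptions: all my/... names are hypothetical, a buffer-local variable stands in for the "dynamic variable", and whether the syntax-ppss cache also needs flushing at this point is left open. The extend-region entry forces a reparse first, so the notifier runs before the propertize function does, and then reports the earlier position by returning an extended region.

```elisp
(require 'treesit)
(require 'syntax)

(defvar-local my/treesit-earliest-change nil
  "Earliest position reported by the parser notifier, or nil (hypothetical).")

(defun my/treesit-note-change (ranges parser)
  "Parser notifier: record the earliest start in RANGES instead of flushing."
  (when ranges
    (with-current-buffer (treesit-parser-buffer parser)
      (let ((beg (apply #'min (mapcar #'car ranges))))
        (setq my/treesit-earliest-change
              (min beg (or my/treesit-earliest-change beg)))))))

(defun my/treesit-take-extension (start end)
  "Return (NEW-START . END) if a recorded change precedes START, else nil.
The recorded position is cleared either way."
  (let ((beg my/treesit-earliest-change))
    (setq my/treesit-earliest-change nil)
    (and beg (< beg start) (cons beg end))))

(defun my/treesit-extend-region (start end)
  "Entry for `syntax-propertize-extend-region-functions'."
  ;; Asking each parser for its root node forces a pending reparse,
  ;; which in turn runs the notifiers.
  (dolist (parser (treesit-parser-list))
    (treesit-parser-root-node parser))
  (my/treesit-take-extension start end))

;; Assumed wiring in the mode's setup code:
;; (treesit-parser-add-notifier (treesit-parser-create 'python)
;;                              #'my/treesit-note-change)
;; (add-hook 'syntax-propertize-extend-region-functions
;;           #'my/treesit-extend-region nil 'local)
```

With the values from the backtrace below (reparsed range starting at 27, propertize request for 39 to 50), the entry would extend the region to start at 27.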

Yuan and others, thoughts welcome.

JD, I do believe the attached patch is TRT (or close to it), but depending on how it works for you, and how quickly we deal with the above problem, it might make sense to enact your original suggestion first.

And finally, here's the backtrace that led me to the above conclusions:

  backtrace()
  (message "in progress, backtrace %s" (backtrace))
  (progn (message "in progress, backtrace %s" (backtrace)))
  (if (syntax-propertize--in-process-p) (progn (message "in progress, backtrace %s" (backtrace))))
  (save-current-buffer (set-buffer (treesit-parser-buffer parser)) (message "flushing %s up to %s" ranges (let* ((--cl-var-- ranges) (r nil) (--cl-var-- nil)) (while (consp --cl-var--) (setq r (car --cl-var--)) (let* ((temp (car r))) (setq --cl-var-- (if --cl-var-- (min --cl-var-- temp) temp))) (setq --cl-var-- (cdr --cl-var--))) --cl-var--)) (syntax-ppss-flush-cache (let* ((--cl-var-- ranges) (r nil) (--cl-var-- nil)) (while (consp --cl-var--) (setq r (car --cl-var--)) (let* ((temp (car r))) (setq --cl-var-- (if --cl-var-- (min --cl-var-- temp) temp))) (setq --cl-var-- (cdr --cl-var--))) --cl-var--)) (if (syntax-propertize--in-process-p) (progn (message "in progress, backtrace %s" (backtrace)))) (message "flushed up to %d, %s" syntax-propertize--done syntax-ppss-wide))
  (progn (save-current-buffer (set-buffer (treesit-parser-buffer parser)) (message "flushing %s up to %s" ranges (let* ((--cl-var-- ranges) (r nil) (--cl-var-- nil)) (while (consp --cl-var--) (setq r (car --cl-var--)) (let* ((temp ...)) (setq --cl-var-- (if --cl-var-- ... temp))) (setq --cl-var-- (cdr --cl-var--))) --cl-var--)) (syntax-ppss-flush-cache (let* ((--cl-var-- ranges) (r nil) (--cl-var-- nil)) (while (consp --cl-var--) (setq r (car --cl-var--)) (let* ((temp ...)) (setq --cl-var-- (if --cl-var-- ... temp))) (setq --cl-var-- (cdr --cl-var--))) --cl-var--)) (if (syntax-propertize--in-process-p) (progn (message "in progress, backtrace %s" (backtrace)))) (message "flushed up to %d, %s" syntax-propertize--done syntax-ppss-wide)))
  (if ranges (progn (save-current-buffer (set-buffer (treesit-parser-buffer parser)) (message "flushing %s up to %s" ranges (let* ((--cl-var-- ranges) (r nil) (--cl-var-- nil)) (while (consp --cl-var--) (setq r (car --cl-var--)) (let* (...) (setq --cl-var-- ...)) (setq --cl-var-- (cdr --cl-var--))) --cl-var--)) (syntax-ppss-flush-cache (let* ((--cl-var-- ranges) (r nil) (--cl-var-- nil)) (while (consp --cl-var--) (setq r (car --cl-var--)) (let* (...) (setq --cl-var-- ...)) (setq --cl-var-- (cdr --cl-var--))) --cl-var--)) (if (syntax-propertize--in-process-p) (progn (message "in progress, backtrace %s" (backtrace)))) (message "flushed up to %d, %s" syntax-propertize--done syntax-ppss-wide))))
  python--treesit-parser-after-change(((27 . 50)) #<treesit-parser for python>)
  treesit-buffer-root-node(python)
  treesit-node-at(42)
  (let ((node (treesit-node-at (point)))) (cond ((equal (treesit-node-type node) "string_content") (put-text-property (- (point) 3) (- (point) 2) 'syntax-table (string-to-syntax "|"))) ((and (equal (treesit-node-type node) "\"") (= (treesit-node-start node) (- (point) 3))) (put-text-property (1- (point)) (point) 'syntax-table (string-to-syntax "|")))))
  (cond (t (message "pt %s" (point)) (let ((node (treesit-node-at (point)))) (cond ((equal (treesit-node-type node) "string_content") (put-text-property (- (point) 3) (- (point) 2) 'syntax-table (string-to-syntax "|"))) ((and (equal (treesit-node-type node) "\"") (= (treesit-node-start node) (- ... 3))) (put-text-property (1- (point)) (point) 'syntax-table (string-to-syntax "|")))))))
  (while (and (< (point) end) (re-search-forward "\\(?:\"\"\"\\|'''\\)" end t)) (cond (t (message "pt %s" (point)) (let ((node (treesit-node-at (point)))) (cond ((equal (treesit-node-type node) "string_content") (put-text-property (- ... 3) (- ... 2) 'syntax-table (string-to-syntax "|"))) ((and (equal ... "\"") (= ... ...)) (put-text-property (1- ...) (point) 'syntax-table (string-to-syntax "|"))))))))
  (closure (t) (start end) (goto-char start) (while (and (< (point) end) (re-search-forward "\\(?:\"\"\"\\|'''\\)" end t)) (cond (t (message "pt %s" (point)) (let ((node ...)) (cond (... ...) (... ...)))))))(39 50)
  funcall((closure (t) (start end) (goto-char start) (while (and (< (point) end) (re-search-forward "\\(?:\"\"\"\\|'''\\)" end t)) (cond (t (message "pt %s" (point)) (let ((node ...)) (cond (... ...) (... ...))))))) 39 50)
  python--treesit-syntax-propertize-function-1(39 50)
  syntax-propertize(42)
  syntax-ppss(42)
  electric-pair-syntax-info(39)
  electric-pair-post-self-insert-function()
  self-insert-command(1 39)
  funcall-interactively(self-insert-command 1 39)
  #<subr call-interactively>(self-insert-command nil nil)
  call-interactively@ido-cr+-record-current-command(#<subr call-interactively> self-insert-command nil nil)
  apply(call-interactively@ido-cr+-record-current-command #<subr call-interactively> (self-insert-command nil nil))
  call-interactively(self-insert-command nil nil)
  command-execute(self-insert-command)

Attachment: python--treesit-syntax-propertize-function.diff
Description: Text Data

