emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Tree-sitter integration on feature/tree-sitter


From: Theodor Thornhill
Subject: Re: Tree-sitter integration on feature/tree-sitter
Date: Tue, 17 May 2022 23:45:53 +0200

Yuan Fu <casouri@gmail.com> writes:

>> On May 13, 2022, at 10:03 PM, Theodor Thornhill <theo@thornhill.no> wrote:
>> 
>>> 
>>> Now there is treesit-beginning/end-of-defun. You just need to set 
>>> treesit-defun-query and everything else come for free. I needed to invent 
>>> some heavy machinery for that, resulting in some new handy functions:
>>> 
>>> - treesit-traverse-depth-first
>>> - treesit-traverse-breadth-first
>>> - treesit-traverse-forward-depth-first (maybe this should be named simply 
>>> treesit-traverse-forward?)
>>> 
>>> - treesit-search-forward
>>> - treesit-search-beginning
>>> They are untested & undocumented (in manual), so please play with them and 
>>> report problems :-)
>>> 

I've been testing the provided functionality for beginning/end-of-defun,
and I have some thoughts I'd like to share.

For starters, let me just give some context.  The implementation I've
used so far before the provided version looks something like
```
(defun typescript-mode-move-to-node (fn)
  (when-let ((found-node
              (treesit-parent-until
               (treesit-node-at (point))
               (lambda (parent)
                 (treesit-query-capture
                  parent
                  typescript-mode--defun-query)))))
    (goto-char (funcall fn found-node))))

(defun typescript-mode-beginning-of-defun (&optional arg)
  (typescript-mode-move-to-node #'treesit-node-start))

(defun typescript-mode-end-of-defun (&optional arg)
  (typescript-mode-move-to-node #'treesit-node-end))
```

If this is given a query such as

```
(defvar typescript-mode--defun-query
  "[(import_statement)
    (function_declaration)
    (type_alias_declaration)
    (interface_declaration)
    (lexical_declaration)] @defun")
```

we would traverse parentwise and locate a node on match.  This
implementation is very fast, but has an issue - it will only match in
the parentwise path, so siblings will not be found.  This makes my
function useful, but not general enough.  The version provided in-tree
right now uses the depth first approach, which has two big problems -
performance and inconsistency.

Its docstring notes:
```
Traversing forward depth-first means, for a tree like the below
where NODE is marked 1, traverse as numbered:

                16
                |
       3--------4-----------8
       |        |           |
  o--o-+--1  5--+--6    9---+-----12
  |  |    |        |    |         |
  o  o    2        7  +-+-+    +--+--+
                      |   |    |  |  |
                      10  11   13 14 15
```

This means that if we start at node 1, I'd expect us to navigate to the
nodes 3 - 4 - 8 - 16, when repeatedly pressing the beginning-of-defun.
However, because we go depth first, we can end up landing at say, node
14, which feels unnatural.  This can happen for example in javascript if
we add arrow_function to the nodes to match.  If node 14 contains such a
node, the traversing order would look like this: 3 - 4 - 8 - 14 - 16.
This feels odd, or at least differs from how normal emacs operates.  In
addition, when the search gets long, it can take up to a second on my
system to find the beginning of a defun, because of the amount of
traversing required by the depth first algorithm.

I have a suggestion for a solution that you may consider.

Either add a new defcustom 'treesit-depth-first-go-deep', or add a new
param to 'treesit-traverse-depth-first', like so:
```
(defun treesit-traverse-depth-first (node pred &optional step go-deep)
  (if (funcall pred node)
      node
    (and go-deep
      (cl-loop for child in (if (or (null step) (>= step 0))
                              (treesit-node-children node)
                            (nreverse (treesit-node-children node)))
             if (treesit-traverse-depth-first child pred step)
             return child))))
```

This way we can avoid traversing deep into the subtrees, which is a slow
operation, _and_ makes for an inconsistent experience.  Setting go-deep
as nil makes the function really fast, and also keeps the benefit of
finding siblings.

Another option is to not provide a generic depth first algorithm, and
only go for siblings and parents, but we may want the depth first for
other things, such as a generic 'treesit-goto-thing' function.

What do you think?

Theodor



reply via email to

[Prev in Thread] Current Thread [Next in Thread]