emacs-devel

Re: Tree sitter support for C-like languages


From: Theodor Thornhill
Subject: Re: Tree sitter support for C-like languages
Date: Sun, 13 Nov 2022 10:40:26 +0100

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Theodor Thornhill <theo@thornhill.no>
>> Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org, 
>> monnier@iro.umontreal.ca
>> Date: Sat, 12 Nov 2022 21:14:21 +0100
>> 
>> Yuan Fu <casouri@gmail.com> writes:
>> 
>> >> See new patch here - following Stefans keen eye ;-)
>> >
>> > Applied and pushed, thanks ;-)
>> 
>> Great news!  Thanks, all!
>
> Thanks.  The new C mode looks good, but I have a couple of issues with
> it.
>

Great - thanks for looking.  I actually have answers too!

> First, something strange is going on when I type new code.  Here's a
> recipe:
>
>    emacs -Q
>    C-x C-f newfile.c RET
>    M-x c-ts-mode RET
>    Type:
>
> int
> foo (void)
> {
>
> At this point, "int" is in font-lock-warning-face -- why?
>

If you enable 'treesit-inspect-mode' and put point on 'int', you will
see it report the 'ERROR' node.  This node is font locked like that
because of the font lock rule I added for that case.  I think we can
remove it, but it does serve some useful purpose.
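
If you would rather check this from Lisp than through
'treesit-inspect-mode', something like the following rough sketch should
print the chain of node types from the root down to the node at point.
It assumes Emacs 29 or later with the tree-sitter C grammar installed,
and the function name 'my/c-node-path-at-point' is made up for
illustration; it is not part of c-ts-mode.

```elisp
(require 'treesit)

(defun my/c-node-path-at-point ()
  "Show node types from the root down to the node at point (sketch).
Hypothetical helper, not part of c-ts-mode."
  (interactive)
  (when (treesit-ready-p 'c t)
    ;; Reuse the buffer's C parser if one exists, else create one.
    (treesit-parser-create 'c)
    (let ((node (treesit-node-at (point) 'c))
          (types nil))
      ;; Walk up through the parents; pushing each type onto the
      ;; front of the list leaves the root's type first.
      (while node
        (push (treesit-node-type node) types)
        (setq node (treesit-node-parent node)))
      (message "%s" (mapconcat #'identity types " > ")))))
```

With point on 'int' in the incomplete function, an "ERROR" entry in that
chain is what triggers the warning face.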


> Next, with point after the brace, type RET -- this doesn't indent 2
> spaces, as I'd expect -- why?  Typing TAB to indent doesn't help,
> either.
>

This is because tree-sitter doesn't know what to do with it.  If you
instead type:

```
int
foo (void)
{}
```

It will know that it has a complete node and indent accordingly if you
press RET while inside the braces.  Alternatively, if we add an indent
rule like this one:

       (no-node parent-bol c-ts-mode-indent-offset)

then the indentation should happen as you want, even though we are in an
error state syntax-wise, at least once you have typed the statement you
mention just below.
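
To show where a rule like that would sit, here is a minimal sketch of an
indent rule list in the format 'treesit-simple-indent-rules' expects.
The variable name 'my/c-indent-rules-sketch', the second rule, and the
position of the 'no-node' rule are assumptions for illustration only;
the real list in c-ts-mode is longer and ordered differently.

```elisp
;; Sketch only: `my/c-indent-rules-sketch' is a hypothetical name.
;; Each rule is (MATCHER ANCHOR OFFSET); the first matching rule wins.
(defvar my/c-indent-rules-sketch
  '((c
     ;; No node starts at the beginning of the line (e.g. a fresh
     ;; line after RET): indent one step from the line where the
     ;; parent node starts.
     (no-node parent-bol c-ts-mode-indent-offset)
     ;; Statements inside a complete block get the same offset.
     ((parent-is "compound_statement")
      parent-bol c-ts-mode-indent-offset)))
  "Illustrative indent rules for C, keyed by language.")
```

A mode would install such a list by setting
'treesit-simple-indent-rules' buffer-locally.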


> I then type "int bar = 0;".  Typing RET after that doesn't indent,
> either.
>

This is for the same reason.  Adding the closing brace would fix that,
as would the rule I mentioned.  With that rule, my code is indented like
this:

```
int
foo ()
{
  int bar = 0;
```

> But if I add an empty line at BOB, the fontification becomes as
> expected, and doesn't go back to font-lock-warning-face even if I then
> remove that empty line.
>

This is likely due to either treesit or tree-sitter or tree-sitter-c not
dealing properly with the root node.  Maybe Yuan has some insight here?

> Type } to close the function.  I now have this:
>
> int
> foo (void)
> {
>   int bar = 0;
> }
>
> But "int" is still in font-lock-warning-face -- why?
>

I think the best solution is just to remove this font-lock rule:

```
   :language mode
   :override t
   :feature 'error
   '((ERROR) @font-lock-warning-face)
```
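
For context, that fragment is one group of arguments to
'treesit-font-lock-rules'.  A standalone sketch of the same rule,
assuming Emacs 29 or later, could look like this.  The variable name
'my/c-error-font-lock-rule' is hypothetical, and the language is
hard-coded to 'c here where the mode passes its own 'mode' variable.

```elisp
(require 'treesit)

;; Sketch only: `my/c-error-font-lock-rule' is a hypothetical name.
(defvar my/c-error-font-lock-rule
  ;; Only build the rule when the C grammar can actually be loaded.
  (when (treesit-ready-p 'c t)
    (treesit-font-lock-rules
     :language 'c
     ;; Override faces already applied by earlier rules.
     :override t
     ;; Feature name used to switch this rule on or off.
     :feature 'error
     ;; Capture every ERROR node with the warning face.
     '((ERROR) @font-lock-warning-face)))
  "Font-lock rule that highlights ERROR nodes (illustration).")
```

Dropping this group from the mode's rules is all it takes to stop ERROR
nodes from being highlighted.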

> Next, I type this:
>
> struct foo {
>   int bar;
> };
>
> The result is that all of the struct, except the closing brace, is in
> font-lock-warning-face -- why?  Again, adding an empty line before
> that fixes fontifications, and the fontification stays correct even
> after removing that empty line.
>
> If I type
>
> struct bar
>   {
>     int foo;
>   };
>

Same thing.  Let's just remove it.  I'll add a patch below, feel free to
install it.

> then the opening brace and "int foo;" are in font-lock-warning-face.
>
> Next, if I type M-;, I get a C++-style comment delimiter "//".  It
> sounds like this is the only style of comments supported?  More
> generally, if I compare c-basic-common-init and c-common-init from CC
> Mode with c-ts-mode, I see that the former has much more
> initializations than the latter.  So I think we should audit what CC
> Mode does here and see what else is relevant.  Alternatively, we could
> consider c-ts-mode be a minor mode of CC Mode, which only changes the
> fontification, the indentation, and the navigation parts.
>

I can take a look at that this evening and see what else I can come up
with.  I agree about the comment style.

> Thanks.
>
> P.S. If these problems are non-trivial, it might be best to file a bug
> report for each one.  But the last issue, the one about doing more
> stuff like CC Mode does, is something we should discuss here, I think,
> since this is basic design, and similar issues could exist for other
> modes whose *-ts-mode variants were installed on the branch.

Your issues are two-fold.  The warning face is super easy, but the
indenting of error nodes may need a change of perspective.  Tree-sitter
works best when syntax is correct, even though it handles errors pretty
well.

See the attached patch.


Theo


Attachment: 0001-Remove-error-node-font-locking.patch
Description: Text Data

