emacs-devel

Re: feature/tree-sitter: Where to Put C/C++ Stuff


From: Yuan Fu
Subject: Re: feature/tree-sitter: Where to Put C/C++ Stuff
Date: Tue, 1 Nov 2022 02:22:37 -0700

Before we jump into the discussion, I want to note that many of your (Theo’s) 
arguments seem to be against cc-mode rather than against “using the same major 
mode”. For major modes that don’t use cc-mode (like python-mode), tree-sitter 
and non-tree-sitter features have so far coexisted just fine.

>> 
>> That'd mean people will need either to invent all the other goodies in
>> CC mode (everything except fontifications and indentation) from
>> scratch, or give up all those other goodies.  Does that make sense?
>> 
> 
> Yes, well, partially.  I think that we are too likely to create unwanted
> issues by merging the two too closely.  I have seen several of these
> issues the last couple of years while implementing c-sharp mode in cc
> mode, emacs-tree-sitter and treesit.  There are several things that are
> happening.  I'll try to expand on some of them just to create some
> perspective, but also for some specific points where we can improve so
> that we maybe don't have a problem with this at all.
> 
> 1: Use CC mode for one thing and tree-sitter for the rest
> While first implementing tree-sitter in c-sharp mode we tried just
> applying font-locking, and use cc mode for indentation and the rest.
> What happened was that we immediately inherited the performance issues
> from cc mode straight into our code.  Specifically, when typing in a
> file with too many (from cc mode's perspective) strings, typing lag rose
> to several seconds per press.  I filed several bug reports on this both
> here and to Alan.  After some time and much heroics we got some
> improvement on this from Alan, but c-sharp already had moved on.
> 
> 2: Using separate names for modes.
> The great advantage here is easy to understand.  You have no inheritance
> issues, and are free to develop features without regards to legacy.  A
> disadvantage is that some users depend on that major mode name for other
> stuff.  We had some issues filed with us to flip over to tree-sitter
> completely, because that name (csharp-mode) was so important compared to
> (csharp-tree-sitter-mode).  We almost made the change, but then Yuan
> started his work so we waited.  This would have sunsetted the cc mode
> almost immediately
> 
> 3: Confusion with where to file bugs
> We have many bugs in c-sharp mode where some things are emacs bugs, some
> things are cc mode bugs, some are treesitter bugs and some are our own
> bugs.  There is a real issue with understanding cc mode and figuring out
> where a bug fix should end up.  It has taken me many weeks worth of
> digging to understand only the simplest mechanisms of cc mode.
> Tree-sitter takes contributors only a couple of hours to be immediately
> productive.  To disregard this point with only compatibility with cc
> mode is a huge mistake, IMO.
> 
> 4: How do we know what to disable?
> If there's a problem somewhere in the tree-sitter variant of the cc mode
> derived new mode, and we see some issue - who makes the fix?  For
> example, previously there was limited support for multiline strings in
> cc mode, which took almost a year to finalize.  The tree-sitter variant
> with more performance and accuracy took me maybe 20 minutes in a
> work-meeting.  Should a feature that is simple to implement in the
> tree-sitter variant wait for a similar cc mode implementation?  The
> namespacing seems to suggest that yes, it should.

I don’t think it should (and I think we both agree on that). And I don’t think 
it’s a problem if a major mode has some tree-sitter-powered feature that the 
non-tree-sitter version doesn’t have.

> 
> 5: While tree-sitter is only an engine, it provides a lot more goodies
> We have a huge opportunity to create real new frameworks for emacs now,
> but limiting us to merge the features/modes suggests that we cannot
> reliably do overarching advancements such as we see now in the
> feature/tree-sitter branch.  For example, many small hacks I've made in
> the modes I've submitted thus far have made it into general mechanisms in
> treesit.el.  All modes that enable tree-sitter should be able to use
> these and all the new ones that come _without_ worrying whether or not some
> issue will crop up from inheriting from cc mode or some other thing.
> Examples are indentation styles, paredit-like functionalities,
> refactorings and more.
> 
> 6: What are the goodies that we really need from CC mode?
> CC mode provides indentation and font locking.  What else does it
> provide that isn't replaceable pretty quickly?  I mean this not as a
> contrarian, but out of real curiosity.  

One thing I found, which might be the only thing, is filling: specifically, 
filling /* */ style comments while respecting all the styles of drawing stars 
in these comments. I mean styles like

/*
 *
 */

/*=====================================

=======================================*/

Etc, etc. I tried to look at c-mask-paragraph, and it is very complicated. 
Maybe we can use c-fill-paragraph without setting up the rest of cc-mode?
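
To make the filling problem concrete, here is a minimal, language-neutral 
sketch in Python (not Emacs Lisp, and not what c-mask-paragraph actually 
does). The function name fill_star_comment and its width parameter are 
hypothetical. It assumes the simplest layout only: "/*" alone on the first 
line, "*/" alone on the last, and a " * " prefix on each body line. Any other 
decoration, such as the "/*====" banner above, is returned untouched, which is 
exactly the part that makes a real implementation complicated.

```python
import re
import textwrap


def fill_star_comment(comment: str, width: int = 40) -> str:
    """Refill a single-paragraph /* ... */ comment, keeping the ' * ' prefix."""
    lines = comment.splitlines()
    # Only the plain style is handled: "/*" on its own line, "*/" on its own line.
    if len(lines) < 3 or lines[0].strip() != "/*" or lines[-1].strip() != "*/":
        return comment  # unknown decoration style: leave it alone
    # Strip the leading star decoration from every body line.
    stripped = [re.sub(r"^\s*\*\s?", "", line) for line in lines[1:-1]]
    # Join into one paragraph and normalize runs of whitespace.
    text = " ".join(" ".join(stripped).split())
    if not text:
        return comment  # nothing to fill
    # Reserve three columns for the " * " prefix that is added back below.
    body = textwrap.fill(text, width=width - 3)
    filled = [" * " + line for line in body.splitlines()]
    return "\n".join([lines[0]] + filled + [lines[-1]])
```

A call such as fill_star_comment("/*\n * aaa bbb\n * ccc\n */") joins the two 
body lines into one. Supporting every decoration style would mean recognizing 
each prefix and banner before refilling, which is presumably why the cc-mode 
code is so involved.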

> My guess is that we can get to
> feature parity and well beyond that in a very short amount of time, if
> we're not hindered by merging everything.
> 
> 
> Sorry for the long mail, but I think we are missing the point by viewing
> tree-sitter simply as an engine to plop in aside cc mode for
> convenience, and not the real infrastructure change it is.  There is no
> need to sunset cc mode, but equally there is no need to limit tree-sitter.
> 

If mixing cc-mode and tree-sitter brings more problems than benefits, maybe we 
can adopt a mutually exclusive policy, where a major mode either sets up 
cc-mode or uses tree-sitter, but never both.

> 
>> Tree-sitter doesn't (and cannot) replace everything a major mode does
>> for a programming language.  So a completely new mode means we throw
>> the baby out with the bathwater.
> 
> I don't agree, but I'm very curious as to what else would take a
> significant effort _apart_ from indentation feature parity with cc mode.

Tree-sitter is just a tool; obviously there are things a major mode provides 
that don’t involve a parser, e.g., Python’s REPL. But I see no problem putting 
such features alongside tree-sitter features in the same major mode.

> 
> One thing I know of is integration with package managers such as what
> elm-mode and go-mode do, but that is an easy fix.  The upstream
> go-mode, if not possible to move to core can just derive from a simple
> go-treesit, skip all indentation and font-locking in its own mode, but
> supply the goodies.
> 
> -- 
> Theo



