emacs-devel



From: Stefan Monnier
Subject: Re: Reliable after-change-functions (via: Using incremental parsing in Emacs)
Date: Tue, 31 Mar 2020 15:35:41 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

>> > It should be obvious that sending a buffer as a single string is less
>> > efficient than letting tree-sitter access buffer text directly.  We
>> > just need an appropriate API for that (maybe there is one already, I
>> > didn't take a look at their sources since January).
>> My benchmarks say `buffer-string` takes about 1/3 the time of
>> `parse-partial-sexp`, so letting tree-sitter access our buffer text
>> directly is unlikely to give more than a 30% speed up.
> Sure, but we never call parse-partial-sexp on the entire buffer, do we?

Not sure how that's relevant.  I only used `parse-partial-sexp` as
a lower bound on the time tree-sitter is likely to take to do its
own parsing.
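
To make the arithmetic behind that bound explicit, here is a minimal
sketch.  It assumes the copy costs 1 unit and the parse costs at least
3 units (the 1/3 ratio quoted above); the function name
`max_fraction_saved` is made up for illustration and is not from any
benchmark code.

```c
#include <assert.h>

/* Upper bound on the fraction of total time (copy + parse) that could be
   saved by eliminating the copy entirely.  PARSE_LOWER_BOUND is a lower
   bound on the parser's own time, so the real saving can only be smaller.
   Hypothetical helper, for illustration only.  */
static double
max_fraction_saved (double copy_time, double parse_lower_bound)
{
  assert (copy_time >= 0.0 && parse_lower_bound > 0.0);
  return copy_time / (copy_time + parse_lower_bound);
}
```

With `max_fraction_saved (1.0, 3.0)` the result is 0.25, i.e. removing
the copy saves at most a quarter of the total time (equivalently, the
total runs at most about 1.33x faster), which is in the same ballpark
as the rough "30%" figure.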

>> It doesn't mean it wouldn't be a desirable optimization, but it does
>> mean that it likely won't make a large difference as to whether it's
>> "fast enough".
> I disagree.

Your disagreement doesn't seem to be with what I said: I didn't argue
about the elegance or efficiency, only about the fact that the
performance impact is likely to be small enough that it's not going to
affect the viability of the approach.

> Communicating with a C library by making a string out of buffer text
> is extremely inelegant and inefficient.  We shouldn't do that except
> when the strings are very short.

FWIW, elegant/efficient or not, that's the standard way to do
it, AFAICT.  E.g. that's what we do in `secure-hash`, that's what we do
when parsing JSON, ...

You basically always need to en/decode the content (even if it is into
utf-8, we still need to handle the potential raw-bytes), so a copy is
hard to avoid.
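
For readers unfamiliar with the trade-off, the two access styles can be
sketched on a toy gap buffer.  This is only an illustration under stated
assumptions: the `GapBuffer` struct and both functions are hypothetical,
they are neither Emacs's internal buffer layout nor tree-sitter's API,
and they ignore the en/decoding issue mentioned above (a consumer
reading chunks in place would see the internal representation, raw
bytes included).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical gap buffer: the text is data[0 .. gap_start) followed by
   data[gap_end .. capacity); the bytes in between are unused.  */
typedef struct
{
  const char *data;
  size_t gap_start;
  size_t gap_end;
  size_t capacity;
} GapBuffer;

static size_t
gap_buffer_length (const GapBuffer *b)
{
  assert (b->gap_start <= b->gap_end && b->gap_end <= b->capacity);
  return b->capacity - (b->gap_end - b->gap_start);
}

/* Style 1: copy the whole text into one contiguous, NUL-terminated
   string.  The caller frees the result.  Returns NULL if out of memory.  */
static char *
gap_buffer_copy (const GapBuffer *b)
{
  size_t len = gap_buffer_length (b);
  char *out = malloc (len + 1);
  if (!out)
    return NULL;
  memcpy (out, b->data, b->gap_start);
  memcpy (out + b->gap_start, b->data + b->gap_end,
          b->capacity - b->gap_end);
  out[len] = '\0';
  return out;
}

/* Style 2: no copy.  Return a pointer to the contiguous run of text that
   starts at logical offset POS and store its length in *CHUNK_LEN.
   Returns NULL with *CHUNK_LEN == 0 at or past the end of the text.  */
static const char *
gap_buffer_chunk (const GapBuffer *b, size_t pos, size_t *chunk_len)
{
  if (pos < b->gap_start)
    {
      *chunk_len = b->gap_start - pos;
      return b->data + pos;
    }
  size_t phys = pos + (b->gap_end - b->gap_start);
  if (phys >= b->capacity)
    {
      *chunk_len = 0;
      return NULL;
    }
  *chunk_len = b->capacity - phys;
  return b->data + phys;
}
```

A consumer of the second style has to call `gap_buffer_chunk`
repeatedly (at most twice here, once per side of the gap), whereas the
first style pays for one allocation and one pass over the text up
front.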

Note that for regexp-matching the problem is slightly different because
we don't know beforehand which part of the buffer will be consulted, so
doing a "copy and then regmatch" would be too inefficient (we'd always
need to copy everything until point-max).
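
A toy comparison may make the difference concrete.  The sketch below
searches for a single byte rather than a regexp, and counts how many
bytes each strategy touches; `SearchResult`, `search_in_place` and
`search_after_copy` are hypothetical names, not anything from Emacs's
regexp engine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* MATCH is the offset of the first hit, or -1 if none (or if allocation
   failed).  BYTES_TOUCHED counts bytes copied plus bytes examined.  */
typedef struct
{
  long match;
  size_t bytes_touched;
} SearchResult;

/* Search in place: only the bytes up to the match are examined.  */
static SearchResult
search_in_place (const char *text, size_t len, size_t start, char needle)
{
  assert (start <= len);
  SearchResult r = { -1, 0 };
  for (size_t i = start; i < len; i++)
    {
      r.bytes_touched++;
      if (text[i] == needle)
        {
          r.match = (long) i;
          break;
        }
    }
  return r;
}

/* Copy, then search: since the matcher cannot know in advance where it
   will stop, everything from START to the end is copied first.  */
static SearchResult
search_after_copy (const char *text, size_t len, size_t start, char needle)
{
  assert (start <= len);
  SearchResult r = { -1, 0 };
  size_t tail = len - start;
  char *copy = malloc (tail ? tail : 1);
  if (!copy)
    return r;
  memcpy (copy, text + start, tail);
  r.bytes_touched = tail;       /* cost of the up-front copy */
  for (size_t i = 0; i < tail; i++)
    {
      r.bytes_touched++;
      if (copy[i] == needle)
        {
          r.match = (long) (start + i);
          break;
        }
    }
  free (copy);
  return r;
}
```

Both functions find the same match; the second one's cost grows with
the distance to the end of the text even when the match is right next
to the starting position.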


        Stefan



