emacs-devel

Re: Tree-sitter integration on feature/tree-sitter


From: Yoav Marco
Subject: Re: Tree-sitter integration on feature/tree-sitter
Date: Thu, 12 May 2022 19:26:50 +0300
User-agent: mu4e 1.6.3; emacs 29.0.50

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Yoav Marco <yoavm448@gmail.com>
>> Cc: Yuan Fu <casouri@gmail.com>, emacs-devel@gnu.org
>> Date: Thu, 12 May 2022 17:16:41 +0300
>>
>> And it probably is: in my benchmark, query compilation improved
>> performance in much more than 16/6=266%: it went from 6.06 to 0.01.
>
> That was in one of the tests, which, AFAIU, is not very interesting
> for assessing the effect on practical use cases in Emacs usage.  Or
> are you saying that Yuan's explanation of what that test tested was
> incorrect? in that case, please post the correct explanation.

Sorry, what I meant is that I'm not sure how he arrived at his figure
for how much time it takes to compile a query.

As I understand it, if it takes 23.474s to fontify 2332 times without
query caching and 0.037s with it, then 99.8% of the time is spent
recompiling the same query, or (23.474 - 0.037)/2332 = 10ms per
fontification. Which, uh, is what Yuan said, but I don't know how he
reached the "0.0158s per call to font-lock-region".
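
To spell out that arithmetic, here is a small sketch in Python (just a
calculator for the numbers in the "512c to end" row of the table further
down; the variable names are mine and not part of any benchmark code):

```python
# Timings in seconds, copied from the "512c to end" benchmark row.
uncached_total = 23.474   # TS column: query recompiled on every call
cached_total = 0.037      # TS query reuse column
calls = 2332              # number of 512-character fontifications

# Assumption: the whole difference between the two runs is compile time.
compile_total = uncached_total - cached_total
compile_share = compile_total / uncached_total
per_call_ms = compile_total / calls * 1000

print(f"share of time spent recompiling: {compile_share:.1%}")
print(f"recompilation cost per call: {per_call_ms:.2f} ms")
```

This attributes the entire difference between the two runs to query
compilation, which gives roughly 10ms per call.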

>> > According to your benchmarks, it is already very fast: 16 msec is a
>> > negligible time interval.  Of course, 40 is a somewhat arbitrary
>> > number, but to get a less arbitrary one, we should determine it from
>> > some concrete scenarios, such as the 512-character chunk JIT font-lock
>> > uses during redisplay, or the number of lines on a typical window
>> > that's important when one scrolls with C-v/M-v, etc.
>>
>> It's easy enough to convert the benchmarks to 512-chars chunks rather
>> than 40 lines. See table a few paragraphs below.
>
> I'm sorry, I don't understand how to interpret that table.  Can you
> please explain the two last entries in the left column?

Explanation of the whole table (times in seconds unless marked otherwise):

|   |                     | font-lock | TS sexp |     TS | TS query reuse |
| 1 | xdisp.c all at once |    12.886 |   0.031 |  0.016 |          0.017 |
| 2 | 20 × 512c           |     0.273 |   0.214 |  0.209 |          0.000 |
| 3 | 512c to end         |       4m+ |  24.177 | 23.474 |          0.037 |

Rows:
- Benchmark 1, xdisp.c all at once: run font-lock-fontify-region
  on the entire buffer once
- Benchmark 2, 20 × 512c: fontify the next 512 characters, 20 times
- Benchmark 3, 512c to end: fontify the next 512 characters, repeatedly,
  until the buffer ends

Columns:
- font-lock: fontifying using c-mode's font-lock setup
- TS sexp: using current non-caching treesit, but giving it the query as
  a sexp and not as a string
- TS: current non-caching treesit, supplying the query as a string
- TS query reuse: caching compiled query objects using my dumb patch
  that just reuses the last query object as long as the char* for the
  query string doesn't change


>> >> If we expose "compiled query” we don’t need to cache them either.
>> >
>> > Then the Lisp program will have to do that, which is even worse,
>> > because the problems I described will now have to be solved by Lisp
>> > application programmers, each time anew.
>>
>> Will they? They'd just need to compile their queries once, when defining
>> them or when setting treesit-font-lock-defaults.
>
> And decide when to discard them.

I thought garbage collection could take care of that. Is that
problematic?
