emacs-devel

Re: Tree-sitter integration on feature/tree-sitter


From: Yoav Marco
Subject: Re: Tree-sitter integration on feature/tree-sitter
Date: Tue, 10 May 2022 18:43:49 +0300
User-agent: mu4e 1.6.3; emacs 29.0.50

I benchmarked query compilation reuse:

| # | Benchmark                            | no reuse (now) | reuse |
|---+--------------------------------------+----------------+-------|
| 1 | Fontify xdisp.c all at once          |          0.01s | 0.01s |
| 2 | Fontify 60 next lines of xdisp.c ×10 |          0.10s | 0.00s |
| 3 | Fontify 60 next lines till the end   |          6.06s | 0.01s |


The patch to reuse the query is pretty dumb: if the char* for the query
string is the same as last time, it reuses the previous TSQuery object
instead of calling ts_query_new again. The patch is attached.
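
For readers without the attachment, the idea can be sketched roughly as
follows. This is not the attached patch: struct compiled_query,
compile_query, get_query and compile_count are hypothetical stand-ins so
the sketch builds without tree-sitter. In the real code the compile step
would be ts_query_new and the release step ts_query_delete.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a compiled query (a TSQuery * in practice).  */
struct compiled_query
{
  const char *source;
};

/* Counts real compilations so the effect of reuse is observable.  */
static int compile_count;

/* Stand-in for the expensive compilation step.  */
static struct compiled_query *
compile_query (const char *source)
{
  struct compiled_query *query = malloc (sizeof *query);
  assert (query != NULL);
  query->source = source;
  compile_count++;
  return query;
}

/* Single-entry cache keyed on the pointer, not on the string contents:
   recompile only when SOURCE differs from the pointer seen last time.  */
static struct compiled_query *
get_query (const char *source)
{
  static const char *last_source;
  static struct compiled_query *last_query;

  assert (source != NULL);
  if (source != last_source)
    {
      free (last_query);   /* free (NULL) is a no-op on the first call */
      last_query = compile_query (source);
      last_source = source;
    }
  return last_query;
}
```

Calling get_query repeatedly with the same pointer compiles once; any
other pointer evicts the previous query. Comparing pointers is cheap but
fragile: the check misfires if the string's storage is moved, or if a
different string later ends up at the same address.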

The elisp code for the benchmarks is also attached, but I'll give a
summary here:

The queries are tree-sitter-langs' highlights.scm for C.

Benchmark 1 runs treesit-font-lock-fontify-region once on the entire
buffer, meaning the query is compiled only once in both cases.

Benchmark 2 runs treesit-font-lock-fontify-region on blocks of 60 lines,
meaning the no-reuse version has to compile the query 10 times even
though nothing changes in the buffer or the query.

Benchmark 3 is benchmark 2 continued to the end of the buffer. xdisp.c
has 36k lines, so the 6.06s is consistent with benchmark 2 scaling
linearly (600 lines = 0.10s, multiply by 60 ⇒ 36k lines ~= 6.00s).
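
The same estimate as a small sketch. It assumes the 0.10s in benchmark 2
is almost entirely compilation (the 0.00s in the reuse column suggests
so), which gives roughly 0.01s per compile. The function name
estimated_compile_seconds is made up for illustration.

```c
#include <assert.h>

/* Estimated seconds spent compiling when a buffer of TOTAL_LINES is
   fontified in chunks of CHUNK_LINES, with one compilation per chunk.  */
static double
estimated_compile_seconds (long total_lines, long chunk_lines,
                           double seconds_per_compile)
{
  assert (chunk_lines > 0);
  /* Round up so a trailing partial chunk still counts as one compile.  */
  long chunks = (total_lines + chunk_lines - 1) / chunk_lines;
  return chunks * seconds_per_compile;
}
```

With 36000 lines, 60-line chunks and 0.10 / 10 seconds per compile, this
comes to 600 compilations and about 6 seconds, in line with the measured
6.06s.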


So, is caching worth it? I don't know. It definitely is if it's possible
to do it internally without introducing a new object type. But I don't
think that's possible without making a hash map or a complicated cache
like the one for compiled regexps that compile_pattern uses in
search.c.
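
To give a feel for what such an internal cache involves, here is a
minimal sketch of a small move-to-front cache keyed on the query text
rather than the pointer. It is loosely in the spirit of a compiled-regexp
cache, not a copy of what search.c does. Every name in it
(QUERY_CACHE_SIZE, struct query_object, lookup_query, and so on) is
hypothetical, and the compile step is a stand-in for ts_query_new.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define QUERY_CACHE_SIZE 4   /* arbitrary size, chosen for illustration */

/* Hypothetical stand-in for a compiled query (a TSQuery * in practice).  */
struct query_object
{
  size_t source_len;
};

struct query_cache_entry
{
  char *source;                /* private copy of the query text */
  struct query_object *query;  /* NULL while the slot is unused */
};

static struct query_cache_entry query_cache[QUERY_CACHE_SIZE];
static int cache_compile_count;

/* Stand-in for the expensive compilation step.  */
static struct query_object *
compile_query_text (const char *source)
{
  struct query_object *query = malloc (sizeof *query);
  assert (query != NULL);
  query->source_len = strlen (source);
  cache_compile_count++;
  return query;
}

/* Look SOURCE up by contents.  Slot 0 holds the most recently used
   entry; on a miss the last slot (least recently used) is evicted.  */
static struct query_object *
lookup_query (const char *source)
{
  int i;
  struct query_cache_entry hit;

  for (i = 0; i < QUERY_CACHE_SIZE; i++)
    if (query_cache[i].query != NULL
        && strcmp (query_cache[i].source, source) == 0)
      break;

  if (i < QUERY_CACHE_SIZE)
    hit = query_cache[i];
  else
    {
      i = QUERY_CACHE_SIZE - 1;
      free (query_cache[i].source);
      free (query_cache[i].query);
      hit.source = malloc (strlen (source) + 1);
      assert (hit.source != NULL);
      strcpy (hit.source, source);
      hit.query = compile_query_text (source);
    }

  /* Shift entries 0..i-1 back by one slot and put the hit in front.  */
  memmove (&query_cache[1], &query_cache[0], i * sizeof query_cache[0]);
  query_cache[0] = hit;
  return hit.query;
}
```

Keying on contents costs a strcmp per slot on every lookup, but it stays
correct when the same query arrives in a different string object, and it
keeps everything behind the existing interface with no new object type.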


-- Yoav

Attachment: bench.tar.gz
Description: application/gzip

Attachment: 0001-Reuse-queries-in-a-dumb-way.patch
Description: Text Data

