From: Tuấn-Anh Nguyễn
Subject: Re: Reliable after-change-functions (via: Using incremental parsing in Emacs)
Date: Thu, 2 Apr 2020 00:55:45 +0700

On Wed, Apr 1, 2020 at 8:26 PM Eli Zaretskii <address@hidden> wrote:
>
> > From: Tuấn Anh Nguyễn <address@hidden>
> > Date: Wed, 1 Apr 2020 13:17:42 +0700
> > Cc: address@hidden
> >
> > Real usage with "xdisp.c":
> >
> > (define-advice tree-sitter--do-parse (:around (f &rest args) benchmark)
> > (message "%s" (benchmark-run (apply f args))))
> >
> > (0.257998 1 0.13326100000000096)
>
> And that is even without encoding the buffer text, IIUC what the
> package does.
>
> > So yes, direct access to buffer's text from dynamic modules would be nice.
>
> Did you consider using the API where an application can provide a
> function to return text at a given offset? Such a function could be
> relatively easily implemented for Emacs.
>
I don't understand what you mean. Below I'll explain how it works
currently.

`ts-parse' uses the Tree-sitter API that consumes text in chunks:

    TSTree *ts_parser_parse(
      TSParser *self,
      const TSTree *old_tree,
      TSInput input
    );

    typedef struct {
      void *payload;
      const char *(*read)(
        void *payload,
        uint32_t byte_offset,
        TSPoint position,
        uint32_t *bytes_read
      );
      TSInputEncoding encoding;
    } TSInput;

Because dynamic modules don't have direct access to buffer text,
`ts-parse' uses the module function `copy_string_contents', and exposes
this interface:
(ts-parse PARSER INPUT-FUNCTION OLD-TREE)
Here INPUT-FUNCTION must return a chunk of the buffer text, starting
from the given byte offset, as a Lisp string. `ts-buffer-input' is one
such function.
So:
1. Chunks of the buffer text are copied into Lisp strings, through
`buffer-substring-no-properties'.
2. These Lisp strings are copied into buffers of null-terminated UTF-8
bytes, through `copy_string_contents'.
3. All these temporary Lisp strings create GC pressure. In the xdisp.c
example, it was 100ms for GC, in addition to 150ms for parsing.
4. emacs-module-rs has an automatic, blanket workaround for this bug
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=31238. The workaround
involves pairs of `make_global_ref' and `free_global_ref' calls, on
all "suspected" `emacs_value's.
#4 can be avoided if emacs-module-rs allows selectively disabling the
blanket workaround. It's a band-aid on top of a band-aid, but at least
it's workable.
#3 can probably be alleviated by increasing the chunk size.
However, #3 and #4 are consequences of #1 and #2. If dynamic modules had
direct access to the buffer text, none of the above would be an issue.
Such direct access can be enabled by something like this:

    char* (*access_buffer_text) (emacs_env *env,
                                 emacs_value buffer,
                                 ptrdiff_t byte_offset,
                                 ptrdiff_t *size_inout);

Of course, such an API would require extensive documentation on how it
must be used, to ensure safety and correctness.
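
As a rough illustration of what such direct access could look like from the consumer's side, the sketch below models buffer text as a gap buffer (which is how Emacs stores it) and returns a pointer to the contiguous run of text starting at a byte offset, without copying. Everything here is an assumption made for illustration: `GapBuffer', `gap_text_at' and `collect_text' are hypothetical, and the semantics chosen for the returned size (length of the contiguous run, zero at end of text) are only one possible reading of `size_inout' in the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical gap buffer: text bytes live in data[0, gap_start) and
   data[gap_end, capacity); the bytes in between are unused. */
typedef struct {
  const char *data;
  size_t gap_start;
  size_t gap_end;
  size_t capacity;
} GapBuffer;

/* Returns a pointer to the contiguous run of text that starts at the
   logical byte_offset and stores its length in *size_out. No bytes are
   copied. A length of zero means byte_offset is at or past the end. */
static const char *gap_text_at(const GapBuffer *buf, size_t byte_offset,
                               size_t *size_out) {
  size_t gap_len = buf->gap_end - buf->gap_start;
  size_t text_len = buf->capacity - gap_len;
  if (byte_offset >= text_len) {
    *size_out = 0;
    return "";
  }
  if (byte_offset < buf->gap_start) {
    /* Before the gap: the run ends where the gap begins. */
    *size_out = buf->gap_start - byte_offset;
    return buf->data + byte_offset;
  }
  /* After the gap: skip over it; the run extends to the end of text. */
  *size_out = text_len - byte_offset;
  return buf->data + byte_offset + gap_len;
}

/* Reassembles the logical text by repeatedly asking for the next run,
   as a chunk-consuming parser would. Copies into `out' only so the
   result can be inspected; returns the number of runs visited. */
static size_t collect_text(const GapBuffer *buf, char *out, size_t out_cap) {
  size_t offset = 0, runs = 0, n = 0;
  assert(out_cap > 0);
  for (;;) {
    const char *run = gap_text_at(buf, offset, &n);
    if (n == 0 || offset + n >= out_cap) break;
    memcpy(out + offset, run, n);
    offset += n;
    runs++;
  }
  out[offset] = '\0';
  return runs;
}
```

Under these assumptions a whole buffer is served in at most two runs, one on each side of the gap, and no temporary Lisp strings are created. The returned pointers are only valid as long as the buffer is not modified or relocated, which is part of what the documentation for a real API would have to spell out.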
> Btw, what do you do with the tree returned by the tree-sitter parser?
> store it in some buffer-local variable? If so, how much memory does
> such a tree take, and when, if ever, is that memory released?
>
It's stored in a buffer-local variable. I haven't measured how much
memory it takes. Memory is released when the tree object is
garbage-collected (it's a `user-ptr').
--
Tuấn-Anh Nguyễn
Software Engineer