Re: Tokenizing

emacs-devel

From:	Stephen Leake
Subject:	Re: Tokenizing
Date:	Sun, 21 Sep 2014 10:32:29 -0500
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/24.3 (windows-nt)

Vladimir Kazanov <address@hidden> writes:

> Okay, I'll give text properties a try.
>
> Right now my vision for this mode is the following:
>
> - avoid retokenizing undamaged buffer parts at all costs (as a main
> feature meant for incremental parsing);

You might look at what I did in Ada mode (current source in ELPA); see
wisi-before-change and wisi-after-change in wisi.el.
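
As a rough illustration of the general idea (not a reproduction of what
wisi.el does), here is a minimal sketch in Python rather than Emacs Lisp.
It assumes a hypothetical cache that trusts parse results only before a
"valid end" position, and a change hook that pulls that position back to
the start of the edit, so the undamaged text before the edit is never
reparsed. The class and method names are invented for this example.

```python
class ParseCache:
    """Hypothetical cache: results are trusted only before `valid_end`."""

    def __init__(self):
        self.valid_end = 0   # positions < valid_end have valid results
        self.results = {}    # position -> cached parse result

    def store(self, pos, result):
        """Record a parse result at `pos` and extend the valid region."""
        self.results[pos] = result
        self.valid_end = max(self.valid_end, pos + 1)

    def after_change(self, begin):
        """An edit started at `begin`: discard results from there onward."""
        if begin < self.valid_end:
            self.results = {p: r for p, r in self.results.items() if p < begin}
            self.valid_end = begin

    def lookup(self, pos):
        """Return the cached result at `pos`, or None if it is not valid."""
        return self.results.get(pos) if pos < self.valid_end else None


cache = ParseCache()
cache.store(0, "statement-start")
cache.store(10, "block-start")
cache.store(20, "block-end")
cache.after_change(15)   # the user edits at position 15
```

After the edit, the results at positions 0 and 10 are still usable; only
the text from position 15 onward has to be parsed again.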

> - collect damages and do reparsing only when user stops editing,
> similar to the font-lock-mode (js2-mode, nxml-mode...);

Ada mode only reparses when the user requests an action that requires a
parse. How else do you tell when a user "stops editing"?

font-lock runs after 'idle-time', which appears to be about 2 seconds (I
could not figure out from the structure of 'timer-idle-list' what the
actual idle time is). I guess that's the approximation of when the user
stops editing.

I don't normally edit 7000-line files, so the Ada mode parsing delay is
not noticeable to me, and I prefer the current Ada mode approach of not
using the idle timer to trigger a parse. But it could be a user option.
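
The two triggering policies can be sketched side by side. This is a
model in Python, not Emacs Lisp, and every name in it is hypothetical:
`ensure_parsed` stands for "an action that requires a parse", `on_idle`
stands for whatever an idle timer would call, and `parse_when_idle`
plays the role of the user option.

```python
class LazyParser:
    """Hypothetical model: edits only mark the buffer dirty; parsing is deferred."""

    def __init__(self, parse_fn, parse_when_idle=False):
        self.parse_fn = parse_fn
        self.parse_when_idle = parse_when_idle  # the "user option"
        self.text = ""
        self.dirty = True
        self.result = None
        self.parse_count = 0

    def edit(self, new_text):
        """Editing never parses; it only records that the result is stale."""
        self.text = new_text
        self.dirty = True

    def ensure_parsed(self):
        """Called by any action that needs parse results (indent, navigate)."""
        if self.dirty:
            self.result = self.parse_fn(self.text)
            self.parse_count += 1
            self.dirty = False
        return self.result

    def on_idle(self):
        """Called when the user has been idle; parses only if the option is set."""
        if self.parse_when_idle:
            self.ensure_parsed()


parser = LazyParser(str.split)   # str.split stands in for a real parser
parser.edit("begin")
parser.edit("begin end")
parser.on_idle()                 # option is off, so nothing is parsed
tokens = parser.ensure_parsed()  # first action that needs a parse
```

With the option off, any number of edits costs one parse, paid when the
user first asks for something that needs it.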

> - the incremental logic should have two interfaces, the first one
> meant for language-specific tokenizing code and a second one - for the
> user code, be it code beautifiers or advanced incremental parsers;
>
> - it should be possible to completely replace the font-lock-mode with
> this mode, given a concrete language tokenizer;
>
> You said two things basically: 1) I must use text properties, 2) it is
> possible to improve text properties interfaces to help the tokenizer.
> I suggest the following plan:
>
> 1) try to implement the tokenizer using available text property
> mechanics;

Ada mode uses text properties to store parse results; the tokenizer
results are part of that, but are not stored separately. I don't see
much point in separating the tokenizer from the parser; the tokenizer
results are not useful by themselves (at least, not in Ada mode).

> 2) see if there are slow-downs or problems, or space for improvements
> on the Emacs side.

I have not noticed any problems with the text properties interface; in
particular, storing and retrieving text properties is fast compared to
parsing. Ada mode stores about two parse result text properties per
source line on average.
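
For readers who have not used text properties: in Emacs Lisp the calls
involved are put-text-property and get-text-property. The sketch below
is only a Python model of that idea, assuming a hypothetical property
named "parse-result" attached to the first character of a token; it uses
0-based positions, unlike Emacs buffers, and does not reflect how Emacs
stores properties internally.

```python
class PropertyText:
    """Hypothetical model of text carrying per-character properties."""

    def __init__(self, text):
        self.text = text
        self.props = [dict() for _ in text]  # one property table per character

    def put_property(self, start, end, name, value):
        """Set property `name` to `value` on positions start <= i < end."""
        for i in range(start, end):
            self.props[i][name] = value

    def get_property(self, pos, name):
        """Return the value of `name` at `pos`, or None if it is not set."""
        return self.props[pos].get(name)


line = PropertyText("if A then B; end if;")
# Store one parse result on the first character of each of two tokens,
# roughly matching "about two parse result text properties per line".
line.put_property(0, 1, "parse-result", {"token": "if", "class": "block-start"})
line.put_property(13, 14, "parse-result", {"token": "end", "class": "block-end"})
```

Retrieval is then a lookup at a position, with no reparsing involved.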

-- 
-- Stephe
