emacs-orgmode
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [patch suggestion] Mitigating the poor Emacs performance on huge org


From: Ihor Radchenko
Subject: Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers
Date: Fri, 05 Jun 2020 16:18:59 +0800

> See also `gensym'. Do we really need to use it for something else than
> `invisible'? If not, the tool doesn't need to be generic.

For now, I also use it for buffer-local 'invisible stack. The stack is
needed to preserve folding state of drawers/blocks inside folded
outline. Though I am thinking about replacing the stack with separate
text properties, like 'invisible-outline-buffer-local +
'invisible-drawer-buffer-local + 'invisible-block-buffer-local.
Maintaining stack takes a noticeable percentage of CPU time in profiler.

org--get-buffer-local-text-property-symbol must take care about
situation with indirect buffers. When an indirect buffer is created from
some org buffer, the old value of char-property-alias-alist is carried
over. We need to detect this case and create new buffer-local symbol,
which is unique to the newly created buffer (but not create it if the
buffer-local property is already there). Then, the new symbol must
replace the old alias in char-property-alias-alist + old folding state
must be preserved (via copying the old invisibility specs into the new
buffer-local text property). I do not see how gensym can benefit this
logic.

> OK, but this may not be sufficient if we want to do slightly better than
> overlays in that area. This is not mandatory, though.

Could you elaborate on what can be "slightly better"? 

> As discussed before, I don't think you need to use `modification-hooks'
> or `insert-behind-hooks' if you already use `after-change-functions'.
>
> `after-change-functions' are also triggered upon text properties
> changes. So, what is the use case for the other hooks?

The problem is that `after-change-functions' cannot be a text property.
Only `modification-hooks' and `insert-in-front/behind-hooks' can be a
valid text property. If we use `after-change-functions', they will
always be triggered, regardless if the change was made inside or outside
folded region.

>> :asd:
>> :drawer:
>> lksjdfksdfjl
>> sdfsdfsdf
>> :end:
>>
>> If :asd: was inserted in front of folded :drawer:, changes in :drawer:
>> line of the new folded :asd: drawer would reveal the text between
>> :drawer: and :end:.
>>
>> Let me know what you think on this.

> I have first to understand the use case for `modification-hook'. But
> I think unfolding is the right thing to do in this situation, isn't it?

That situation arises because the modification-hooks from ":drawer:"
(they are set via text properties) only have information about the
:drawer:...:end: drawer before the modifications (they were set when
:drawer: was folded last time). So, they will only unfold a part of the
new :asd: drawer. I do not see a simple way to unfold everything without
re-parsing the drawer around the changed text.

Actually, I am quite unhappy with the performance of modification-hooks
set via text properties (I am using this patch on my Emacs during this
week). It appears that setting the text properties costs a significant
CPU time in practice, even though running the hooks is pretty fast.
I will think about a way to handle modifications using global
after-change-functions.

> `org--get-element-region-at-point' is certainly faster, but it is also
> wrong, unfortunately.
>
> Org syntax is not context-free grammar. If you try to parse it locally,
> starting from anywhere, it will fail at some point. For example, your
> function would choke in the following case:
>
>     [fn:1] Def1
>     #+begin_something
>
>     [fn:2] Def2
>     #+end_something

I see. 

> AFAIK, the only proper way to parse it is to start from a known position
> in the buffer. If you have no information about the buffer, the headline
> above is the position you want. With cache could help to start below.
> Anyway, in this particular case, you should not use
> `org--get-element-region-at-point'.

OK

Best,
Ihor

Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:

> Hello,
>
> Ihor Radchenko <yantar92@gmail.com> writes:
>
>> [The patch itself will be provided in the following email]
>
> Thank you.
>
>> I have found char-property-alias-alist variable that controls how Emacs
>> calculates text property value if the property is not set. This variable
>> can be buffer-local, which allows independent 'invisible states in
>> different buffers.
>
> Great. I didn't know about this variable!
>
>> All the implementation stays in
>> org--get-buffer-local-text-property-symbol, which takes care about
>> generating unique property name and mapping it to 'invisible (or any
>> other) text property.
>
> See also `gensym'. Do we really need to use it for something else than
> `invisible'? If not, the tool doesn't need to be generic.
>
>> I simplified the code as suggested, without using pairs of before- and
>> after-change-functions.
>
> Great!
>
>> Handling text inserted into folded/invisible region is handled by a
>> simple after-change function. After testing, it turned out that simple
>> re-hiding text based on 'invisible property of the text before/after the
>> inserted region works pretty well.
>
> OK, but this may not be sufficient if we want to do slightly better than
> overlays in that area. This is not mandatory, though.
>
>> Modifications to BEGIN/END line of the drawers and blocks is handled via
>> 'modification-hooks + 'insert-behind-hooks text properties (there is no
>> after-change-functions analogue for text properties in Emacs). The
>> property is applied during folding and the modification-hook function is
>> made aware about the drawer/block boundaries (via apply-partially
>> passing element containing :begin :end markers for the current
>> drawer/block). Passing the element boundary is important because the
>> 'modification-hook will not directly know where it belongs to. Only the
>> modified region (which can be larger than the drawer) is passed to the
>> function. In the worst case, the region can be the whole buffer (if one
>> runs revert-buffer).
>
> As discussed before, I don't think you need to use `modification-hooks'
> or `insert-behind-hooks' if you already use `after-change-functions'.
>
> `after-change-functions' are also triggered upon text properties
> changes. So, what is the use case for the other hooks?
>
>> It turned out that adding 'modification-hook text property takes a
>> significant cpu time (partially, because we need to take care about
>> possible existing 'modification-hook value, see
>> org--add-to-list-text-property). For now, I decided to not clear the
>> modification hooks during unfolding because of poor performance.
>> However, this approach would lead to partial unfolding in the following
>> case:
>>
>> :asd:
>> :drawer:
>> lksjdfksdfjl
>> sdfsdfsdf
>> :end:
>>
>> If :asd: was inserted in front of folded :drawer:, changes in :drawer:
>> line of the new folded :asd: drawer would reveal the text between
>> :drawer: and :end:.
>>
>> Let me know what you think on this.
>
> I have first to understand the use case for `modification-hook'. But
> I think unfolding is the right thing to do in this situation, isn't it?
>
>> My simplified implementation of element boundary parser
>> (org--get-element-region-at-point) appears to be much faster and also
>> uses much less memory in comparison with org-element-at-point.
>> Moreover, not all the places where org-element-at-point is called
>> actually need the full parsed element. For example, org-hide-drawer-all,
>> org-hide-drawer-toggle, org-hide-block-toggle, and
>> org--hide-wrapper-toggle only need element type and some information
>> about the element boundaries - the information we can get from
>> org--get-element-region-at-point.
>
> [...]
>
>> What do you think about the idea of making use of
>> org--get-element-region-at-point in org code base?
>
> `org--get-element-region-at-point' is certainly faster, but it is also
> wrong, unfortunately.
>
> Org syntax is not context-free grammar. If you try to parse it locally,
> starting from anywhere, it will fail at some point. For example, your
> function would choke in the following case:
>
>     [fn:1] Def1
>     #+begin_something
>
>     [fn:2] Def2
>     #+end_something
>
> AFAIK, the only proper way to parse it is to start from a known position
> in the buffer. If you have no information about the buffer, the headline
> above is the position you want. With cache could help to start below.
> Anyway, in this particular case, you should not use
> `org--get-element-region-at-point'.
>
> Hopefully, we don't need to parse anything. In an earlier message,
> I suggested a few checks to make on the modified text in order to decide
> if something should be unfolded, or not. I suggest to start from there,
> and fix any shortcomings we might encounter. We're replacing overlays:
> low-level is good in this area.
>
> WDYT?
>
>
> Regards,
>
> -- 
> Nicolas Goaziou

-- 
Ihor Radchenko,
PhD,
Center for Advancing Materials Performance from the Nanoscale (CAMP-nano)
State Key Laboratory for Mechanical Behavior of Materials, Xi'an Jiaotong 
University, Xi'an, China
Email: yantar92@gmail.com, ihor_radchenko@alumni.sutd.edu.sg



reply via email to

[Prev in Thread] Current Thread [Next in Thread]