Re: [patch suggestion] Mitigating the poor Emacs performance on huge org

emacs-orgmode

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [patch suggestion] Mitigating the poor Emacs performance on huge org

From:	Ihor Radchenko
Subject:	Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers
Date:	Mon, 08 Jun 2020 13:05:08 +0800

Hello,

[The patch itself will be provided in the following email]

I have four more updates from the previous version of the patch:

1. All the code handling modifications in folded drawers/blocks is moved
   to after-change-function. It works as follows:
   - if any text is inserted in the middle of hidden region, that text
     is also hidden;
   - if BEGIN/END line of a folded drawer do not match org-drawer-regexp
     and org-property-end-re, unfold it; 
   - if org-property-end-re or new org-outline-regexp-bol is inserted in
     the middle of the drawer, unfold it;
   - the same logic for blocks.

2. The text property stack is rewritten using char-property-alias-alist.
   This is faster in comparison with previous approach, which involved
   modifying all the text properties every timer org-flag-region was
   called. 

3. org-toggle-custom-properties-visibility is rewritten using text
   properties. I also took a freedom to implement a new feature here.
   Now, setting new `org-custom-properties-hide-emptied-drawers' to
   non-nil will result in hiding the whole property drawer if it
   contains only org-custom-properties.

4. This patch should work against 1aa095ccf. However, the merge was not
   trivial here. Recent commits actively used the fact that drawers and
   outlines are hidden via 'outline invisibility spec, which is not the
   case in this branch. I am not confident that I did not break anything
   during the merge, especially 1aa095ccf.

-----------------------------------------------------------------------
-----------------------------------------------------------------------

More details on the new implementation for tracking changes:

> I gave you a few ideas to quickly check if a change requires expansion,
> in an earlier mail. I suggest to start out from that. Let me know if you
> have questions about it.

All the code lives in org-after-change-function. I tried to incorporate
the earlier Nicholas' suggestions, except the parts related to
intersecting blocks and drawers. I am not sure if I understand the
parsing priority of blocks vs. drawers.

-----------------------------------------------------------------------
-----------------------------------------------------------------------

More details on the text property stack:

The earlier version of the code literally used stack to save
pre-existing 'invisibility specs in org-flag-region. This was done on
every invocation of org-flag-region, which made org-flag-region
significantly slower. I re-implemented the same feature using
char-property-alias-alist. Now, different invisibility specs live in
separate text properties and can be safely modified independently. The
specs are applied according to org--invisible-spec-priority-list. A side
effect of current implementation is that char-property-alias-alist is
fully controlled by org. All the pre-existing settings for 'invisible
text property will be overwritten by org.

> `gensym' is just a shorter, and somewhat standard way, to create a new
> uninterned symbol with a given prefix. You seem to re-invent it. What
> you do with that new symbol is orthogonal to that suggestion, of course.

I do not think that `gensym' is suitable here. We don't want a new
symbol every time org--get-buffer-local-invisible-property-symbol is
called. It should return the same symbol if it is called from the same
buffer multiple times.

-----------------------------------------------------------------------
-----------------------------------------------------------------------

More details on the org-toggle-custom-properties-visibility:

The implementation showcases how to introduce new invisibility specs to
org. Apart from expected (add-to-invisibility-spec 'org-hide-custom-property) 
one also needs to add the spec into org--invisible-spec-priority-list:

(add-to-list 'org--invisible-spec-priority-list 'org-hide-custom-property)

Searching for text with the given invisibility spec is done as
follows:

(text-property-search-forward (org--get-buffer-local-invisible-property-symbol 
'org-hide-custom-property) 'org-hide-custom-property t)

This last piece of code is probably not the most elegant. I am thinking
if creating some higher-level interface would be more reasonable here.
What do you think?

The new customisation `org-custom-properties-hide-emptied-drawers'
sounds logical for me since empty property drawers left after invoking
org-toggle-custom-properties-visibility are rather useless according to
my experience. If one already wants to hide parts of property drawers, I
do not see a reason to show leftover

:PROPERTIES:
:END:

-----------------------------------------------------------------------
-----------------------------------------------------------------------

More details on the merge with the latest master:

I tried my best to not break anything. However, I am not sure if I
understand all the recent commits. Could someone take a look if there is
anything suspicious in org-next-visible-heading?

Also, I have seen some optimisations making use of the fact that drawers
and headlines both use 'outline invisibility spec. This change in the
implementation details supposed to improve performance and should not be
necessary if this patch is going to be merged. Would it be possible to
refrain from abusing this particular implementation detail in the
nearest commits on master (unless really necessary)?

-----------------------------------------------------------------------
-----------------------------------------------------------------------

Further work:

I would like to finalise the current patch and work on other code using
overlays separately. This patch is already quite complicated as is. I do
not want to introduce even more potential bugs by working on things not
directly affected by this version of the patch.

Best,
Ihor

Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:

> Ihor Radchenko <yantar92@gmail.com> writes:
>
>>> See also `gensym'. Do we really need to use it for something else than
>>> `invisible'? If not, the tool doesn't need to be generic.
>>
>> For now, I also use it for buffer-local 'invisible stack. The stack is
>> needed to preserve folding state of drawers/blocks inside folded
>> outline. Though I am thinking about replacing the stack with separate
>> text properties, like 'invisible-outline-buffer-local +
>> 'invisible-drawer-buffer-local + 'invisible-block-buffer-local.
>> Maintaining stack takes a noticeable percentage of CPU time in profiler.
>>
>> org--get-buffer-local-text-property-symbol must take care about
>> situation with indirect buffers. When an indirect buffer is created from
>> some org buffer, the old value of char-property-alias-alist is carried
>> over. We need to detect this case and create new buffer-local symbol,
>> which is unique to the newly created buffer (but not create it if the
>> buffer-local property is already there). Then, the new symbol must
>> replace the old alias in char-property-alias-alist + old folding state
>> must be preserved (via copying the old invisibility specs into the new
>> buffer-local text property). I do not see how gensym can benefit this
>> logic.
>
> `gensym' is just a shorter, and somewhat standard way, to create a new
> uninterned symbol with a given prefix. You seem to re-invent it. What
> you do with that new symbol is orthogonal to that suggestion, of course.
>
>>> OK, but this may not be sufficient if we want to do slightly better than
>>> overlays in that area. This is not mandatory, though.
>>
>> Could you elaborate on what can be "slightly better"? 
>
> IIRC, I gave examples of finer control of folding state after a change.
> Consider this _folded_ drawer:
>
>   :BEGIN:
>   Foo
>   :END:
>
> Inserting ":END" in it should not unfold it, as it is currently the case
> with overlays,
>
>   :BEGIN
>   Foo
>   :END
>   :END:
>
> but a soon as the last ":" is inserted, the initial drawer could be
> expanded.
>
>   :BEGIN
>   Foo
>   :END:
>   :END:
>
> The latter case is not currently handled by overlays. This is what
> I call "slightly better".
>
> Also, note that this change is not related to opening and closing lines
> of the initial drawer, so sticking text properties on them would not
> help here.
>
> Another case is modifying those borders, e.g.,
>
>
>   :BEGIN:               :BEGIN:
>   Foo         ------>   Foo  
>   :END:                 :ND:
>
> which should expand the drawer. Your implementation catches this, but
> I'm pointing out that current implementation with overlays does not.
> Even though that's not strictly required for compatibility with
> overlays, it is a welcome slight improvement.
>
>>> As discussed before, I don't think you need to use `modification-hooks'
>>> or `insert-behind-hooks' if you already use `after-change-functions'.
>>>
>>> `after-change-functions' are also triggered upon text properties
>>> changes. So, what is the use case for the other hooks?
>>
>> The problem is that `after-change-functions' cannot be a text property.
>> Only `modification-hooks' and `insert-in-front/behind-hooks' can be a
>> valid text property. If we use `after-change-functions', they will
>> always be triggered, regardless if the change was made inside or outside
>> folded region.
>
> As discussed, text properties are local to the change, but require extra
> care when moving text around. You also observed serious overhead when
> using them.
>
> OTOH, even if `a-c-f' is not local, you can quickly determine if the
> change altered a folded element, so the overhead is limited, i.e.,
> mostly checking for a text property at a given buffer position.
>
> To be clear, I initially thought that text properties were a superior
> choice, but I changed my mind a while ago, and I thought you had, too.
> IOW, `after-change-functions' is the way to go, since you have no strong
> reason to stick to text properties for this kind of function.
>
>>>> :asd:
>>>> :drawer:
>>>> lksjdfksdfjl
>>>> sdfsdfsdf
>>>> :end:
>>>>
>>>> If :asd: was inserted in front of folded :drawer:, changes in :drawer:
>>>> line of the new folded :asd: drawer would reveal the text between
>>>> :drawer: and :end:.
>>>>
>>>> Let me know what you think on this.
>>
>>> I have first to understand the use case for `modification-hook'. But
>>> I think unfolding is the right thing to do in this situation, isn't it?
>>
>> That situation arises because the modification-hooks from ":drawer:"
>> (they are set via text properties) only have information about the
>> :drawer:...:end: drawer before the modifications (they were set when
>> :drawer: was folded last time). So, they will only unfold a part of the
>> new :asd: drawer. I do not see a simple way to unfold everything without
>> re-parsing the drawer around the changed text.
>
> Oh! I misread your message. I withdraw what I wrote. In this case, we
> don't want to unfold anything. The situation is not worse than what we
> have now, and trying to fix it would have repercussions down in the
> buffer, e.g., expanding drawers screen below.
>
> As a rule of thumb, I think we can pay attention to changes in the
> folded text, and its immediate surroundings (e.g., the opening line,
> which is not folded), but no further.
>
> As written above, slight changes are welcome, but let's not go overboard
> and parse a whole section just to know if we can expand a drawer.
>
>> Actually, I am quite unhappy with the performance of modification-hooks
>> set via text properties (I am using this patch on my Emacs during this
>> week). It appears that setting the text properties costs a significant
>> CPU time in practice, even though running the hooks is pretty fast.
>> I will think about a way to handle modifications using global
>> after-change-functions.
>
> That's better, IMO.
>
> I gave you a few ideas to quickly check if a change requires expansion,
> in an earlier mail. I suggest to start out from that. Let me know if you
> have questions about it.

-- 
Ihor Radchenko,
PhD,
Center for Advancing Materials Performance from the Nanoscale (CAMP-nano)
State Key Laboratory for Mechanical Behavior of Materials, Xi'an Jiaotong 
University, Xi'an, China
Email: yantar92@gmail.com, ihor_radchenko@alumni.sutd.edu.sg

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers, Ihor Radchenko, 2020/06/02
- Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers, Ihor Radchenko, 2020/06/02
  - Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers, Bastien, 2020/06/02
    - Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers, Ihor Radchenko, 2020/06/02
    - Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers, Bastien, 2020/06/02
    - Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers, Ihor Radchenko, 2020/06/02
- Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers, Ihor Radchenko, 2020/06/02
- Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers, Nicolas Goaziou, 2020/06/05
  - Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers, Ihor Radchenko, 2020/06/05
    - Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers, Nicolas Goaziou, 2020/06/05
    - Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers, Ihor Radchenko <=
    - Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers, Ihor Radchenko, 2020/06/08
    - Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers, Ihor Radchenko, 2020/06/08
    - Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers, Nicolas Goaziou, 2020/06/10
    - Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers, Ihor Radchenko, 2020/06/21
    - Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers, Nicolas Goaziou, 2020/06/21

Prev by Date: Re: New mailing list archive at https://orgmode/list/
Next by Date: Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers
Previous by thread: Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers
Next by thread: Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers
Index(es):
- Date
- Thread