From: Dmitry Gutov
Subject: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new `completion-filter-completions` API and deferred highlighting
Date: Mon, 16 Aug 2021 06:48:32 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0

On 14.08.2021 11:23, João Távora wrote:
Dmitry Gutov <dgutov@yandex.ru> writes:

Aside from the mutability/ownership issue,

On 13.08.2021 15:05, João Távora wrote:
If one removes these lines, the process becomes much faster, but there is a
problem with highlighting.  My idea is indeed to defer highlighting by not
setting the 'face property directly on that shared string, but setting some
other property that is read later from the shared string by compliant
frontends.
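
A rough sketch of how that could look (purely illustrative: the property
name 'completion-deferred-hilit' and both function names are invented
here, not taken from any posted patch):

  ;; The style records the match positions under a private text property
  ;; on the shared string instead of applying 'face to it directly.
  (defun example-style-mark-match (string regions)
    "Store REGIONS, a list of (BEG . END) pairs, on STRING for later use."
    (put-text-property 0 (length string)
                       'completion-deferred-hilit regions string)
    string)

  ;; A compliant frontend copies only the strings it actually displays
  ;; and turns the recorded regions into a real 'face on those copies.
  (defun example-frontend-highlight (string)
    "Return a highlighted copy of STRING, leaving the original intact."
    (let ((copy (copy-sequence string)))
      (dolist (region (get-text-property 0 'completion-deferred-hilit copy))
        (put-text-property (car region) (cdr region)
                           'face 'completions-common-part copy))
      copy))

(The first step still writes a property onto the shared string; whether
that counts as mutation is exactly what is argued about below.)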

I haven't done any direct benchmarking, but I'm pretty sure that this
approach cannot, by definition, be as fast as the non-mutating one.

Because you go through the whole list and mutate all of its elements,
you have to perform a certain bit of work, W, N times, where N is the
size of the whole list.

Let's call W the work that you perform N times in this situation.  In
the non-mutating case, let's call it Z.  So

W <= Z, because Z not only propertizes the string with a calculation of
faces but _also copies its character contents_.

As I pointed out later in the email you're replying to, copying won't happen N times.

Also, I think it's better to talk about copying rather than mutating.
As Eli points out, putting a text property in a string (my idea) is not
equivalent to "mutating" it.

In common industry terms, that's mutation. Lisp strings are not C
strings; they are aggregate objects.
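
To make that concrete with a toy example (not from any patch): every
reference to the same string object observes the added property.

  (let* ((shared (copy-sequence "hello"))
         (other shared))                  ; another reference to the same object
    (put-text-property 0 2 'face 'bold shared)
    (get-text-property 0 'face other))    ; => bold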

Whereas the deferred-mutation approach will have to do its bit (which
is likely more work, say WWW) only 20 times, or 100 times, or however
many completions are displayed.  And this is usually negligible.

I think you're falling into the same fallacy you briefly fell into in
the other bug report.  The flex and pcm styles (meaning
completion-pcm--hilit-commonality) have to go through all the
completions when deciding the score to attribute to each completion
that we already know matches the pattern.  That's because this scoring
is essential to sorting.  That's a given in both scenarios, copying and
non-copying.

First of all, note that scoring is only essential to the 'flex' style,
whereas the improvements we're discussing should benefit all styles,
and can be more pronounced if the scoring doesn't need to be performed.

But ok, let's talk about flex in particular.

Then, it's true that only a very small set of those eventually have to
be displayed to the user, depending on where she wants her scrolling
page to be.  So that's when you have to apply 'face' to, say, 20
strings, and that can indeed be pretty fast.  But where does the
information come from?

- Currently, it comes from the string's 'face' itself, which was copied
   entirely.

- In the non-copying approach, it must come from somewhere else.  One
   idea is that it comes from a new "private" property 'lazy-face', also
   in the string itself, but which was _not_ copied.  Another idea is
   just to remember the pattern and re-match it to those 20 strings.

Let's say that the cost to compute the score (on one completion) is S. And the cost to highlight it is H. The cost to copy a string is C (that would be amortized cost, including the subsequent GCs).

The current algorithm costs: N x (C + S + H)

C is unavoidable because of the existing API guarantees.

A non-mutating algorithm can cost:

  N x S (for sorting)

  +

  100 x (C + S + H) (let's say we didn't even cache the scoring info)

...where 100 is the number of displayed completions (the number is usually lower).

As we measured previously, copying is quite expensive. Even in the
above non-caching approach we win (N - 100) x (C + H) and, okay, lose
100 x S. For high values of N, it should be a clear win.
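
In code shape, that non-caching variant is roughly the following (a
sketch only: example-flex-score and example-flex-highlight stand in for
the style's scoring and highlighting routines, and LIMIT is the number
of completions actually shown):

  (require 'seq)

  (defun example-filter-completions (candidates pattern limit)
    "Return up to LIMIT highlighted copies of CANDIDATES, best matches first."
    ;; Scoring runs over every candidate: N x S.
    (let* ((scored (mapcar (lambda (cand)
                             (cons (example-flex-score cand pattern) cand))
                           candidates))
           (sorted (sort scored (lambda (a b) (> (car a) (car b))))))
      ;; Copying, re-matching and 'face application happen only LIMIT
      ;; times: LIMIT x (C + S + H).
      (mapcar (lambda (pair)
                ;; Copy only now, so callers never see a mutated shared string.
                (example-flex-highlight (copy-sequence (cdr pair)) pattern))
              (seq-take sorted limit))))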

I think the second alternative is always faster.

How big the difference is going to be, I can't say in advance, of
course, or whether it's going to be shadowed by some other performance
costs.  But the non-mutating approach should have the best optimization
potential when the list is long.

Don't think so.  I'm doing benchmarks, will post soon.

I'm guessing you just skip the C step in your benchmarks? That breaks
our current contract, though.

Still, Daniel's patch should provide a comparable performance
improvement. If you're saying it doesn't give numbers as good, I'll have
to give it a more thorough read and some testing tomorrow to comment on
that.




