Re: [O] ...
From: Carsten Dominik
Subject: Re: [O] ...
Date: Thu, 31 Jan 2013 12:59:29 +0100
Hi Bastien,
as you know, regular expressions are a language to do a programmed search for
text. The pattern string has to be compiled before it can be used. That
compilation is a costly process, so most languages that have pattern matching
use some kind of cache to store compiled patterns, so that frequently used
patterns can be reused without compilation.
I am aware of this very much from studying Perl. In Perl, a compiled pattern
is associated with a particular instance of a string. Often you build the
pattern by constructing it through concatenation of other parts etc. In Perl
this means that the pattern is recompiled each time a match is attempted. You
can work around this issue in Perl by telling it explicitly, and on the
programmer's authority, that "yes, this pattern is dynamically constructed, but
only once, I guarantee that it will not change, so compile it only once". So in
Perl the difference is:
    /pattern/      will match against pattern
    /$pattern/     will match against the pattern contained in the
                   variable $pattern, and recompilation will occur
                   each time
    /$pattern/o    will compile only once and trust the programmer.
So I am very aware of this speedup issue. And I thought that in Emacs, the
caching would work by associating a specific string object with the compiled
pattern. But the code Christopher pointed out seems to suggest that the
pattern cache also works for strings that are `equal', not only for strings
that are `eq'.
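To make the `eq'/`equal' distinction concrete, here is a small sketch. The
names my-demo-part, my-demo-re-a and my-demo-re-b are made up for illustration.
It only shows that two separately built strings are distinct objects with the
same contents; it does not inspect the regexp cache itself.

```elisp
;; Hypothetical names, for illustration only.
(defvar my-demo-part "xyz")
;; Two pattern strings built separately: same contents, different objects.
(defvar my-demo-re-a (concat "^" my-demo-part))
(defvar my-demo-re-b (concat "^" my-demo-part))

(eq my-demo-re-a my-demo-re-b)      ; => nil, not the same string object
(equal my-demo-re-a my-demo-re-b)   ; => t, same contents

;; Both strings describe the same pattern, so a cache keyed on contents
;; (rather than on object identity) could serve both searches.
(string-match-p my-demo-re-a "xyz") ; => 0
(string-match-p my-demo-re-b "xyz") ; => 0
```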
If this is the case, this means that there is only a very small difference
between
    (defconst my-pattern (concat "^" "xyz"))
    (re-search-forward my-pattern ....) ; many times in different functions
and
    (defconst my-partial-pattern "xyz")
    (re-search-forward (concat "^" my-partial-pattern) ....) ; many times
The difference is only the repeated concatenation operation, and not the
recompilation. I always thought that this would work differently, and that is
why a lot of regexps get constructed and then stored in variables or constants.
Of course this is also a good practice for readable and maintainable code, but
the impact on efficiency is not as big as I used to think. So when I saw
Christopher's initial patch, I thought a function to create
org-outline-regexp-bol would be a large burden on speed, but it now seems
that it would have only a minor impact.
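If someone wants to check this on their own Emacs, a rough timing sketch could
look like the following. The two function names are invented for this example,
the iteration count is arbitrary, and the numbers will depend on the machine
and the Emacs build, so treat it as a way to measure rather than as a result.

```elisp
(require 'benchmark)

(defconst my-partial-pattern "xyz")
(defconst my-pattern (concat "^" my-partial-pattern)) ; built once

;; Hypothetical helper: search N times with the stored pattern.
(defun my-count-matches-prebuilt (n)
  "Search N times using `my-pattern' and return the number of hits."
  (let ((hits 0))
    (with-temp-buffer
      (insert "xyz\n")
      (dotimes (_ n)
        (goto-char (point-min))
        (when (re-search-forward my-pattern nil t)
          (setq hits (1+ hits)))))
    hits))

;; Hypothetical helper: same search, but the pattern string is
;; concatenated again before every call.
(defun my-count-matches-rebuilt (n)
  "Search N times, rebuilding the pattern string each time."
  (let ((hits 0))
    (with-temp-buffer
      (insert "xyz\n")
      (dotimes (_ n)
        (goto-char (point-min))
        (when (re-search-forward (concat "^" my-partial-pattern) nil t)
          (setq hits (1+ hits)))))
    hits))

;; Elapsed seconds for each variant, as a two-element list.
(list (car (benchmark-run 1 (my-count-matches-prebuilt 100000)))
      (car (benchmark-run 1 (my-count-matches-rebuilt 100000))))
```

If the cache really is keyed on string contents, the two timings should differ
mainly by the cost of the repeated concat.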
Still, I think making a local variable in buffers with orgstruct-mode is also
a good way to get the functionality Christopher wants.
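A minimal sketch of the buffer-local approach, assuming the goal is a heading
pattern that can differ per buffer. All names here (my-outline-regexp-bol,
my-set-local-outline-regexp) and the example prefix are hypothetical and are
not the actual Org code.

```elisp
;; Hypothetical names throughout; not the actual Org implementation.
(defvar my-outline-regexp-bol "^\\*+ "
  "Default heading pattern, anchored at the beginning of a line.")

(defun my-set-local-outline-regexp (prefix)
  "Make the heading pattern buffer-local, allowing PREFIX before the stars."
  (set (make-local-variable 'my-outline-regexp-bol)
       (concat "^" (regexp-quote prefix) "\\*+ ")))

;; Example: in a buffer where headings live inside comments such as
;; ";; * Heading", a mode hook could call
;;   (my-set-local-outline-regexp ";; ")
;; Other buffers keep the default value.
```

The pattern is then built once per buffer, and code that searches with
my-outline-regexp-bol does not need to know which buffer it runs in.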
Clearer?
- Carsten
On 31 jan. 2013, at 12:22, Bastien <address@hidden> wrote:
> Hi Carsten and Christopher,
>
> Carsten Dominik <address@hidden> writes:
>
>> I meant to copy the list, I am doing this again now.
>>
>> Wow, I was not aware that Emacs caches by content, this is an important
>> piece of information. I guess this removed the main concern I had. Thanks
>> for looking it up in the code and showing it to me. I am not sure if I
>> understand that code completely, but I trust your judgment.
>
> I'm not sure I have all the background to understand the issue at
> stake... can anyone educate me? Thanks!
>
> --
> Bastien
--
There is no unscripted life. Only a badly scripted one. -- Brothers Bloom