emacs-devel

Re: master 544db1e: Faster grep pattern for identifiers


From: Mattias Engdegård
Subject: Re: master 544db1e: Faster grep pattern for identifiers
Date: Wed, 15 Sep 2021 18:29:31 +0200

On 15 Sep 2021, at 17:56, Eli Zaretskii <eliz@gnu.org> wrote:

> Doesn't this change the semantics of the "word"?  The Grep notion of
> the word is not necessarily identical to that of Emacs, since the
> latter depends on the major mode.  The comment in the deleted code
> says that much, AFAICT.  Or what am I missing?

Sorry, I should have written a more descriptive commit message.

First of all, there is no risk of false positives because the grep output is 
filtered for occurrences of the sought identifier in post-processing. Thus, the 
only correctness risk is for false negatives.

The effect of -w is to reject matches with a word char immediately before or 
after a match. This is exactly what the previous glued-on regexps did.

Both the old and new approaches are sound with respect to the programming 
languages they are used for, because what grep considers to be word chars are 
alphanumeric characters (as determined by the locale) and underscore. Thus, a 
false negative would require an identifier to occur immediately before or after 
such a character, and the lexical rules for supported languages don't allow 
that.

There could be exceptions. For example, ancient Smalltalk used _ as the assignment 
operator because Xerox's character set was based on the 1963 ASCII draft where 
that code was used for a left-pointing arrow. That wouldn't work with our 
scheme, now or before.
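
To make the effect of -w and the Smalltalk exception concrete, here is a minimal Python sketch. It only approximates grep: the word-character class is restricted to ASCII (grep consults the locale), and `word_match` is a hypothetical helper name, not anything from grep or Emacs.

```python
import re

# Assumption: ASCII-only stand-in for grep's word characters
# (alphanumerics and underscore; real grep uses the locale).
WORD_CHAR = "[A-Za-z0-9_]"

def word_match(identifier, line):
    """True if identifier occurs in line with no word char right before or after it."""
    pattern = rf"(?<!{WORD_CHAR}){re.escape(identifier)}(?!{WORD_CHAR})"
    return re.search(pattern, line) is not None

# Ordinary lexical rules: the identifier is delimited by non-word characters.
print(word_match("foo", "x = foo(1)"))     # True
print(word_match("foo", "x = foobar(1)"))  # False: word char after the match
print(word_match("foo", "my_foo = 1"))     # False: underscore before the match

# Old Smalltalk-style assignment with _ as the operator: the identifier x
# is really there, but the underscore counts as a word char, so it is missed.
print(word_match("x", "x_3"))              # False: a false negative
```

The last call is the kind of false negative described above: it can only arise when the language lets an identifier sit directly next to a word character.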

One might wonder why we use -w at all given the post-processing. It reduces the 
grep output so that the post-processor isn't overwhelmed by false positives: 
consider a search for the identifier `i`. That said, -w has a nonzero cost, so 
omitting it for searches of identifiers above a certain length is likely to be 
advantageous, especially when the grep tool is slow. We haven't done that at 
this time.
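
The role of -w as a pre-filter can be sketched in Python as a two-stage search. This is an illustration under assumptions, not the actual xref code: the `search` and `bounded` helpers, the ASCII character classes, and the Lisp-like "symbol" class where `-` belongs to identifiers are all hypothetical stand-ins for grep and for the mode-aware post-processing.

```python
import re

WORD = "[A-Za-z0-9_]"     # assumption: ASCII approximation of grep's word chars
SYMBOL = "[A-Za-z0-9_-]"  # assumption: hypothetical Lisp-like mode, '-' is part of symbols

def bounded(identifier, chars):
    """Regex matching identifier with no character from chars on either side."""
    return re.compile(rf"(?<!{chars}){re.escape(identifier)}(?!{chars})")

def search(identifier, lines, use_w):
    """Return (candidates from the coarse stage, hits after post-processing)."""
    coarse = bounded(identifier, WORD) if use_w else re.compile(re.escape(identifier))
    candidates = [line for line in lines if coarse.search(line)]
    precise = bounded(identifier, SYMBOL)  # stands in for the post-processor
    hits = [line for line in candidates if precise.search(line)]
    return candidates, hits

lines = [
    "(setq i 0)",
    "(while (< i limit)",
    '(message "iteration")',
    "(list-items)",
    "(setq i-max 3)",
]

with_w, hits_w = search("i", lines, use_w=True)
without_w, hits_plain = search("i", lines, use_w=False)
print(len(with_w), len(without_w))  # 3 5
print(hits_w == hits_plain)         # True
```

In this sketch the final hits are the same either way; the word-boundary pre-filter only shrinks the candidate list that the second stage has to examine, which matters most for short identifiers such as `i`.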



