emacs-devel

Re: master ea93326: Add `union' and `intersection' to rx (bug#37849)


From: Mattias Engdegård
Subject: Re: master ea93326: Add `union' and `intersection' to rx (bug#37849)
Date: Fri, 13 Dec 2019 18:43:47 +0100

On 13 Dec 2019, at 18:13, Stefan Monnier <address@hidden> wrote:

>    (rx-define-charset ident-chars "a-z0-9")
>    (rx-define-charset op-chars (union "+*/" ?-))
> 
> and then
> 
>     ... (any (union ident-chars op-chars)) ...
> 
> Is it worth the trouble?

Not if it requires a special set of definition constructs, no.

> We could try and automatically determine inside `or` when regexp-opt can
> be used.  That doesn't sound like much fun, tho.

That is done now, to some extent. Some history, for those not keeping up:

(or STRING1 STRING2 ...) did use regexp-opt for some time, but it had the 
unfortunate effect that the match order wasn't preserved, which made the 
outcome unpredictable. For example, if the target string is "abc", then (or "a" 
"ab") would match "ab", but (or "a" "ab" digit) would match "a".

The code was then changed to only use regexp-opt when it doesn't affect 
observed behaviour, which is what we have today.
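
To make the ordering issue concrete, here is a minimal sketch. It uses Python's 
`re` module purely as a stand-in, because it follows the same rule as Emacs 
regexps for alternation: alternatives are tried left to right and the first one 
that matches wins. The factored pattern is hand-written to represent the kind of 
output an optimiser could produce; it is not taken from rx or regexp-opt.

```python
import re

target = "abc"

# Plain alternation: alternatives are tried in order, so "a" wins over "ab".
first_wins = re.match(r"a|ab", target).group(0)

# A factored form of the same two strings (hand-written, illustrative only).
# It matches the same set of strings but now prefers the longer one.
factored = re.match(r"ab?", target).group(0)

# With a non-string alternative mixed in, the strings can no longer be
# factored as a whole, so the written order decides again.
mixed = re.match(r"a|ab|\d", target).group(0)

print(first_wins, factored, mixed)
```

The first and third patterns yield "a" while the factored one yields "ab", which 
is the kind of inconsistency described above.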

However, the longest match is typically what is wanted, not the first. The user 
can order the strings carefully before putting them inside an 'or', but that (1) 
doesn't compose well, (2) is error-prone, and (3) is useless manual work that 
regexp-opt does anyway.
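
As a rough illustration of that manual work, the sketch below sorts a set of 
strings longest first before joining them into an alternation. It is again 
written in Python as a stand-in for the regexp semantics; the helper name 
`longest_first_alternation` and the sample operator list are hypothetical and 
not part of rx or regexp-opt.

```python
import re

def longest_first_alternation(strings):
    """Hypothetical helper: build an alternation that prefers longer strings.

    Duplicates are dropped, and strings of equal length keep their given order.
    """
    ordered = sorted(dict.fromkeys(strings), key=len, reverse=True)
    return "|".join(re.escape(s) for s in ordered)

ops = ["+", "+=", "++", "-"]

# Naive join in the given order: "+" shadows "+=" and "++".
naive = "|".join(re.escape(s) for s in ops)
naive_match = re.match(naive, "+= 1").group(0)

# Sorted longest first: "+=" is tried before "+".
sorted_match = re.match(longest_first_alternation(ops), "+= 1").group(0)

print(naive_match, sorted_match)
```

The sort has to be applied to the complete set of strings. Sorting two lists 
separately and then concatenating them can still put a short string ahead of a 
longer one, which is one way the manual approach fails to compose.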

Something like (longest-or STRING...) has been proposed, but was not well 
received. A solution is still sought (preferably one that doesn't involve 
rewriting the regexp engine).



