emacs-devel

Re: Pattern matching on match-string groups #elisp #question


From: Mattias Engdegård
Subject: Re: Pattern matching on match-string groups #elisp #question
Date: Sat, 27 Feb 2021 19:10:59 +0100

On 27 Feb 2021, at 15:39, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

> Nevertheless, I went ahead with this change (after remembering that
> wrapping the code in `ignore` should eliminate the extra warnings).

So where does that leave us with the rx pattern? There's still the interleaved 
match data problem, which I've tried to address below.

> It's clearly The Right Thing™.

Perhaps it is; a proposed diff is attached below which treats zero and one 
variable specially and uses a list followed by immediate destructuring for >1 
variables. (By the way, using a backquote form to generate backquote forms is 
annoying.)
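
For readers without the diff at hand, here is a rough sketch of the idea for the two-variable case. It is not the attached patch and the real expansion will differ; the `my-demo-` function names and the key=value regexp are made up for illustration. The first function uses the rx pattern directly; the second spells out by hand the "return a list, then destructure it at once" shape.

```elisp
;;; -*- lexical-binding: t -*-
(require 'pcase)
(require 'rx)

;; What the user writes: two variables bound by `let' forms in the
;; rx pattern.
(defun my-demo-parse-with-rx (s)
  (pcase s
    ((rx bos (let key (+ alpha)) "=" (let value (+ digit)) eos)
     (cons key value))))

;; Hand-written approximation of the list-based scheme: the matcher
;; extracts every group right after `string-match' succeeds, so no
;; other code can clobber the match data in between, and the result
;; is destructured immediately by a backquote pattern.
(defun my-demo-parse-by-hand (s)
  (pcase s
    ((app (lambda (str)
            (and (string-match
                  (rx bos (group (+ alpha)) "=" (group (+ digit)) eos)
                  str)
                 (list (match-string 1 str)
                       (match-string 2 str))))
          `(,key ,value))
     (cons key value))))
```

Both functions should return a cons of the two substrings for an input such as "abc=123", and nil when the string does not match.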

>> My guess is that a vector may be faster than a list if there are more than N 
>> elements, for some N.
> 
> I'll let you benchmark it to determine the N.

I now have, and am sad to say that a list is always faster for any practical 
N (I didn't bother trying more than 30), although the difference narrows as N 
grows. This is despite the destructuring code becoming considerably bigger for 
lists (as we get a long chain of tests and branches) than for vectors. It all 
boils down to vector construction being more expensive than list construction.
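
A minimal sketch of how such a comparison could be set up for N = 3, for anyone who wants to repeat it. This is not the benchmark I ran; the `my-demo-` names and the repetition count are arbitrary, and the timings depend heavily on whether the code is byte-compiled, so no numbers are given here.

```elisp
;;; -*- lexical-binding: t -*-
(require 'benchmark)
(require 'pcase)

;; Construct a list and destructure it immediately.
(defun my-demo-via-list (a b c)
  (pcase (list a b c)
    (`(,x ,y ,z) (+ x y z))))

;; Same thing with a vector as the intermediate container.
(defun my-demo-via-vector (a b c)
  (pcase (vector a b c)
    (`[,x ,y ,z] (+ x y z))))

;; `benchmark-run' returns (ELAPSED GC-COUNT GC-TIME); only the
;; elapsed time in seconds is kept for each variant.
(defun my-demo-compare ()
  (list :list   (car (benchmark-run 100000 (my-demo-via-list 1 2 3)))
        :vector (car (benchmark-run 100000 (my-demo-via-vector 1 2 3)))))
```

Byte-compile the definitions before calling `my-demo-compare`, otherwise the interpreter overhead is likely to swamp the difference being measured.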

Maybe we should pack N>1 variables into N-1 cons cells by using the last cdr 
(an improper list), but list* is not a primitive, so it may be a loss for N>M, 
for some M>2.
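
To make the packing concrete, a small sketch for N = 3 (the `my-demo-` names are hypothetical). The nested `cons` calls are written out by hand; `cl-list*` from cl-lib would build the same structure, but as an ordinary function call rather than a primitive.

```elisp
;;; -*- lexical-binding: t -*-
(require 'pcase)

;; Proper list: three cons cells, terminated by nil.
(defun my-demo-pack-proper (a b c)
  (list a b c))

;; Improper list: two cons cells, with the last value stored in the
;; final cdr instead of a third cell.
(defun my-demo-pack-improper (a b c)
  (cons a (cons b c)))

;; Destructure the improper form with a dotted backquote pattern.
(defun my-demo-unpack-improper (packed)
  (pcase packed
    (`(,a ,b . ,c) (list a b c))))
```

The saving is one cons cell per match, at the cost of either open-coding the nested conses or paying for a non-primitive call.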

> currently `string-match-p` is ever so slightly
> slower than `string-match` and since we clobber the match data in other
> cases, we might as well clobber the match data in this case as well: any
> code which presumes the match data isn't affected by some other code
> which uses regular expressions is quite confused.

Right; I'm sticking to string-match for the time being.
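
For reference, the behavioural difference in question, as a small sketch (the function name is made up). `string-match-p` leaves the match data alone, whereas `string-match` overwrites it:

```elisp
;;; -*- lexical-binding: t -*-

;; Returns the start of match 0 as seen after each of two later
;; matching calls: first `string-match-p', then `string-match'.
(defun my-demo-match-data-effect ()
  (string-match "b+" "aabbb")            ; match data: match 0 starts at 2
  (let (after-p after-plain)
    (string-match-p "a" "xa")            ; does not touch the match data
    (setq after-p (match-beginning 0))
    (string-match "a" "xa")              ; clobbers it: match 0 starts at 1
    (setq after-plain (match-beginning 0))
    (list after-p after-plain)))
```

Code that needs the earlier match data to survive a call into unknown code should wrap that call in `save-match-data` rather than rely on the callee.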

> I don't think it's much more complicated than your current constant
> folding: when you see a let-binding of a variable to a *constructor*,
> stash that expression in your context as a "partially known constant"
> and then do the constant folding when you see a matching *destructor*.

Doable, but definitely not low-hanging fruit. Since pcase has made a dog's 
breakfast of the destructuring code it's not straightforward to recognise it as 
such in the optimiser. Efforts needed elsewhere first!

> go back to the last option it tried and accept it even though it failed
> to match.  It still sucks, but maybe it'll give someone else a better idea?

Sounds like pcase--dead-end would fit then, at least as an internal name.

Or pcase--Sherlock-Holmes.

Attachment: rx-pcase-list-destructure.diff
Description: Binary data
