bug#43100: 28.0.50; pcase not binding variables conditionally

bug-gnu-emacs
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
bug#43100: 28.0.50; pcase not binding variables conditionally

From:	Pip Cet
Subject:	bug#43100: 28.0.50; pcase not binding variables conditionally
Date:	Sat, 5 Sep 2020 14:36:50 +0000
On Wed, Sep 2, 2020 at 2:16 PM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> >> I just want the complexity to be predictable without having to guess
> >> which optimization will be applicable.
> > So it's better to have a predictably-bad `append' rather than a
> > sometimes-bad one?
> In my book, yes.

That's a very unusual book! This isn't real-time programming or
crypto, so I fail to see the damage "faster is better" might cause
here.

> > But how does that affect ,@? We could make that predictably bad, too!
>
> I think the expectation (coming from the use of ,@ in non-patterns)
> would be that it is fairly cheap, but yes, I guess it's an option.
> Could be coupled with a warning when ,@ can be replaced by an efficient
> `. ,`.

I don't think we should warn about that; we should simply do the best
we can do, and warn the user when that's surprisingly bad (i.e. when
we have to iterate over the list building sublists).

> >> >> More specifically, I'd rather not choose a semantics that imposes
> >> >> duplicating the branch body, since we have no control over its size and
> >> >> that can hence lead to potential code size explosion.
> >> > You're right, and it's a good thing that the duplication of the branch
> >> > body is a fixable implementation detail rather than something imposed
> >> > by the semantics.
> >> It's only fixable if you disregard the "more or less" above.
> >> I find it to be a pretty bad wrinkle in the semantics.
> > So setq the outer binding if it changed?
> Oh, yuck!  That'd be adding injury to insult.

Would work, though!

> >> >> The design of `pcase` assumes you want to optimize away the tests that
> >> >> are common to the various patterns.  That can't be done with dynamic
> >> >> patterns.
> >> > Or it's a bit more difficult, at least...
> >> I think it's more than "a bit more difficult", because deciding how to
> >> optimize the tests will almost always take significantly more time than
> >> just doing the tests.  So in order to do it dynamically and be useful,
> >> you still need to have 2 stages, where you optimize at one stage and
> >> then use the result several times later (to recoup the cost of the
> >> optimization).
> > Wait, I think there was a misunderstanding here. I don't mean that the
> > pcase pattern should depend structurally on let-bound variables
> > appearing in it. That does sound impossible to me (except with dynamic
> > scoping).
>
> To me "dynamic patterns" means patterns which are only know at run time
> because the pattern itself depends on a runtime value, as in something like:
>
>     (let ((pat (if FOO '(pred symbolp) '1)))
>       ...
>       (pcase V
>         ...
>         ((match pat) EXP)
>         ...))
>
> Not sure what you meant by it, then.

I didn't mean anything by "dynamic patterns" because I didn't use the term :-)

What I meant was using SYMBOL as short-hand for (pred (equal SYMBOL))
when it's not in a white-list of symbols to be bound instead of
compared. That was the intention of my example. I think dynamic
patterns, as you use the term, should be possible (IIUC, they're not,
with lexical binding), but punitively slow (I don't think we have to
worry about that last part...), but that's an entirely different
discussion.

> >> We have `help-function-arglist`, so it's not that hard.
> >> But it's not guaranteed to work.
> > Oh, thanks for pointing that out.  It's not very good, though, is it?
>
> It's good enough for `C-h o`.

C-h o says

(+ &rest NUMBERS-OR-MARKERS) and (1+ NUMBER)

help-function-arglist says

(&rest rest) and (arg1)

The latter is much less useful, IMHO.

> >> I'm sorry, I don't understand what those sexps (and their «optional"
> >> non-final parts») have to do with pcase's handling of `or` patterns
> >> where the different branches don't bind the same set of variables.
> > Well, maybe I'm just missing an obvious way of doing it.
> Could be, but I have no idea what "it" is.

Matching something like Emacs defuns, which may have a docstring or
not, I'd like to do

`(defun ,symbol ,@(or `(,(and (pred stringp) docstring))) `())  . ,body)

It seems wasteful to have two almost-identical patterns to match the
variants, and I'd like docstring to have a useful default value.

> >> >> > disallowing the modification of name in X.
> >> >> That's rather hard to do (and I don't see what would be the benefit 
> >> >> here).
> >> > I meant adding a cautionary note about it in the documentation, not
> >> > actively preventing it.
> >> So we're replacing the current "don't do that" with another "don't do
> >> that", neither of which is detected by the byte-compiler?
> > Yes. Emacs Lisp isn't free of "don't do that"s, and reducing the
> > "don't do that" space drastically is better than pointing out the tiny
> > fraction of it which remains as evidence that we shouldn't do the
> > reduction in the first place.
>
> I don't see your proposal as reducing the "don't do that" space.

Currently, it's "don't bind variables only in some branches of a pcase
or". With my proposal, it's "feel free to bind variables only in some
branches of a pcase or, but don't modify them". It's a strict subset
relationship.

> >> >> The current implementation amounts to "we should signal an error but we
> >> >> don't bother doing so and just warn against it in the manual".
> >> >> Patch welcome ;-)
> >> > You mean a patch that would make pcase less powerful by making what I
> >> > want to do impossible rather than merely difficult?
> >> The way I see it, it would make what you want to do possible because
> >> what you suggest would merely give meaning to programs previously invalid.
> >> In contrast, with the current behavior your proposal implies breaking
> >> backward compatibility.
> > I think what you mean is you'd rather break backward compatibility in
> > two steps rather than one, by first causing errors, then redefining
> > the new errors not to happen? Because if so, I totally agree.
>
> That's right.  The first (breaking) step would be "my fault" and would
> hence free you to do the second step without introducing
> incompatibilities ;-)

Hooray! Let's do that, then.

> > IOW, I'd like ` (as a pcase macro) to behave like ` already does:
> > constant-time `(,@l), linear-time `(,@l 3).
>
> I suggest you start with a `pip-backquote` pcase-macro and experiment
> with it, to see how useful it is.  You'll then presumably have more
> concrete examples to argue for the replacement of the current `
>
> > And, again, performance of ` is unpredictable unless you know the
> > details of the optimization. Should we get rid of ` next?

> AFAIK performance is O(N) where N is the number of cons cells in the
> output (minus those cons-cells which are preserved as-is, I guess but
> that doesn't make much of a difference).

It means (,@l) is O(1), but (,@l 3) is O(N).

> That seems quite different
> from going from O(1) to O(N²).

So going from O(1) to O(N) is okay, but O(N^2) is not? I think it
would be okay to throw a "use append!" warning, or even an error, if
we encounter a situation in which there is more than one ,@ without a
maximum length (i.e. situations in which we fail to be O(N), given
constant-time predicates).

> >> But that presumes a C implementation of `pcase--check-length` since
> >> (< (length X) 4) can be a very bad idea when X is a long list.
> > Even a Lisp implementation that isn't a wrapper around (length X)
> > would avoid unnecessary predicates.
>
> Indeed.  I won't be the one writing the code which tries to guess when
> that's more efficient and when it's less efficient.

Hmm. After running some benchmarks, it seems like a C implementation
of check-length is slightly faster, a Lisp implementation of
check-length is slightly slower, there's less code (in terms of cons
cells) generated with check-length, but the bytecode looks better
without it. All that is without any predicates and in the case of a
matching pattern, so I'd say it's a wash then. In theory, of course,
it's easy to come up with an algorithm that uses both approaches to
achieve minimum complexity, but that's difficult to implement.

So my tentative proposal would be to use `append' or ,@ if you want
length reasoning, plain ` and , if you don't. It might be simpler
always to use it, but I don't want (pcase x (a b) (cons a b)) to emit
a function call.

> > (I think it's interesting that you can't evaluate "proper" pcase at
> > run time; at least, I don't see how you'd implement a pcase-dyn
> > function that evaluates its second argument etc. before interpreting
> > it. It's easy to do with dynamic binding. I think that's because our
> > implementation of lexical binding is too limited, but I'm not sure.)
>
> I don't understand what your `pcase-dyn` would be expected to do, so
> I don't know how dynamic scoping might coming into the picture.

It's what you called dynamic patterns above.

The problem is you cannot eval a pcase form and achieve lexical
binding of the variables used in the bindings. There's no
lexical-scoping equivalent of

(eval `(pcase ',exp ,bindings))

I expect you'll object that that's usually not what you want, anyway,
which is correct but leaves the unusual cases where you know that the
pcase will have been memoized and thus be fairly cheap.

> > The strong limitation is "you can only add new pcase patterns if they
> > have predictable complexity (in code size, run time, or memory usage,
> > presumably)".
>
> Not at all.  You can define any pcase pattern you like with
> `pcase-defmacro`, just like you can define any function or macro you
> like with `defun` and `defmacro`.

Well, I wish :-)

pcase-defmacro is limited to non-recursive patterns.

Thanks for clarifying that the kingdom of predictable complexity does
not extend to pcase-defmacro.

> > By solving polynomial roots? I think it's better to use
> > (pcase-defmacro * ...) for a Kleene star macro...which it turns out
> > isn't precisely trivial to implement with lexical scoping.
>
> Why would lexical scoping make a difference?

Because pcase patterns cannot be recursive (or can they?), but you can
recurse, given dynamic scoping, by eval'ing a pcase form.

By the way, I'm also quite confused by this comment:

;; - implement (not PAT).  This might require a significant redesign.

(pcase-defmacro not (pat)
  `(app (lambda (expval) (not (pcase expval (,pat t))))
        (pred identity)))

would work, wouldn't it? Or did I totally misunderstand what a pcase
not would be expected to do?
[Prev in Thread]
Current Thread
[Next in Thread]
bug#43100: 28.0.50; pcase not binding variables conditionally, Pip Cet, 2020/09/02
- bug#43100: 28.0.50; pcase not binding variables conditionally, Stefan Monnier, 2020/09/02
  - bug#43100: 28.0.50; pcase not binding variables conditionally, Pip Cet <=
    - bug#43100: 28.0.50; pcase not binding variables conditionally, Stefan Monnier, 2020/09/05
Prev by Date: bug#13521: `sort-lines' on the empty region
Next by Date: bug#41905: Small change: In Customize, display name with with spaces, but keep symbol name underneath.
Previous by thread: bug#43100: 28.0.50; pcase not binding variables conditionally
Next by thread: bug#43100: 28.0.50; pcase not binding variables conditionally
Index(es):
- Date
- Thread