[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Fixing ill-conditioned regular expressions. Proof of concept.
From: |
Stefan Monnier |
Subject: |
Re: Fixing ill-conditioned regular expressions. Proof of concept. |
Date: |
Fri, 13 Mar 2015 18:53:38 -0400 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (gnu/linux) |
> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> ;;
> ;; f i x - r e . e l
This is not using the standard file format. I strongly suggest you use
something like auto-insert-mode which will get that right for you.
> ;; The AST is an internal representation of the regular expression string.
> ;; Elements are represented as follows:
Why invent a new format, instead of using rx? I'm pretty sure someone
has already provided a parser from string to rx format (and of course
we also already have a dumper from rx to string).
> ;; PTR and AD
> ;; ----------
> ;; Whilst building and transforming trees, what we really need is pointers to
> ;; either the car or cdr of a cons cell, to enable us to
> ;; overwrite them.
I think your code will gain a lot in readability if you made it
functional: take an rx expression and return a *new* rx expression
without modifying the original one.
As it stands, I find it rather impenetrable.
> Return 'shortened, t or nil.
Our conventions say this should be
Return `shortened', t or nil.
> (if (eq ad 'cdr) (error "fix-re--R+R*->R+ got 'cdr"))
> (let ((e0 (if (eq ad 'car) (car ptr) (cadr ptr)))
> (e1 (if (eq ad 'car) (cadr ptr) (caddr ptr))))
> (when (and (consp e0) (consp e1))
> (let ((op0 (car e0))
> (op1 (car e1))
> ;; (link (if (eq ad 'car) (cddr ptr) (cdddr ptr)))
> op
> )
> (when
> (and (memq op0 '(+ * \?))
> (memq op1 '(+ * \?))
> (equal (cdr e0) (cdr e1))) ; Both Rs are the same.
> (cond
> ((and (eq op0 '\?) (eq op1 '\?)) ; Cant combine R?R?
> nil)
> ((and (eq op0 '+) (eq op1 '+)) ; R+R+ -> RR+
> (fix-re--chop-+* ptr 'car)
> t)
> (t
> (setq op
> (cond
> ((or (eq op0 '+) (eq op1 '+)) '+)
> ((or (eq op0 '*) (eq op1 '*)) '*)))
> (setcar e0 op)
> (fix-re--chop (if (eq ad 'car) ptr (cdr ptr)) 'cadr)
> 'shortened))
> )))))
For example, I think that if you make it functional rather than
imperative, the above could turn into something along the lines of
(pcase ptr ;We don't have `ad' any more.
(`((,op0 . ,r0)
(,op1 . ,r1)
. (and ,rest
(guard (and (equal r0 r1)
(memq op0 '(+ * \?))
(memq op1 '(+ * \?)))))))
(cond
((and (eq op0 '\?) (eq op1 '\?)) ; Cant combine R?R?
ptr)
((and (eq op0 '+) (eq op1 '+)) ; R+R+ -> RR+
`(,(car r0) ,(car ptr) . ,rest))
(t
`((,(cond
((or (eq op0 '+) (eq op1 '+)) '+)
((or (eq op0 '*) (eq op1 '*)) '*)) . r0)
. ,rest)))))
-- Stefan>
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- Re: Fixing ill-conditioned regular expressions. Proof of concept.,
Stefan Monnier <=