Patch for lookaround assertion in regexp
From: Tomohiro MATSUYAMA
Subject: Patch for lookaround assertion in regexp
Date: Thu, 4 Jun 2009 08:04:25 +0900
Hi all,
I have attached a patch that enables
lookaround assertions in regexps
with the following syntax:
* Positive lookahead assertion
\(?=...\)
* Negative lookahead assertion
\(?!...\)
* Positive lookbehind assertion
\(?<=...\)
* Negative lookbehind assertion
\(?<!...\)
Basically, they work the same as Perl's.
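To illustrate the four assertions listed above, here is a small sketch in Python, whose `re` module follows the same Perl-style semantics. This is only a stand-in and not the patch itself: Python writes the groups as (?=...) without backslashes, and the sample text and variable names are made up for the illustration.

```python
import re

# Hypothetical sample text, chosen only for this illustration.
text = "foobar foobaz 100MB xMB"

# Positive lookahead: "foo" only where "bar" follows.
pos_ahead = [m.start() for m in re.finditer(r"foo(?=bar)", text)]

# Negative lookahead: "foo" only where "bar" does NOT follow.
neg_ahead = [m.start() for m in re.finditer(r"foo(?!bar)", text)]

# Positive lookbehind: "MB" only where a digit precedes it.
pos_behind = [m.start() for m in re.finditer(r"(?<=[0-9])MB", text)]

# Negative lookbehind: "MB" only where no digit precedes it.
neg_behind = [m.start() for m in re.finditer(r"(?<![0-9])MB", text)]

print(pos_ahead, neg_ahead, pos_behind, neg_behind)
```

In each case the assertion only tests the surrounding text; the match itself still covers just "foo" or "MB".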
Spec:
* Any pattern is allowed in a lookahead assertion.
* Nested lookaround assertions are allowed.
* Capturing is allowed only in positive lookahead/lookbehind assertions.
* Duplication is allowed after such an assertion.
* Variable-length patterns are NOT yet allowed in lookbehind assertions.
[x] \(?<=[0-9]+\)MB
[o] \(?<=[0-9][0-9][0-9][0-9]\)MB
* A lookahead assertion across the start bound is not allowed in re-search-backward.
(re-search-backward "\(?<=a\)b") in buffer "abca_|_b" (where _|_ marks point)
will move to the first "ab".
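The fixed-length lookbehind restriction and the capturing rule in the spec above have close analogues in Python's `re` module, so they can be sketched there. Again, this is an independent implementation used only for illustration, not the patched Emacs engine, and the input strings are made up.

```python
import re

# Fixed-width lookbehind (exactly four digits) is accepted,
# like the [o] case in the spec.
fixed = re.search(r"(?<=[0-9][0-9][0-9][0-9])MB", "1024MB")

# Variable-width lookbehind is rejected when the pattern is compiled,
# like the [x] case in the spec.
try:
    re.compile(r"(?<=[0-9]+)MB")
    variable_ok = True
except re.error:
    variable_ok = False

# Capturing inside a positive lookahead: the group is filled
# even though the assertion consumes no text.
captured = re.search(r"(?=([0-9]+))", "abc123").group(1)

print(fixed.start(), variable_ok, captured)
```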
As for performance, I think lookahead assertions pose no problem,
but lookbehind assertions are somewhat costly.
You can check that this patch works properly with the test case I have attached,
which also measures performance:
src/emacs --script regex-test.el perf
I saw that a lookbehind assertion takes about 5 times as long as an ordinary
regexp written to behave like lookbehind. I think I have to improve its performance.
Anyway, please try it and review it.
And if you like it, please merge it.
I believe that some people really want to use it.
Regards,
MATSUYAMA Tomohiro
regex-test.el
Description: Binary data
emacs-regex.patch
Description: Binary data