[cp-patches] RFC: gnu.regexp: fixed bugs in RETokenOneOf
From: Ito Kazumitsu
Subject: [cp-patches] RFC: gnu.regexp: fixed bugs in RETokenOneOf
Date: Sun, 22 Jan 2006 11:50:05 +0900 (JST)
There is a testcase in gnu/testlet/java/util/regex/Pattern/testdata2
which looks like
/(ab|ab*)bc/
abc
0: abc
1: a
The problem here is that once "ab" matches /ab/ and /ab*/,
the backtracking to "a", which also matches /ab*/, does not occur,
and the whole match fails.
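For reference, the expected result of this testcase can be checked against java.util.regex, the API that the testlet exercises. This is only a minimal sketch: the class name OneOfBacktrackCheck and the helper groups are made up for illustration, and the output shown is what a regex engine that backtracks into the second alternative should produce.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OneOfBacktrackCheck {
    // Returns group 0..n of the first match, or null if nothing matches.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        String[] g = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            g[i] = m.group(i);
        }
        return g;
    }

    public static void main(String[] args) {
        // "ab" is tried first, but then "bc" cannot match the remaining "c".
        // The engine has to fall back to /ab*/ matching just "a".
        String[] g = groups("(ab|ab*)bc", "abc");
        if (g == null) {
            System.out.println("no match");
        } else {
            System.out.println("0: " + g[0]); // 0: abc
            System.out.println("1: " + g[1]); // 1: a
        }
    }
}
```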
And here is my fix.
ChangeLog
2006-01-22 Ito Kazumitsu <address@hidden>
* gnu/regexp/REToken.java(empty): Made Cloneable.
* gnu/regexp/RETokenOneOf.java(match): RE.java(match):
Use separate methods matchN and matchP depending on the
boolean negative.
(matchN): New method used when negative. Done as before.
(matchP): New method used when not negative. Each token is
tried not by itself but by a clone of it.
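
The idea behind matchP is that each alternative gets chained to whatever follows the alternation, so that a failure in the rest of the pattern sends the matcher back into the alternative. The following toy sketch shows that idea only. It does not use gnu.regexp, and all names in it (AlternationSketch, Node, literal, star, seq, oneOf, demo) are hypothetical.

```java
import java.util.function.IntPredicate;

public class AlternationSketch {
    // A node matches at index i and hands every candidate end index to
    // "next" (the rest of the pattern) until "next" accepts one.
    interface Node {
        boolean match(String s, int i, IntPredicate next);
    }

    static Node literal(String lit) {
        return (s, i, next) -> s.startsWith(lit, i) && next.test(i + lit.length());
    }

    // Greedy c*: try the longest run first, then give characters back.
    static Node star(char c) {
        return (s, i, next) -> {
            int end = i;
            while (end < s.length() && s.charAt(end) == c) {
                end++;
            }
            for (int e = end; e >= i; e--) {
                if (next.test(e)) {
                    return true;
                }
            }
            return false;
        };
    }

    static Node seq(Node a, Node b) {
        return (s, i, next) -> a.match(s, i, j -> b.match(s, j, next));
    }

    // Each option is tried together with the continuation, so a failure
    // after the alternation falls through to the next option.
    static Node oneOf(Node... options) {
        return (s, i, next) -> {
            for (Node o : options) {
                if (o.match(s, i, next)) {
                    return true;
                }
            }
            return false;
        };
    }

    // Whole-string match of the toy equivalent of /(ab|ab*)bc/.
    static boolean demo(String input) {
        Node re = seq(oneOf(literal("ab"), seq(literal("a"), star('b'))),
                      literal("bc"));
        return re.match(input, 0, end -> end == input.length());
    }

    public static void main(String[] args) {
        System.out.println(demo("abc"));   // true
        System.out.println(demo("abbbc")); // true
        System.out.println(demo("ac"));    // false
    }
}
```

In this sketch the continuation is passed as an argument. The patch below gets a similar effect by cloning each option token and chaining the clone to this.next.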
Index: classpath/gnu/regexp/REToken.java
===================================================================
RCS file: /cvsroot/classpath/classpath/gnu/regexp/REToken.java,v
retrieving revision 1.2
diff -u -r1.2 REToken.java
--- classpath/gnu/regexp/REToken.java 2 Jul 2005 20:32:15 -0000 1.2
+++ classpath/gnu/regexp/REToken.java 22 Jan 2006 02:37:20 -0000
@@ -38,12 +38,21 @@
package gnu.regexp;
import java.io.Serializable;
-abstract class REToken implements Serializable {
+abstract class REToken implements Serializable, Cloneable {
protected REToken next = null;
protected REToken uncle = null;
protected int subIndex;
+ public Object clone() {
+ try {
+ REToken copy = (REToken) super.clone();
+ return copy;
+ } catch (CloneNotSupportedException e) {
+ throw new Error(); // doesn't happen
+ }
+ }
+
protected REToken(int subIndex) {
this.subIndex = subIndex;
}
Index: classpath/gnu/regexp/RETokenOneOf.java
===================================================================
RCS file: /cvsroot/classpath/classpath/gnu/regexp/RETokenOneOf.java,v
retrieving revision 1.2
diff -u -r1.2 RETokenOneOf.java
--- classpath/gnu/regexp/RETokenOneOf.java 2 Jul 2005 20:32:15 -0000 1.2
+++ classpath/gnu/regexp/RETokenOneOf.java 22 Jan 2006 02:37:20 -0000
@@ -71,52 +71,58 @@
}
boolean match(CharIndexed input, REMatch mymatch) {
- if (negative && (input.charAt(mymatch.index) == CharIndexed.OUT_OF_BOUNDS))
+ return negative ? matchN(input, mymatch) : matchP(input, mymatch);
+ }
+
+ private boolean matchN(CharIndexed input, REMatch mymatch) {
+ if (input.charAt(mymatch.index) == CharIndexed.OUT_OF_BOUNDS)
return false;
REMatch newMatch = null;
REMatch last = null;
REToken tk;
- boolean isMatch;
for (int i=0; i < options.size(); i++) {
tk = (REToken) options.elementAt(i);
REMatch tryMatch = (REMatch) mymatch.clone();
if (tk.match(input, tryMatch)) { // match was successful
- if (negative) return false;
+ return false;
+ } // is a match
+ } // try next option
+
+ ++mymatch.index;
+ return next(input, mymatch);
+ }
- if (next(input, tryMatch)) {
- // Add tryMatch to list of possibilities.
- if (last == null) {
- newMatch = tryMatch;
- last = tryMatch;
- } else {
- last.next = tryMatch;
- last = tryMatch;
- }
- } // next succeeds
+ private boolean matchP(CharIndexed input, REMatch mymatch) {
+ REMatch newMatch = null;
+ REMatch last = null;
+ REToken tk;
+ for (int i=0; i < options.size(); i++) {
+ tk = (REToken)((REToken) options.elementAt(i)).clone();
+ tk.chain(this.next);
+ tk.setUncle(this.uncle);
+ tk.subIndex = this.subIndex;
+ REMatch tryMatch = (REMatch) mymatch.clone();
+ if (tk.match(input, tryMatch)) { // match was successful
+ if (last == null) {
+ newMatch = tryMatch;
+ last = tryMatch;
+ } else {
+ last.next = tryMatch;
+ last = tryMatch;
+ }
} // is a match
} // try next option
if (newMatch != null) {
- if (negative) {
- return false;
- } else {
- // set contents of mymatch equal to newMatch
+ // set contents of mymatch equal to newMatch
- // try each one that matched
- mymatch.assignFrom(newMatch);
- return true;
- }
+ // try each one that matched
+ mymatch.assignFrom(newMatch);
+ return true;
} else {
- if (negative) {
- ++mymatch.index;
- return next(input, mymatch);
- } else {
- return false;
- }
+ return false;
}
-
- // index+1 works for [^abc] lists, not for generic lookahead (--> index)
}
void dump(StringBuffer os) {