bug-gnulib


From: Stanislav Brabec
Subject: Re: [PATCH][TRY2] fix false multi-byte matches in some regular expressions
Date: Mon, 27 Feb 2012 16:42:00 +0100

Paul Eggert wrote:

> Stanislav, if you find a short and reproducible
> test case, please cc: it to bug-gnulib so that I can
> add it to the gnulib test cases too.

Comparing the sed and C structures, I found an additional condition
needed to reproduce this bug: in pure C, with the strings I have, the
code has to initialize the fastmap. (sed does, grep probably does not,
awk probably does.) I did not investigate whether the problem can be
reproduced without a fastmap.

Use of fastmap creates something like "match candidates" that are
analyzed later.
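
For readers unfamiliar with the GNU regex interface, here is a minimal
sketch of how a caller attaches a fastmap before searching. It is not
the test case from the bugzilla entry and does not try to trigger the
bug; the helper name search_with_fastmap and its -3 error code are made
up for illustration, and it assumes glibc's <regex.h> with _GNU_SOURCE.

```c
#define _GNU_SOURCE   /* exposes the GNU regex API and field names in glibc */
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: search TEXT for PATTERN with a fastmap attached.
   Returns the match offset, -1 for no match, -2 for an internal error
   in re_search, or -3 (our own code) if setup failed.  */
static int
search_with_fastmap (const char *pattern, const char *text)
{
  struct re_pattern_buffer buf;
  int pos;

  memset (&buf, 0, sizeof buf);

  /* A non-NULL fastmap tells the matcher to precompute which first
     bytes can start a match.  It needs one entry per char value.  */
  buf.fastmap = malloc (1 << (sizeof (char) * 8));
  if (buf.fastmap == NULL)
    return -3;

  /* Note: re_set_syntax changes process-wide state.  */
  re_set_syntax (RE_SYNTAX_POSIX_BASIC);
  if (re_compile_pattern (pattern, strlen (pattern), &buf) != NULL)
    {
      regfree (&buf);           /* also releases the fastmap */
      return -3;
    }

  /* re_search would build the fastmap lazily; doing it here makes
     the step explicit.  */
  re_compile_fastmap (&buf);

  pos = re_search (&buf, text, strlen (text), 0, strlen (text), NULL);
  regfree (&buf);
  return pos;
}
```

Calling search_with_fastmap ("b", "abc") scans "abc" and uses the
fastmap to skip start positions whose first byte cannot begin a match.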

This patch contains a C test case:
http://sourceware.org/bugzilla/show_bug.cgi?id=13637#c4

Note that glibc contains another 32 regex test cases. If you do not want
to import them all (and their skeleton), it is easy to modify the C test
to make it standalone:
- Delete the last 3 lines.
- Change "static int do_test" to "int main".
- To compile outside the regex sources, remove the "regex_internal.h"
  include and replace SBC_MAX with 1 << (sizeof (char) * 8).

> Also, if you figure out what's causing sed to loop,
> that'd be even better: quite possibly it's another bug
> in the regex code, one that we should squash.

The loop was caused by the sed code. str_append() seems to be a
sed-specific function; I didn't find anything similar in the regex code
or libc. This bug is independent of the regex bug: it is a regression
that appears only in the latest versions of sed, when an incomplete
multi-byte character is processed. (The code compared an unsigned
integer with < 0, expecting to catch -2.)
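
To illustrate that class of mistake, here is a small sketch. It is not
the actual sed code; the function names are hypothetical, and it assumes
the value came from a routine in the style of mbrtowc(), which reports
an incomplete multi-byte sequence as (size_t) -2.

```c
#include <stddef.h>

/* Buggy check: size_t is unsigned, so n < 0 is always false and
   the "incomplete character" value (size_t) -2 is never caught.  */
static int
is_incomplete_buggy (size_t n)
{
  return n < 0;
}

/* Working check: compare against the sentinel converted to the
   same unsigned type.  */
static int
is_incomplete_fixed (size_t n)
{
  return n == (size_t) -2;
}
```

With the buggy check, a caller that loops until the conversion succeeds
never sees the error and can treat the huge unsigned value as a length.
Compilers can flag the comparison; GCC does so with -Wtype-limits.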

-- 
Best Regards / S pozdravem,

Stanislav Brabec
software developer
---------------------------------------------------------------------
SUSE LINUX, s. r. o.                          e-mail: address@hidden
Lihovarská 1060/12                            tel: +49 911 7405384547
190 00 Praha 9                                  fax: +420 284 028 951
Czech Republic                                    http://www.suse.cz/



