bug-gnulib

Re: large integer truncation in regex module


From: Paul Eggert
Subject: Re: large integer truncation in regex module
Date: Sat, 26 May 2012 23:59:07 -0700

On 05/26/2012 01:24 PM, Steven M. Schweda wrote:

>    On VAX (a 32-bit system), it's fatal.

That's worse than a warning, and I can see similar problems
might happen on other 32-bit-only systems, so we should fix it.

>> Generally speaking we prefer 'if (xxx)' to '#if xxx' where
>> either will do, because the former is easier to read and
>> reason about.
> 
> ... Ever since I started programming
> computers, the goal was always a program which worked correctly

Yes, and that's why I wrote "where either will do".
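
To illustrate the trade-off, here is a minimal sketch, not code from
gnulib: the function names are hypothetical and 'unsigned long' merely
stands in for the word type.  Both functions compute the same answer.
With 'if', every branch is parsed and type-checked on every platform,
which helps readability but also means a dead branch must still compile.
With '#if', the untaken branch is never seen by the compiler proper.

```c
#include <limits.h>

/* Hypothetical helper: number of 'unsigned long' words needed to hold
   a 128-bit ASCII bitmap, chosen with an ordinary 'if'.  All branches
   are compiled everywhere, so each must be valid on every platform.  */
static int
words_with_if (void)
{
  if (ULONG_MAX == 0xffffffff)
    return 4;                   /* 32-bit words */
  else if (ULONG_MAX >> 31 >> 1 == 0xffffffff)
    return 2;                   /* 64-bit words */
  else
    return 0;                   /* some other width: not handled here */
}

/* The same decision made by the preprocessor.  Only the selected
   branch reaches the compiler.  */
static int
words_with_pp (void)
{
#if ULONG_MAX == 0xffffffff
  return 4;
#elif ULONG_MAX >> 31 >> 1 == 0xffffffff
  return 2;
#else
  return 0;
#endif
}
```

Here either form will do, because every branch is valid C regardless of
the word width.  A branch that mentions a type or macro missing on some
platform is the case where 'if' will not do.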

I installed the following patch, which I hope fixes the problem.
It's a tad cleaner anyway, as now there's only one instance
of the magic hex pattern.
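
For readers wondering about the '<< 31 << 1' in the patch, here is a
standalone sketch of the idea.  It is an illustration under assumptions,
not the gnulib source: word_t, WORD_BITS, join_halves and fill_word_char
are hypothetical names, with 'unsigned long' standing in for
bitset_word_t.  Shifting by 32 in one step would be undefined when the
word is only 32 bits wide, even in a branch that is never executed, so
the shift is split into two steps that are each smaller than the width.

```c
#include <limits.h>

/* Hypothetical stand-in for bitset_word_t; may be 32 or 64 bits wide.  */
typedef unsigned long word_t;

enum { WORD_BITS = sizeof (word_t) * CHAR_BIT };

/* Join two 32-bit halves into one word.  Each shift count is below 32,
   so the expression is well defined for 32-bit and 64-bit words alike;
   on a 32-bit word the high half is simply shifted out.  */
static word_t
join_halves (word_t hi, word_t lo)
{
  return hi << 31 << 1 | lo;
}

/* Store the ASCII word-character bitmap in OUT, which must have room
   for 4 words.  Return the number of words used, or 0 if the word
   width is neither 32 nor 64 bits.  */
static int
fill_word_char (word_t *out)
{
  word_t bits0 = 0x00000000;    /* chars 0..31: none */
  word_t bits1 = 0x03ff0000;    /* chars 32..63: '0'..'9' */
  word_t bits2 = 0x87fffffe;    /* chars 64..95: 'A'..'Z' and '_' */
  word_t bits3 = 0x07fffffe;    /* chars 96..127: 'a'..'z' */

  if (WORD_BITS == 64)
    {
      out[0] = join_halves (bits1, bits0);
      out[1] = join_halves (bits3, bits2);
      return 2;
    }
  else if (WORD_BITS == 32)
    {
      out[0] = bits0;
      out[1] = bits1;
      out[2] = bits2;
      out[3] = bits3;
      return 4;
    }
  return 0;
}
```

Because the four constants fit in 32 bits, they need no UINT64_C or
UINT32_C wrapper, and the hex pattern is written only once.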

---
 ChangeLog     |    9 +++++++++
 lib/regcomp.c |   16 ++++++++++------
 2 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 1252100..1676ab9 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@
+2012-05-26  Paul Eggert  <address@hidden>
+
+       regex: don't assume uint64_t or uint32_t
+       * lib/regcomp.c (init_word_char): Don't assume that the types
+       uint64_t and uint32_t exist.  The C standard doesn't guarantee
+       them, and on some 32-bit compilers there is no uint64_t.
+       Problem reported by Gianluigi Tiesi in
+       <http://lists.gnu.org/archive/html/bug-gnulib/2012-03/msg00154.html>.
+
 2012-05-25  Jim Meyering  <address@hidden>
 
        maint.mk: add strncpy-prohibiting syntax-check rule
diff --git a/lib/regcomp.c b/lib/regcomp.c
index b51a9a6..7996dc0 100644
--- a/lib/regcomp.c
+++ b/lib/regcomp.c
@@ -956,18 +956,22 @@ init_word_char (re_dfa_t *dfa)
   int ch = 0;
   if (BE (dfa->map_notascii == 0, 1))
     {
+      bitset_word_t bits0 = 0x00000000;
+      bitset_word_t bits1 = 0x03ff0000;
+      bitset_word_t bits2 = 0x87fffffe;
+      bitset_word_t bits3 = 0x07fffffe;
       if (BITSET_WORD_BITS == 64)
        {
-         dfa->word_char[0] = UINT64_C (0x03ff000000000000);
-         dfa->word_char[1] = UINT64_C (0x07fffffe87fffffe);
+         dfa->word_char[0] = bits1 << 31 << 1 | bits0;
+         dfa->word_char[1] = bits3 << 31 << 1 | bits2;
          i = 2;
        }
       else if (BITSET_WORD_BITS == 32)
        {
-         dfa->word_char[0] = UINT32_C (0x00000000);
-         dfa->word_char[1] = UINT32_C (0x03ff0000);
-         dfa->word_char[2] = UINT32_C (0x87fffffe);
-         dfa->word_char[3] = UINT32_C (0x07fffffe);
+         dfa->word_char[0] = bits0;
+         dfa->word_char[1] = bits1;
+         dfa->word_char[2] = bits2;
+         dfa->word_char[3] = bits3;
          i = 4;
        }
       else
-- 
1.7.6.5




