bug-bison
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Assertion fail report: src/parse-gram.c:3268


From: Akim Demaille
Subject: Re: Assertion fail report: src/parse-gram.c:3268
Date: Tue, 3 Aug 2021 10:48:43 +0200

Hi,

> Le 29 avr. 2021 à 17:03, Ahcheong Lee <ahcheong.lee@gmail.com> a écrit :
> 
> Hello,
> I report an assertion failure test input generated during a fuzzing test.
> 
> src/parse-gram.c:3268: char *unquote(const char *): Assertion `*cp == '"''
> failed.
> 
> I test on the latest revision of bison uploaded on git.
> You can reproduce the failure by
> ./bison <test input>

Thanks for the bug report.  Its cause is visibly a copy-paste error.  I 
probably copied a \ coming from the terminal where the line was wrapped.

Cheers!

commit 4ee711d4f4f692dbd5133e867ea3364baecf52ff
Author: Akim Demaille <akim.demaille@gmail.com>
Date:   Tue Aug 3 10:19:37 2021 +0200

    scan: fix typo in UTF-8 escape
    
    We had:
    
    ```
    -mbchar    ...|\xF0[\x\90-\xBF]([\x80-\xBF]{2})|...
    +mbchar    ...|\xF0[\x90-\xBF]([\x80-\xBF]{2})|...
    ```
    
    so a precise sequence that matches the incorrect regex can let NUL
    bytes pass through, which triggers an assertion violation downstream.
    It is a pity that Flex does not report an error for such input.
    
    Reported by Ahcheong Lee <ahcheong.lee@gmail.com>.
    <https://lists.gnu.org/r/bug-bison/2021-04/msg00003.html>
    
    * src/scan-gram.l (mbchar): Fix the bad regex.
    * tests/input.at (Invalid inputs): Check that case.

diff --git a/src/scan-gram.l b/src/scan-gram.l
index 160bda622..f55429ede 100644
--- a/src/scan-gram.l
+++ b/src/scan-gram.l
@@ -160,7 +160,7 @@ xint      0[xX][0-9abcdefABCDEF]+
 eol       \n|\r\n
 
  /* UTF-8 Encoded Unicode Code Point, from Flex's documentation. */
-mbchar    
[\x09\x0A\x0D\x20-\x7E]|[\xC2-\xDF][\x80-\xBF]|\xE0[\xA0-\xBF][\x80-\xBF]|[\xE1-\xEC\xEE\xEF]([\x80-\xBF]{2})|\xED[\x80-\x9F][\x80-\xBF]|\xF0[\x\90-\xBF]([\x80-\xBF]{2})|[\xF1-\xF3]([\x80-\xBF]{3})|\xF4[\x80-\x8F]([\x80-\xBF]{2})
+mbchar    
[\x09\x0A\x0D\x20-\x7E]|[\xC2-\xDF][\x80-\xBF]|\xE0[\xA0-\xBF][\x80-\xBF]|[\xE1-\xEC\xEE\xEF]([\x80-\xBF]{2})|\xED[\x80-\x9F][\x80-\xBF]|\xF0[\x90-\xBF]([\x80-\xBF]{2})|[\xF1-\xF3]([\x80-\xBF]{3})|\xF4[\x80-\x8F]([\x80-\xBF]{2})
 
 /* Zero or more instances of backslash-newline.  Following GCC, allow
    white space between the backslash and the newline.  */
diff --git a/tests/input.at b/tests/input.at
index c639436f7..5f298bc9a 100644
--- a/tests/input.at
+++ b/tests/input.at
@@ -83,7 +83,8 @@ AT_CLEANUP
 AT_SETUP([Invalid inputs])
 
 AT_DATA([input.y],
-[[\000\001\002\377?
+[[%header "\360\000\200\210"
+\000\001\002\377?
 "\000"
 %%
 ?
@@ -98,37 +99,41 @@ AT_PERL_REQUIRE([[-pi -e 's/\\(\d{3})/chr(oct($1))/ge' 
input.y]])
 AT_BISON_CHECK([-fcaret input.y], [1], [], [stderr])
 
 # Autotest's diffing, when there are NUL bytes, just reports "binary
-# files differ".  So don't leave NUL bytes.
-AT_PERL_CHECK([[-p -e 's{([\0\377])}{sprintf "\\x%02x", ord($1)}ge' stderr]], 
[],
-[[input.y:1.1-2: error: invalid characters: '\0\001\002\377?'
-    1 | \x00\xff?
+# files differ".  So don't leave NUL bytes.  And don't leave invalid
+# mbchars either: escape raw binary.
+AT_PERL_CHECK([[-p -e 's{([\0\200\210\360\377])}{sprintf "\\x%02x", 
ord($1)}ge' stderr]], [],
+[[input.y:1.11: error: invalid null character
+    1 | %header "\xf0\x00\x80\x88"
+      |           ^
+input.y:2.1-2: error: invalid characters: '\0\001\002\377?'
+    2 | \x00\xff?
       | ^~
-input.y:2.2: error: invalid null character
-    2 | "\x00"
+input.y:3.2: error: invalid null character
+    3 | "\x00"
       |  ^
-input.y:4.1: error: invalid character: '?'
-    4 | ?
+input.y:5.1: error: invalid character: '?'
+    5 | ?
       | ^
-input.y:5.14: error: invalid character: '}'
-    5 | default: 'a' }
+input.y:6.14: error: invalid character: '}'
+    6 | default: 'a' }
       |              ^
-input.y:6.1: error: invalid character: '%'
-    6 | %&
+input.y:7.1: error: invalid character: '%'
+    7 | %&
       | ^
-input.y:6.2: error: invalid character: '&'
-    6 | %&
+input.y:7.2: error: invalid character: '&'
+    7 | %&
       |  ^
-input.y:7.1-17: error: invalid directive: '%a-does-not-exist'
-    7 | %a-does-not-exist
+input.y:8.1-17: error: invalid directive: '%a-does-not-exist'
+    8 | %a-does-not-exist
       | ^~~~~~~~~~~~~~~~~
-input.y:8.1: error: invalid character: '%'
-    8 | %-
+input.y:9.1: error: invalid character: '%'
+    9 | %-
       | ^
-input.y:8.2: error: invalid character: '-'
-    8 | %-
+input.y:9.2: error: invalid character: '-'
+    9 | %-
       |  ^
-input.y:9.1-10.0: error: missing '%}' at end of file
-    9 | %{
+input.y:10.1-11.0: error: missing '%}' at end of file
+   10 | %{
       | ^~
 ]])
 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]