bug-bash
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

fix ctype usage bugs in readline


From: Eric Blake
Subject: fix ctype usage bugs in readline
Date: Fri, 23 Jan 2015 17:35:35 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0

On platforms where 'char' is signed, using ctype functions on
'char' arguments produces undefined behavior for arguments that
are larger than 127.  In particular, on Cygwin, isspace((char)255)
MUST return the same 0 result as isspace(EOF), while there are some
single-byte locales in which isspace((unsigned char)255) returns 1.

POSIX is explicit that the ctype functions are only well-defined on
the range of EOF plus the values of unsigned char.

Compilation on Cygwin with -Wall intentionally flags suspicious use of
suspect conditions (I've tried in the past to also get glibc to flag
such suspicious usage, but so far no one has listened to me):

vi_mode.c: In function 'rl_vi"fword':
vi_mode.c:557:7: error: array subscript has type 'char'
[-Werror=char-subscripts]
        if (_rl_isindent (rl_line_buffer[rl_point]))
        ^

and many other similar warnings.

The following patch silences the particular warnings in vi_mode.c,
although I have not yet audited ALL places in the bash/readline source
code to see if there is any further abuse of ctype functions called with
out-of-range inputs.  Furthermore, my patch takes liberty with
IN_CTYPE_DOMAIN - note that the only places where I used to_uchar were
in macros where EOF as input should validly return 0, but since
isspace((unsigned char)EOF) might return non-zero, I took the shortcut
that because all of those macros were gated by IN_CTYPE_DOMAIN, I could
let IN_CTYPE_DOMAIN filter out EOF.  Of course, my hack is a gross
misnomer (EOF _is_ in the ctype domain - it is the only integer value
outside the range of unsigned char that is afforded that standing).  So
you may want to do a more thorough audit of any place that you are
explicitly passing an 'int' which could be either EOF or an unsigned
char value, rather than assuming that these macros will only ever be
used on 'char', and/or change the naming or other aspects of how the
gating is done.

diff --git i/lib/readline/chardefs.h w/lib/readline/chardefs.h
index 1fa1b08..7cb8575 100644
--- i/lib/readline/chardefs.h
+++ w/lib/readline/chardefs.h
@@ -1,6 +1,6 @@
 /* chardefs.h -- Character definitions for readline. */

-/* Copyright (C) 1994-2009 Free Software Foundation, Inc.
+/* Copyright (C) 1994-2009, 2015 Free Software Foundation, Inc.

    This file is part of the GNU Readline Library (Readline), a library
    for reading lines of text with interactive input and history editing.
@@ -67,13 +67,14 @@
 #define UNCTRL(c) _rl_to_upper(((c)|control_character_bit))

 #if defined STDC_HEADERS || (!defined (isascii) && !defined (HAVE_ISASCII))
-#  define IN_CTYPE_DOMAIN(c) 1
+#  define IN_CTYPE_DOMAIN(c) (c != EOF)
 #else
 #  define IN_CTYPE_DOMAIN(c) isascii(c)
 #endif
+#define to_uchar(c) ((unsigned char) (c))

 #if !defined (isxdigit) && !defined (HAVE_ISXDIGIT) && !defined
(__cplusplus)
-#  define isxdigit(c)   (isdigit((c)) || ((c) >= 'a' && (c) <= 'f') ||
((c) >= 'A' && (c) <= 'F'))
+#  define isxdigit(c)   (isdigit((to_uchar (c))) || ((c) >= 'a' && (c)
<= 'f') || ((c) >= 'A' && (c) <= 'F'))
 #endif

 #if defined (CTYPE_NON_ASCII)
@@ -87,13 +88,13 @@

 /* Beware:  these only work with single-byte ASCII characters. */

-#define ISALNUM(c)     (IN_CTYPE_DOMAIN (c) && isalnum (c))
-#define ISALPHA(c)     (IN_CTYPE_DOMAIN (c) && isalpha (c))
-#define ISDIGIT(c)     (IN_CTYPE_DOMAIN (c) && isdigit (c))
-#define ISLOWER(c)     (IN_CTYPE_DOMAIN (c) && islower (c))
-#define ISPRINT(c)     (IN_CTYPE_DOMAIN (c) && isprint (c))
-#define ISUPPER(c)     (IN_CTYPE_DOMAIN (c) && isupper (c))
-#define ISXDIGIT(c)    (IN_CTYPE_DOMAIN (c) && isxdigit (c))
+#define ISALNUM(c)     (IN_CTYPE_DOMAIN (c) && isalnum (to_uchar (c)))
+#define ISALPHA(c)     (IN_CTYPE_DOMAIN (c) && isalpha (to_uchar (c)))
+#define ISDIGIT(c)     (IN_CTYPE_DOMAIN (c) && isdigit (to_uchar (c)))
+#define ISLOWER(c)     (IN_CTYPE_DOMAIN (c) && islower (to_uchar (c)))
+#define ISPRINT(c)     (IN_CTYPE_DOMAIN (c) && isprint (to_uchar (c)))
+#define ISUPPER(c)     (IN_CTYPE_DOMAIN (c) && isupper (to_uchar (c)))
+#define ISXDIGIT(c)    (IN_CTYPE_DOMAIN (c) && isxdigit (to_uchar (c)))

 #define _rl_lowercase_p(c)     (NON_NEGATIVE(c) && ISLOWER(c))
 #define _rl_uppercase_p(c)     (NON_NEGATIVE(c) && ISUPPER(c))

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]