bug-gnulib

wcwidth: work around MacOS X bug


From: Bruno Haible
Subject: wcwidth: work around MacOS X bug
Date: Sat, 7 Jul 2007 23:37:24 +0200
User-agent: KMail/1.5.4

This patch works around the MacOS X 10.3 bug in the wcwidth() function
that was uncovered by Vincent Lefevre [1]: wcwidth(0x0301) (COMBINING
ACUTE ACCENT) returns 1 there, which leads to bugs in 'ls' (coreutils).

[1] http://lists.gnu.org/archive/html/bug-coreutils/2007-01/msg00110.html
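
For anyone who wants to check a system by hand, here is a minimal sketch of
the same probe that the configure test in this patch performs. The helper
name combining_accent_status is hypothetical and not part of the patch; the
fallback declaration of wcwidth is an assumption for headers that do not
declare it, and mirrors what the m4 test does.

```c
/* Request the XSI declarations (including wcwidth) from <wchar.h>.  */
#ifndef _XOPEN_SOURCE
# define _XOPEN_SOURCE 700
#endif

#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <wchar.h>

/* Fallback declaration in case <wchar.h> did not provide one (assumption:
   the system's wcwidth takes a wchar_t, as POSIX specifies).  */
int wcwidth (wchar_t);

/* Hypothetical helper, not part of the patch.
   Switches the process locale to LOCALE_NAME and probes wcwidth with
   U+0301 COMBINING ACUTE ACCENT.
   Returns -1 if the locale is not available,
            1 if wcwidth reports a positive width (the behaviour the
              patch treats as broken in UTF-8 locales),
            0 otherwise.  */
int
combining_accent_status (const char *locale_name)
{
  assert (locale_name != NULL);
  if (setlocale (LC_ALL, locale_name) == NULL)
    return -1;
  return wcwidth ((wchar_t) 0x0301) > 0 ? 1 : 0;
}
```

Calling combining_accent_status ("fr_FR.UTF-8") uses the same locale as the
configure test. Note that the helper changes the locale of the whole
process and does not restore it.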


2007-07-07  Bruno Haible  <address@hidden>

        Work around MacOS X wcwidth() bug.
        * m4/wcwidth.m4 (gl_FUNC_WCWIDTH): Test against MacOS X 10.3 bug.
        * lib/wcwidth.c: Include localcharset.h, streq.h, uniwidth.h.
        (rpl_wcwidth): Special-case the UTF-8 locales. Fall back to the
        original wcwidth in non-UTF-8 locales.
        * modules/wcwidth (Depends-on): Add localcharset, streq,
        uniwidth/width.
        * doc/functions/wcwidth.texi: Update.

*** doc/functions/wcwidth.texi  1 May 2007 15:11:39 -0000       1.1
--- doc/functions/wcwidth.texi  7 Jul 2007 21:32:50 -0000
***************
*** 11,24 ****
  @item
  This function is missing on some platforms:
  Solaris 2.5.1, mingw, BeOS.
- @end itemize
- 
- Portability problems not fixed by Gnulib:
- @itemize
  @item
  This function handles combining characters in UTF-8 locales incorrectly on some
  platforms:
  MacOS X 10.3.
  @item
  On Windows platforms, @code{wchar_t} is a 16-bit type and therefore cannot
  accommodate all Unicode characters.
--- 11,24 ----
  @item
  This function is missing on some platforms:
  Solaris 2.5.1, mingw, BeOS.
  @item
  This function handles combining characters in UTF-8 locales incorrectly on some
  platforms:
  MacOS X 10.3.
+ @end itemize
+ 
+ Portability problems not fixed by Gnulib:
+ @itemize
  @item
  On Windows platforms, @code{wchar_t} is a 16-bit type and therefore cannot
  accommodate all Unicode characters.
*** lib/wcwidth.c       7 Jul 2007 20:59:43 -0000       1.2
--- lib/wcwidth.c       7 Jul 2007 21:32:50 -0000
***************
*** 23,30 ****
  /* Get iswprint.  */
  #include <wctype.h>
  
  int
  rpl_wcwidth (wchar_t wc)
  {
!   return wc == 0 ? 0 : iswprint (wc) ? 1 : -1;
  }
--- 23,52 ----
  /* Get iswprint.  */
  #include <wctype.h>
  
+ #include "localcharset.h"
+ #include "streq.h"
+ #include "uniwidth.h"
+ 
+ #undef wcwidth
+ 
  int
  rpl_wcwidth (wchar_t wc)
  {
!   /* In UTF-8 locales, use a Unicode aware width function.  */
!   const char *encoding = locale_charset ();
!   if (STREQ (encoding, "UTF-8", 'U', 'T', 'F', '-', '8', 0, 0, 0, 0))
!     {
!       /* We assume that in a UTF-8 locale, a wide character is the same as a
!        Unicode character.  */
!       return uc_width (wc, encoding);
!     }
!   else
!     {
!       /* Otherwise, fall back to the system's wcwidth function.  */
! #if HAVE_WCWIDTH
!       return wcwidth (wc);
! #else
!       return wc == 0 ? 0 : iswprint (wc) ? 1 : -1;
! #endif
!     }
  }
*** m4/wcwidth.m4       7 Jul 2007 20:59:43 -0000       1.12
--- m4/wcwidth.m4       7 Jul 2007 21:32:50 -0000
***************
*** 1,4 ****
! # wcwidth.m4 serial 10
  dnl Copyright (C) 2006, 2007 Free Software Foundation, Inc.
  dnl This file is free software; the Free Software Foundation
  dnl gives unlimited permission to copy and/or distribute it,
--- 1,4 ----
! # wcwidth.m4 serial 11
  dnl Copyright (C) 2006, 2007 Free Software Foundation, Inc.
  dnl This file is free software; the Free Software Foundation
  dnl gives unlimited permission to copy and/or distribute it,
***************
*** 35,40 ****
--- 35,78 ----
  
    if test $ac_cv_func_wcwidth = no; then
      REPLACE_WCWIDTH=1
+   else
+     dnl On MacOS X 10.3, wcwidth(0x0301) (COMBINING ACUTE ACCENT) returns 1.
+     dnl This leads to bugs in 'ls' (coreutils).
+     AC_CACHE_CHECK([whether wcwidth works reasonably in UTF-8 locales],
+       [gl_cv_func_wcwidth_works],
+       [
+         AC_TRY_RUN([
+ #include <locale.h>
+ /* AIX 3.2.5 declares wcwidth in <string.h>. */
+ #include <string.h>
+ /* Tru64 with Desktop Toolkit C has a bug: <stdio.h> must be included before
+    <wchar.h>.
+    BSD/OS 4.0.1 has a bug: <stddef.h>, <stdio.h> and <time.h> must be included
+    before <wchar.h>.  */
+ #include <stddef.h>
+ #include <stdio.h>
+ #include <time.h>
+ #include <wchar.h>
+ #if !HAVE_DECL_WCWIDTH
+ extern
+ # ifdef __cplusplus
+ "C"
+ # endif
+ int wcwidth (int);
+ #endif
+ int main ()
+ {
+   if (setlocale (LC_ALL, "fr_FR.UTF-8") != NULL)
+     if (wcwidth (0x0301) > 0)
+       return 1;
+   return 0;
+ }], [gl_cv_func_wcwidth_works=yes], [gl_cv_func_wcwidth_works=no],
+           [gl_cv_func_wcwidth_works="guessing no"])
+       ])
+     case "$gl_cv_func_wcwidth_works" in
+       *yes) ;;
+       *no) REPLACE_WCWIDTH=1 ;;
+     esac
    fi
    if test $REPLACE_WCWIDTH = 1; then
      AC_LIBOBJ([wcwidth])
*** modules/wcwidth     7 Jul 2007 20:59:44 -0000       1.8
--- modules/wcwidth     7 Jul 2007 21:32:51 -0000
***************
*** 10,15 ****
--- 10,18 ----
  Depends-on:
  wchar
  wctype
+ localcharset
+ streq
+ uniwidth/width
  
  configure.ac:
  gl_FUNC_WCWIDTH




