Re: bug#32236: df header corrupted with LANG=zh_TW.UTF-8 on macOS

From: Bruno Haible
Subject: Re: bug#32236: df header corrupted with LANG=zh_TW.UTF-8 on macOS
Date: Fri, 27 Jul 2018 11:38:22 +0200
User-agent: KMail/5.1.3 (Linux/4.4.0-130-generic; KDE/5.18.0; x86_64; ; )

Paul Eggert wrote:
> my earlier patch 
> neglected the possibility that mbrtowc can return 0

I wouldn't see this as a bug: you can assume that mbrtowc returns
0 if and only if the multibyte sequence is a NUL byte, and you had
chosen srcend in such a way that this could not happen in the loop.
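
To illustrate the point about the return value, here is a minimal sketch of
one decoding step that handles the 0 case explicitly. The helper name
decode_one and its error policy are hypothetical and are not taken from the
patch under discussion; only mbrtowc itself is a real interface.

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not from the patch: decode one character from
   [src, srcend) into *pwc and return the number of bytes consumed.
   Always consumes at least one byte, so a caller's loop makes progress.  */
static size_t
decode_one (const char *src, const char *srcend, wchar_t *pwc, mbstate_t *ps)
{
  size_t n = mbrtowc (pwc, src, (size_t) (srcend - src), ps);

  if (n == 0)
    /* mbrtowc decoded a NUL byte; it occupies exactly one byte.  */
    return 1;

  if (n == (size_t) -1 || n == (size_t) -2)
    {
      /* Invalid or incomplete sequence: reset the state, substitute '?',
         and skip one byte (an assumed policy for this sketch).  */
      memset (ps, 0, sizeof *ps);
      *pwc = L'?';
      return 1;
    }

  return n;
}
```

If srcend is chosen so that no NUL byte lies inside [src, srcend), the
n == 0 branch is never taken, which is the situation described above.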

> and it incorrectly assumed 
> wide control characters always have a single-byte representation.

Oops, you're right. My mistake as well.

The new patch looks good.

This will catch (and replace with '?') U+2028 and U+2029 on glibc systems.
On macOS, it will not do this, because iswcntrl(0x2028) and iswcntrl(0x2029)
both return 0 on this system; this is consistent with the fact that the
'Terminal' program displays these characters as simple spaces. So, there is
no need to override iswcntrl on macOS.
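
As a rough sketch of the replacement behavior described above (not the code
from the patch), the following hypothetical helper replaces every wide
character that iswcntrl classifies as a control character with '?'. The
function name replace_wide_controls is an assumption made for illustration.

```c
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper, not from the patch: replace each wide character
   that the current locale classifies as a control character with L'?'.
   Returns the number of characters replaced.  */
static size_t
replace_wide_controls (wchar_t *ws)
{
  size_t replaced = 0;

  for (; *ws != L'\0'; ws++)
    if (iswcntrl ((wint_t) *ws))
      {
        *ws = L'?';
        replaced++;
      }

  return replaced;
}
```

The result depends on the locale's character classification: after
setlocale (LC_ALL, "") in a UTF-8 locale, U+2028 and U+2029 would be
replaced on a glibc system but left alone on macOS 10.13, as noted above.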


2018-07-27  Bruno Haible  <address@hidden>

        iswcntrl: Mention minor problem on macOS.
        * doc/posix-functions/iswcntrl.texi: Mention oddity on macOS.

diff --git a/doc/posix-functions/iswcntrl.texi b/doc/posix-functions/iswcntrl.texi
index 99eaa0e..44dd034 100644
--- a/doc/posix-functions/iswcntrl.texi
+++ b/doc/posix-functions/iswcntrl.texi
@@ -25,4 +25,8 @@ Portability problems not fixed by Gnulib:
 On AIX and Windows platforms, @code{wchar_t} is a 16-bit type and therefore cannot
 accommodate all Unicode characters.
+This function returns 0 for U+2028 (LINE SEPARATOR) and
+U+2029 (PARAGRAPH SEPARATOR) on some platforms:
+Mac OS X 10.13.
 @end itemize
