bug-bash


BUG? RFE? printf lacking unicode support in multiple areas


From: Linda Walsh
Subject: BUG? RFE? printf lacking unicode support in multiple areas
Date: Fri, 20 May 2011 00:31:31 -0700
User-agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.24) Thunderbird/2.0.0.24 Mnenhy/0.7.6.666


It appears that printf in bash lacks Unicode support
in a couple of ways:

1) It doesn't handle the \uXXXX and \UXXXXXXXX escape sequences
in the format string (16- and 32-bit Unicode values).

2) It doesn't handle the "%lc" conversion to print out wide
characters.  To demonstrate this I created a wide char for a
double exclamation mark U+203C, using a=$'0x3c\0x20' and then
tried to print "$a".


From the list of supported formats, %lc should be valid
as in the sprintf function:

    c    If no l modifier is present, the int argument is converted
         to an unsigned char, and the resulting character is written.
         If an l modifier is present, the wint_t (wide character)
         argument is converted to a multibyte sequence by a call to
         the wcrtomb(3) function, with a conversion state starting in
         the initial state, and the resulting multibyte string is
         written.


The GNU version of printf handles the \uXXXX and \UXXXXXXXX
escapes, but doesn't appear to handle the "%lc" format specifier.

I.e. /usr/bin/printf "\u203c" will print out the double exclamation mark
on a tty that is using a font with it defined (like "Lucida Console").
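
For comparison, here is a minimal sketch of a workaround that does not
depend on \u support at all: it writes U+203C as its UTF-8 byte sequence
(E2 80 BC) using octal escapes, which printf accepts in the format
string. The function name double_bang is only illustrative, and the
sketch assumes the terminal is using a UTF-8 locale.

```shell
#!/bin/sh
# Hypothetical helper: emit U+203C (DOUBLE EXCLAMATION MARK) as raw UTF-8.
# U+203C encodes to the bytes E2 80 BC, i.e. octal 342 200 274.
# Octal escapes in the format string avoid relying on \uXXXX handling.
double_bang() {
    printf '\342\200\274'
}

# Print the character followed by a newline.
double_bang
printf '\n'
```

Whether the glyph is actually displayed still depends on the tty's font,
as noted above.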

It's not horribly vital but I noticed it wasn't supported when looking at character support in filenames...





