BUG? RFE? printf lacking unicode support in multiple areas
From: Linda Walsh
Subject: BUG? RFE? printf lacking unicode support in multiple areas
Date: Fri, 20 May 2011 00:31:31 -0700
User-agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.24) Thunderbird/2.0.0.24 Mnenhy/0.7.6.666

It appears that printf in bash lacks Unicode support in a couple
of ways:
1) It doesn't handle the \uXXXX and \UXXXXXXXX escape sequences
in the format string (16- and 32-bit Unicode values).
2) It doesn't handle the "%lc" conversion to print out wide
characters. To demonstrate this I created a wide char for a
double exclamation mark U+203C, using a=$'0x3c\0x20' and then
tried to print "$a".
From the list of supported formats, %lc should be valid
as in the sprintf function:
    c      If no l modifier is present, the int argument is converted to an
           unsigned char, and the resulting character is written. If an l
           modifier is present, the wint_t (wide character) argument is
           converted to a multibyte sequence by a call to the wcrtomb(3)
           function, with a conversion state starting in the initial state,
           and the resulting multibyte string is written.

The GNU version of printf handles the \uXXXX and \UXXXXXXXX
escapes, but doesn't appear to handle the "%lc" format specifier.
I.e. /usr/bin/printf "\u203c" will print out the double exclamation mark
on a tty that is using a font with it defined (like "Lucida Console").
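
For comparison, here is a rough sketch of what "\u203c" amounts to in a
UTF-8 locale, written with octal escapes only so it doesn't depend on
\u support in whichever printf is used. The function name utf8_encode
is made up for this illustration (it isn't part of bash or coreutils),
and it assumes the argument is a valid code point given as hex digits:

```shell
# Hypothetical helper: encode one Unicode code point (hex digits, no
# prefix) as UTF-8 bytes, using only printf's portable \ooo escapes.
utf8_encode() {
    cp=$((0x$1))
    if [ "$cp" -lt 128 ]; then
        # 1 byte: 0xxxxxxx
        fmt=$(printf '\\%03o' "$cp")
    elif [ "$cp" -lt 2048 ]; then
        # 2 bytes: 110xxxxx 10xxxxxx
        fmt=$(printf '\\%03o\\%03o' \
            $((0xC0 | (cp >> 6))) \
            $((0x80 | (cp & 0x3F))))
    elif [ "$cp" -lt 65536 ]; then
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        fmt=$(printf '\\%03o\\%03o\\%03o' \
            $((0xE0 | (cp >> 12))) \
            $((0x80 | ((cp >> 6) & 0x3F))) \
            $((0x80 | (cp & 0x3F))))
    else
        # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        fmt=$(printf '\\%03o\\%03o\\%03o\\%03o' \
            $((0xF0 | (cp >> 18))) \
            $((0x80 | ((cp >> 12) & 0x3F))) \
            $((0x80 | ((cp >> 6) & 0x3F))) \
            $((0x80 | (cp & 0x3F))))
    fi
    # fmt now holds only octal escapes, e.g. \342\200\274 for 203c
    printf "$fmt"
}

# U+203C (double exclamation mark) becomes the bytes e2 80 bc
utf8_encode 203c | od -An -tx1
```

This only covers the escape-sequence half of the report; it says
nothing about "%lc", which needs wide-character handling inside
printf itself.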
It's not horribly vital but I noticed it wasn't supported when looking
at character support in filenames...