bug-bash

Unicode/UTF-8 support in bash and readline


From: Markus Kuhn
Subject: Unicode/UTF-8 support in bash and readline
Date: Mon, 15 Jan 2001 11:07:31 +0000

Dear bash/readline developer,

One of the recent exciting developments on POSIX platforms (in
particular Linux and XFree86) is that the Unicode (ISO 10646) character
set in the UTF-8 encoding is becoming more and more widely supported at
all levels. There is now a realistic hope that UTF-8 will have mostly
replaced legacy character encodings (ISO 8859-*, JIS X0208, etc.) on
GNU/Linux systems within the next 24 months.

I would like to invite you to join our group of people dedicated to
enabling UTF-8 support in open source software, because UTF-8 support in
bash is widely seen as a critical milestone before ubiquitous UTF-8
usage becomes feasible.

Most affected in bash is probably the readline editor, because under
UTF-8 the assumption that 1 byte = 1 character = 1 terminal column is no
longer valid, and C library functions such as mbrtowc() and wcwidth()
should be used to find out how many columns a string occupies.
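
As a rough illustration of that point, the following minimal sketch
decodes a multi-byte string with mbrtowc() and sums the column widths
reported by wcwidth(). The helper name display_width() is hypothetical
and is not taken from bash or readline; the sketch also assumes that the
caller has already selected the locale, e.g. with
setlocale(LC_CTYPE, "").

```c
#define _XOPEN_SOURCE 700   /* needed on glibc for the wcwidth() prototype */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not part of bash or readline): return the number
 * of terminal columns occupied by the multi-byte string s in the current
 * locale, or -1 if s contains an invalid or incomplete sequence or a
 * character for which wcwidth() reports no width (e.g. a control char).
 */
static int display_width(const char *s)
{
    mbstate_t state;
    size_t remaining;
    int columns = 0;

    assert(s != NULL);
    memset(&state, 0, sizeof state);   /* start in the initial shift state */
    remaining = strlen(s);

    while (remaining > 0) {
        wchar_t wc;
        int w;
        size_t n = mbrtowc(&wc, s, remaining, &state);

        if (n == (size_t)-1 || n == (size_t)-2)
            return -1;                 /* invalid or truncated sequence */
        if (n == 0)
            break;                     /* embedded NUL; not expected here */

        w = wcwidth(wc);
        if (w < 0)
            return -1;                 /* non-printable character */

        columns += w;                  /* 0, 1 or 2 columns per character */
        s += n;
        remaining -= n;
    }
    return columns;
}
```

A line editor would use such a count instead of strlen() when
positioning the cursor. The result depends on the active locale: under a
UTF-8 locale a wide CJK character is typically counted as two columns
and a combining character as zero.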

If you are interested, you can best get started by reading the "UTF-8
and Unicode FAQ for Unix/Linux" on

  http://www.cl.cam.ac.uk/~mgk25/unicode.html

In particular, you should install and familiarize yourself with the new
UTF-8 extensions of xterm and the UTF-8 multi-byte locale support
provided by glibc 2.2.

You might also want to join the linux-utf8@nl.linux.org mailing list,
where people actively working on UTF-8/Unicode support in various Unix
systems exchange know-how and experiences. To subscribe, send a message
with the line "subscribe linux-utf8" in the body to
majordomo@nl.linux.org.

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>



