Re: term, utf-8 and cooked mode, combining characters

From: Marcus Brinkmann
Subject: Re: term, utf-8 and cooked mode, combining characters
Date: Wed, 18 Sep 2002 10:08:28 +0200
User-agent: Mutt/1.4i


Please forgive me if I try to defend the libreadline idea a bit more.  I am
kinda excited about the possibility of getting all applications to use
libreadline transparently if they use cooked mode.  I might be pipe-dreaming...

On Wed, Sep 18, 2002 at 01:46:15AM -0400, Roland McGrath wrote:
> readline would need some work to have a different backend hooking into the
> term bottom half instead of the POSIX termios world it's written for.  I
> fear it's also kind of big to have in term, and I worry it might not be as
> robust in all ways we would want term to be.

It is big, but most of the code is for the configuration file parsing, key
binding handling, emacs/vi code, etc.  I was thinking about making its use
optional (term --readline or whatever) and using the old code otherwise.

With regard to readline's dependency on termios, there is really not that
much.  There is some signal handling, which you can already disable.  There
is the winsize stuff, which we can disable (we can feed it our idea of
the size) or just leave in, because the bottom handler (the console server)
does the right thing.  Then there is the code that prepares and resets the
terminal state.  This seems to need one or two hooks so that we can feed it
our idea of the control characters etc., and keep it from trying to do
anything fancy with the underlying "terminal" (which is just the console).

What I am more concerned about is organizing the input correctly.  libreadline
seems to like to do its own buffering and select loop.  I am not sure if that is
easily compatible with term, but the "alternate interface" which uses
callbacks seems to be promising.  Maybe we need one additional hook to make
it not use a file descriptor, but some more direct interface to enter input.
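
As a rough illustration of the "direct interface to enter input" idea, here
is a toy sketch of a line editor that the caller drives one character at a
time and that reports finished lines through a callback.  It is not
readline's API: in readline the alternate interface is built around
rl_callback_handler_install() and rl_callback_read_char(), and the latter
still reads from readline's own input stream.  The class name, the feed()
method and the erase character below are assumptions made for the example.

```python
from typing import Callable, List


class PushLineEditor:
    """Toy cooked-mode line editor driven by the caller, one character at a time."""

    def __init__(self, on_line: Callable[[str], None], erase: str = "\x7f") -> None:
        self.on_line = on_line  # called once per completed line
        self.erase = erase      # hypothetical stand-in for c_cc[VERASE]
        self.buffer: List[str] = []

    def feed(self, ch: str) -> None:
        """Accept one input character; no file descriptor or select loop is involved."""
        if ch == "\n":
            line = "".join(self.buffer)
            self.buffer.clear()
            self.on_line(line)
        elif ch == self.erase:
            if self.buffer:
                self.buffer.pop()
        else:
            self.buffer.append(ch)


# The caller (term, in this discussion) owns the input loop and pushes characters in.
lines: List[str] = []
editor = PushLineEditor(lines.append)
for ch in "helo\x7flo\nok\n":
    editor.feed(ch)
print(lines)
```

The point of the design is that the editor never blocks or reads anything
itself, so whoever owns the input decides when and from where characters
arrive.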

It's entirely possible I am ignorant of something very important here, like
handling control characters correctly in cooked mode.

> The real reason to use readline is for all its fancy features, i.e.
> full-featured line-editting in cooked mode as you get in bash et al.  That
> requires a lot of configuration to make it reasonable, and is also a
> distraction from what you were really interested in.

I think I am not aware of what configuration you mean.  I agree that it is
probably not reasonable to have filename tab completion by default in term's
cooked mode, but the default settings should be good enough for normal
interactive mode.

> For the multibyte issue, console already knows all about the characters.
> So it can naturally dtrt if the term functionality is built in via
> libtermserver.  That seems like the righter thing.

The console does not know about single characters written to term.  It
just converts the UTF-8 stream from the client into the local encoding via
iconv, and outputs that.  It has no idea which bytes make up a single
character in the local encoding.  In fact, I don't know how to implement
this given the iconv interface.  libreadline uses the mb* functions, which
work on the local encoding only and know it intimately.  I guess we could
make console run in this local encoding, but this will be a pain if we want
to support local encodings per virtual console (which is easy to do right
now).  (This would of course also be a problem with a libreadline using
libtermserver in the console.)
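
To make concrete why cooked mode needs to know what composes a single
character, here is a minimal sketch of an erase operation on a byte buffer.
It assumes the buffer holds UTF-8, which is a simplification: the mb*
functions mentioned above work on whatever the locale's encoding is.  The
function name is made up for the example, and combining characters are not
handled (erasing would remove only the combining mark, not its base
character).

```python
def erase_last_char(buf: bytearray) -> int:
    """Remove the last UTF-8 encoded character from buf; return the bytes removed."""
    if not buf:
        return 0
    removed = 1
    last = buf.pop()
    # Continuation bytes look like 0b10xxxxxx; keep popping until the lead byte.
    while (last & 0xC0) == 0x80 and buf:
        last = buf.pop()
        removed += 1
    return removed


line = bytearray("naïve é".encode("utf-8"))
n = erase_last_char(line)  # the trailing "é" occupies two bytes in UTF-8
print(n, line.decode("utf-8"))
```

A byte-oriented erase would have removed only one of the two bytes and left
an invalid sequence in the line buffer.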

Also, I am not sure I agree it is the right thing.  Sure, if you put term
into the console, it doesn't make any difference from the outside.  But that
doesn't solve the problem that you have if you want to use the stand-alone
term server in any locale with a multibyte encoding.  So, although that would
do the job here, it doesn't seem to be the generic solution.

> seems like the right thing in a multibyte-aware terminal is that no
> multibyte char can be a special character since there is no termios
> interface for setting a char to match one (there is just c_cc).  Otherwise
> if there are multibyte chars whose nth bytes might match a byte set in
> c_cc, a spurious signal or editting feature will be invoked.

I agree.  All sane encodings are ASCII compatible in the lower 7-bit range.
I don't even want to think about anything that isn't.
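
The concern about the nth byte of a multibyte character matching a byte set
in c_cc can be checked mechanically.  The sketch below is only an
illustration: the function name and the particular set of control bytes are
assumptions, not anything from term.  It reports which non-ASCII characters
of a string contain such a byte once encoded.  UTF-8 never produces a hit,
because every byte of a multibyte sequence has the high bit set, while
UTF-16 serves as an example of an encoding that is not ASCII compatible.

```python
def special_byte_hits(text: str, special: bytes, encoding: str) -> list:
    """Return indexes of non-ASCII characters whose encoding contains a special byte."""
    hits = []
    for index, ch in enumerate(text):
        if ord(ch) < 0x80:
            continue  # genuine ASCII input is allowed to match
        encoded = ch.encode(encoding)
        if any(b in special for b in encoded):
            hits.append(index)
    return hits


# Hypothetical c_cc contents: ^C (VINTR), ^D (VEOF), ^U (VKILL), DEL (VERASE).
SPECIAL = bytes([0x03, 0x04, 0x15, 0x7F])
sample = "Grüße, 日本語, Ελληνικά"

print(special_byte_hits(sample, SPECIAL, "utf-8"))
# U+0103 is the byte pair 0x03 0x01 in UTF-16LE, so its first byte looks like ^C.
print(special_byte_hits("\u0103", SPECIAL, "utf-16-le"))
```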

> The mention of readline along with all this makes me think of an incidental
> console feature that would be nice once console uses libtermserver.  There
> should be a hook/protocol feature by which the console server notifies
> clients of term state changes like cooked mode.  Then a client can
> e.g. present a different UI for cooked mode (ICANON) input vs raw input.
> For example, using libreadline with history records and so forth for cooked
> mode, and a more direct input interface when not in cooked mode.

Interesting, although I am not sure that would be easy to make convenient to
use in the GUI.  libreadline in term seems more natural to me, as
this will make it behave exactly like bash etc.  Programs that use readline
themselves use raw mode anyway.
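
For reference, the cooked versus raw distinction a client would be notified
about comes down to the ICANON bit in the termios local flags.  The sketch
below is a minimal illustration with hypothetical helper names.  It works on
attribute lists in the layout returned by Python's termios.tcgetattr(), and
the example values are synthetic so that no real terminal is needed.

```python
import termios

LFLAG = 3  # index of c_lflag in the list returned by termios.tcgetattr()


def input_mode(attrs: list) -> str:
    """Classify terminal attributes as cooked (ICANON set) or raw."""
    return "cooked" if attrs[LFLAG] & termios.ICANON else "raw"


def mode_changed(old_attrs: list, new_attrs: list) -> bool:
    """True if a state change crossed the cooked/raw boundary."""
    return input_mode(old_attrs) != input_mode(new_attrs)


# Synthetic attribute lists: [iflag, oflag, cflag, lflag, ispeed, ospeed, cc]
cooked = [0, 0, 0, termios.ICANON | termios.ECHO, 0, 0, []]
raw = [0, 0, 0, termios.ECHO, 0, 0, []]

print(input_mode(cooked), input_mode(raw), mode_changed(cooked, raw))
```

A client receiving such notifications could switch between a line-editing
UI and a direct input path whenever mode_changed() reports a transition.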


`Rhubarb is no Egyptian god.' GNU      http://www.gnu.org    marcus@gnu.org
Marcus Brinkmann              The Hurd http://www.gnu.org/software/hurd/
