bug-ncurses

Re: Which string capabilities need script interpretation?


From: Thomas Dickey
Subject: Re: Which string capabilities need script interpretation?
Date: Mon, 24 Aug 2020 16:39:21 -0400
User-agent: NeoMutt/20170113 (1.7.2)

On Mon, Aug 24, 2020 at 12:53:22PM +0200, Florian Weimer wrote:
> * Thomas Dickey:
> 
> > On Sun, Aug 23, 2020 at 09:33:59PM +0200, Florian Weimer wrote:
> >> * Thomas Dickey:
> >> 
> >> > On Sun, Aug 23, 2020 at 04:40:18PM +0200, Florian Weimer wrote:
> >> >> I'm trying to figure which string capabilities in terminfo files need
> >> >> to run through the script interpreter during output.  Is this
> >> >> information available somewhere?  Type information for the script
> >> >> parameters would also be nice.
> >> >
> >> > When you say "script", I'm thinking of the command-line program
> >> > by that name.  It's not an interpreter -- it simply records the
> >> > characters sent to the terminal.
> >> 
> >> I mean the string capabilities with printf-style % directives in them,
> >> sorry.  In my world, these things look like small scripting languages.
> >
> > there's simple expressions (arithmetic, logic, formatting),
> > but no loops.  Long ago, I heard the saying that any interesting
> > program has one I/O statement, one loop and one bug.  So terminfo
> > is missing the second requirement...
> >
> > man 5 terminfo has this section summarizing the operations:
> >
> >    Parameterized Strings
> 
> Yes, that was quite helpful when writing my parser and interpreter.
> 
> >> > In terminfo, you have literal strings with some features
> >> > added (parameter-substitution, simple expressions and padding).
> >> > The tparm function takes those strings along with the actual parameters,
> >> > and generates a string that (still containing padding)
> >> > can be sent to the terminal using tputs.
> >> 
> >> And I'm wondering if it is possible, based on the capability as such,
> >> to tell whether it takes parameters, and what their types are.  It
> >> helps with adding consistency checks.
> >
> > tic's checking tries to determine if the capability uses parameters,
> > and if they're the proper types.  That works well for the predefined
> > capabilities (though as usual, there may be some expression that the
> > analyzer doesn't get right).
> 
> I see, I now see expected_params and line_capability in the tic
> sources.
> 
> Still it doesn't tell us the types (string vs integer), and whether
> there are no parameters.

If there's no %s, the parameters have to be numeric.
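
As a rough illustration of that rule, the sketch below scans a capability
string and guesses each parameter's type.  It is only a heuristic, not what
tic does: it assumes a parameter is a string when its %pN push is followed
directly by %s or %l, and numeric otherwise.  The function name
infer_param_types is made up for this example, and %s with flags or a width
is not handled.

```python
import re

def infer_param_types(cap):
    """Guess parameter types for a terminfo string capability.

    Heuristic only: a parameter counts as a string when the directive
    right after its %pN push is %s or %l; otherwise it is a number.
    """
    types = {}
    # Match %% first so an escaped percent sign is never read as a directive.
    for m in re.finditer(r"%%|%p([1-9])(%[sl])?", cap):
        if m.group(1) is None:
            continue  # this match was the literal %%
        n = int(m.group(1))
        kind = "string" if m.group(2) else "number"
        # Once a parameter has been seen as a string, keep it a string.
        if types.get(n) != "string":
            types[n] = kind
    return types

# hpterm pfkey: parameter 1 is numeric, parameter 2 is a string
print(infer_param_types(r"\E&f%p1%dk%p2%l%dL%p2%s"))
# a vt100-style cup: both parameters are numeric
print(infer_param_types(r"\E[%i%p1%d;%p2%dH"))
```

A capability with no %pN at all yields an empty dictionary, which is as close
as this sketch gets to "takes no parameters"; termcap-style strings that rely
on implicit parameter pushing would be misreported.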

Numeric parameters have (as you see) some quirks since a lot of
the terminfo was translated from termcap.
 
> To illustrate the challenges I see, here are a few examples.
> 
> rtpc has this:
> 
>   tsl=\Ej\EY@%+ \Eo,
> But tic lists tsl as just taking one parameter, so %+ cannot work in
> this context even with implicit parameter pushing.

hmm - yes (zero plus anything doesn't change much).

When I have some usable documentation (and notice a problem),
I do try to repair these.

For this

        rtpc|ibmapa16|IBM 6155 Extended Monochrome Graphics Display,

I'd look here

        http://www.bitsavers.org/pdf/ibm/pc/rt/

(but I don't see anything promising)

That was (in 1995)

rtpc|ibmapa16|ibm6155|IBM 6155 Extended Monochrome Graphics Display, 
        lines#32, 
        dsl=\Ej\EY@ \EI\Ek, tsl=\Ej\EY@%+ \Eo, use=ibmmono,

However, my copy of the AIX 3 and 4 terminfo has no matching string
for an IBM terminal.  Interestingly, I see a similar string in the h19:

        tsl=\Ej\Ex5\EY8%p1%' '%+%c\Eo\Eo,

(looking at the beginning and ending of it).  I suspect something like this
was meant:

        tsl=\Ej\EY@%p1%' '%+%c\Eo,

but without knowing whether that's the whole story, I can't be sure.

> aaa+rv has this:
> 
>    sgr=\E[%?%p2%t4;%;%?%p4%t5;%;%?%p6%t1;%;%?%p1%p2%|%p3%!%t7;
>      %;%?%p7%t8;%;m\016,

actually it looks okay to me:

        sgr=\E[
                %?
                        %p2
                        %t4;
                %;
                %?
                        %p4
                        %t5;
                %;
                %?
                        %p6
                        %t1;
                %;
                %?
                        %p1%p2%|%p3%!
                        %t7;
                %;
                %?
                        %p7
                        %t8;
                %;
                m\016,
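
A layout like the one above can be produced mechanically.  The sketch below
is a simplified stand-in written for this example (the name
format_conditionals is hypothetical): it starts a new line at every %?, %t,
%e and %;, and indents whatever sits between %? and its matching %;.  It
assumes the string contains no %% immediately followed by one of those
letters.

```python
import re

def format_conditionals(cap, indent="    "):
    """Break a capability at %? %t %e %; and indent conditional bodies."""
    lines = []
    depth = 0
    # Zero-width split: each piece begins at one of the conditional markers.
    for piece in re.split(r"(?=%[?te;])", cap):
        if not piece:
            continue
        if piece.startswith("%?"):
            lines.append(indent * depth + "%?")
            depth += 1
            rest = piece[2:]
        elif piece.startswith("%;"):
            depth = max(depth - 1, 0)
            lines.append(indent * depth + "%;")
            rest = piece[2:]
        else:
            rest = piece
        if rest:
            lines.append(indent * depth + rest)
    return "\n".join(lines)

print(format_conditionals(r"\E[%?%p2%t4;%;%?%p7%t8;%;m\016"))
```

Splitting on empty matches needs Python 3.7 or later.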

http://www.bitsavers.org/pdf/annArborTerminals/Ann_Arbor_Ambassador/

> 
> I think this does not result in a balanced stack.  These in
> ndr9500 have this problem, too:
> 
>   pfloc=\E|%{48}%p1%+%c2%p2\031,
>   pfx=\E|%{48}%p1%+%c1%p2\031,

These are almost the same.  It's adding ASCII "0" (48) to the parameter,
and printing the result as a character.  That "2" or "1" after the "%c"
seems to be a literal value.
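
To make that arithmetic concrete, here is a minimal sketch of an evaluator
for a handful of directives (%%, %pN, %{n}, %'c', %+, %c, %d, %i).  It is
not ncurses' tparm: the function name expand is invented for this example,
conditionals, variables and padding are left out, and the \E and \031
escapes are assumed to have been decoded to real characters already.

```python
def expand(cap, *params):
    """Expand a small subset of terminfo directives.

    Returns the output text and whatever is left on the stack afterwards.
    """
    args = list(params)
    stack = []
    out = []
    i = 0
    while i < len(cap):
        if cap[i] != "%":
            out.append(cap[i])      # literal character, copied as-is
            i += 1
            continue
        op = cap[i + 1]
        i += 2
        if op == "%":
            out.append("%")
        elif op == "p":             # %pN: push parameter N
            stack.append(args[int(cap[i]) - 1])
            i += 1
        elif op == "{":             # %{n}: push an integer constant
            end = cap.index("}", i)
            stack.append(int(cap[i:end]))
            i = end + 1
        elif op == "'":             # %'c': push a character constant
            stack.append(ord(cap[i]))
            i += 2                  # skip the character and closing quote
        elif op == "+":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "c":             # pop and print as a character
            out.append(chr(stack.pop()))
        elif op == "d":             # pop and print as decimal
            out.append(str(stack.pop()))
        elif op == "i":             # add 1 to the first two parameters
            args[:2] = [a + 1 for a in args[:2]]
        else:
            raise ValueError("unsupported directive %" + op)
    return "".join(out), stack

# ndr9500 pfloc with parameters 3 and 7
pfloc = "\x1b|%{48}%p1%+%c2%p2\x19"
print(expand(pfloc, 3, 7))
```

With parameters 3 and 7 the text is ESC, "|", "3" (48 + 3 printed as a
character), the literal "2", then \031; the second parameter is pushed by
%p2 but never printed, so it is still on the stack at the end.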

Its sgr looks kind of ugly (I'd have to ask whether that "%%%%", becoming
"%%" in output, is translated properly):

        sgr=\EG0\E%%%%\E(
                %?
                        %p1%p5%p8%|%|
                        %t\E)
                %;
                %?
                        %p9
                        %t\E$
                %;,

(but documentation is lacking...)
   
> And ti916:
> 
>   cup=\E[%p1%i%p1%d;%p2%dH,

That "%p1%i" looks misplaced.

Interestingly, this mentions ti916

http://osr600doc.sco.com/en/man/html.M/terminals.M.html

but my copy of SCO's terminfo lacks it.

That came from this change (July 1996):

# 9.13.11:
#       * Added t916 entry, translated from a termcap in SCO's support area.

But the full header gives a clue:

ti916|ti916-220-7|Texas Instruments 916 VDT 8859/1 vt220 mode 7 bit CTRL,

and infocmp suggests a change (eliminate the first "%p1"):

        cup: '\E[%p1%i%p1%d;%p2%dH', '\E[%i%p1%d;%p2%dH'.

(fix applied now)

> tek4107 uses %! throughout, but I think it's actually to be sent
> verbatim.  sgr0 is particularly clear in this regard, I think:
> 
>   sgr0=\E%!1\E[m$<2>\E%!0,
> 
> sgr0 does not take any parameters, but may need to access variables,
> as can be seen in wy350:
> 
>   sgr0=\EG0\E(\EH\003%{0}%PA%{0}%PC,
> 
> So I think %! in sgr0 needs to be quoted.

I see - perhaps another check in tic, to warn about quoting that may be
needed for output-strings that should use escaping -- along with making
tputs remove the escapes.  The existing codes (including Solaris) kind
of assume that strings that don't take parameters aren't processed with
tparm, making them ready for use in tputs.
 
> I do not really know what is wrong with this icl6404 entry:
> 
>   csr=\E!%+%p1%{32}%+%p2%{32},

nor I, offhand - it's unchanged from ESR in 1998.
bitsavers has no relevant manuals.

This looks promising

https://invisible-mirror.net/archives/shuford/terminal//icl_terminals_news.txt

but comparing with the other stuff, I'd expect to see some final-character.
It is documented though:

        ESC ! p1 p2     define scroll region:
                        p1 = scroll top    line:  20h - 37h
                        p2 = scroll bottom line:  20h - 37h

This seems to be the intended string:

    csr=\E!%p1%{32}%+%c%p2%{32}%+%c,

so the "%+" was misplaced, and there was no output (fix applied now)
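
The difference between the old and new csr strings shows up in a simple
stack-depth walk.  The sketch below is an illustration only, not the check
that tic performs: the name check_stack is invented here, conditional
branches are walked linearly, formatted output such as %2d is not
recognized, and an operator that finds too few operands is recorded and
then treated as if zeros had been supplied.

```python
import re

# One directive per match: %%, %pN, %{n}, %'c', %Px / %gx, or a single char.
TOKEN = re.compile(r"%(%|p[1-9]|\{\d+\}|'.'|[Pg][A-Za-z]|.)", re.S)
BINARY = set("+-*/m&|^=<>AO")   # pop two operands, push one result
UNARY = set("!~l")              # pop one operand, push one result
POP_ONE = set("cdsoxXt")        # pop one operand (output, or the %t test)

def check_stack(cap):
    """Return (underflows, final_depth) for a capability string."""
    depth = 0
    underflows = []
    for m in TOKEN.finditer(cap):
        tok = m.group(1)
        if tok == "%":
            continue
        if len(tok) > 1 and tok[0] in "p{'g":
            need, net = 0, 1
        elif len(tok) > 1 and tok[0] == "P":
            need, net = 1, -1
        elif tok in BINARY:
            need, net = 2, -1
        elif tok in UNARY:
            need, net = 1, 0
        elif tok in POP_ONE:
            need, net = 1, -1
        else:
            need, net = 0, 0    # %i, %?, %e, %; and anything unrecognized
        if depth < need:
            underflows.append((m.start(), "%" + tok))
            depth = need        # pretend the missing operands were zeros
        depth += net
    return underflows, depth

print(check_stack(r"\E!%+%p1%{32}%+%p2%{32}"))      # old icl6404 csr
print(check_stack(r"\E!%p1%{32}%+%c%p2%{32}%+%c"))  # corrected csr
```

The old string reports an underflow at its first %+ and leaves values on the
stack without printing any of them; the corrected string reports no
underflow and ends with an empty stack.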

> > For user-defined capabilities (other than the ones that ncurses defines
> > in its database as extensions -- I added a table of those "recently"),
> > that won't work :-)
> 
> New user-defined capabilities could perhaps encode the expected
> parameter types in their names (similar to C++ name mangling).

ah.... I could digress
 
> > well... hardware terminals wouldn't do that instantaneously,
> > but there's no data being buffered or blocked -- so it wouldn't
> > have been a big problem.
> 
> There are instances of sgr0 that also have a delay in them, and it's
> not flagged with P, either.

well, there's documentation and there's reality :-)

Running tic's checking option, I've about 10 thousand lines of listing.

Occasionally (as in icl6404) I've an opportunity to improve something
just based on documentation, but more often it's because I find a new check
to add, to highlight inconsistencies that show errors to investigate/fix.
 
> > $<...> are padding (time-delays)
> >
> > and yes, since strings are fed into tparm as data, those could
> > be a nuisance.  Since tputs/putp are supposed to accept either
> > the result from tparm/tiparm _or_ a string from the terminal
> > database, it would be in-scope to improve the %s part by escaping
> > the dollar-signs (and backslash-characters, of course).
> 
> I'm wondering if it would be okay in an alternative implementation to
> only honor $<…> that are already part of the terminfo entry.  Basically
> recognize it at the same time as the % directives, and ignore whatever
> strings come in via %s (or other constructs), or strings that are
> written directly.

At the moment I'm curious if making tputs handle some escapes would be enough.
(To see, I'll have to add some checks to tic, to see how many exceptions to
the rule would pop up).
 
> >> This would be a problem for a “set title” capability, if such a thing
> >> existed.  (It looks like for hpterm, “pfkey” is used in this sense.)
> >
> > The "hpterm" description has this:
> >
> > pfkey=\E&f%p1%dk%p2%l%dL%p2%s,
> >
> > which is (tersely) documented like this:
> >
> >        pkey_key                      pfkey      pk        program function key
> >                                                           #1 to type string #2
> > pfkey=\E&f%p1%dk%p2%l%dL%p2%s,
> >           ^#1   ^#2     ^#2
> >
> > That \E is ASCII escape, which I've changed in a sample output to "\e"
> >
> > \e&f5k5Lhello
> >
> > otherwise generated using
> >
> >     tput -T hpterm pfkey 5 hello
> 
> For context, I was going with this:
> 
>   <https://tldp.org/HOWTO/Xterm-Title-6.html#ss6.5>

fwiw, I've had no influence on that :-)
 
> Which suggests to use this very sequence to change the terminal title.
> 
> > tput uses tparm and tputs, which use the first parameter for the "5"
> > that appears first (using %p1 and %d), and the second parameter twice
> > (%p2 with %l and %d, for a length, and %p2 with %s for a string).
> 
> But %p2%l will produce incorrect results if %p2 contains the string
> $<…> and padding is processed because the $<…> is not written to the
> terminal, so the character count produced by %l is too high.

yes... but no one's brought up the issue before, so there's no previous
attempt to solve the problem.
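
The mismatch is easy to reproduce outside of ncurses.  The sketch below is
an approximation written for this example: pfkey_hpterm and strip_padding
are hypothetical helpers that mimic the hpterm pfkey string and the padding
removal done on output, and the padding pattern assumes the documented
$<number> form with optional "*" and "/" flags.

```python
import re

# $<5>, $<2.5>, $<10*>, $<3/> ... as described in terminfo(5).
PADDING = re.compile(r"\$<\d+(?:\.\d)?[*/]{0,2}>")

def pfkey_hpterm(key, text):
    """Mimic pfkey=\\E&f%p1%dk%p2%l%dL%p2%s for hpterm."""
    return "\x1b&f%dk%dL%s" % (key, len(text), text)

def strip_padding(s):
    """Drop padding specifications, as output with padding handling would."""
    return PADDING.sub("", s)

plain = pfkey_hpterm(5, "hello")
print(repr(plain))                      # length 5, five characters follow

tricky = pfkey_hpterm(5, "hi$<5>")
print(repr(tricky))                     # %l counted all six characters
print(repr(strip_padding(tricky)))      # but only "hi" reaches the terminal
```

In the second case the terminal is told to expect six characters and
receives two, so it would take the next four characters of unrelated output
as part of the key definition.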

(of course, developers who format with tparm and output with printf, or
who format with sprintf and ignore tputs, present an unsolvable problem)

-- 
Thomas E. Dickey <dickey@invisible-island.net>
https://invisible-island.net
ftp://ftp.invisible-island.net
