
Re: man(7), hyphen, and minus


From: G. Branden Robinson
Subject: Re: man(7), hyphen, and minus
Date: Tue, 13 Dec 2022 14:05:19 -0600

At 2022-12-13T11:33:50-0800, Russ Allbery wrote:
> Just a quick reply on one part of this with more to come later.

Sure, no worries.

> "G. Branden Robinson" <g.branden.robinson@gmail.com> writes:
> > Oh, I know.  I've seen Pod::Man's preamble.  I think what distressed
> > me originally about it was that, like docbook-to-man, it seemed to
> > make man(7) seem like a write-only language.
> 
> This bothers me too, and I made some choices for ease of
> implementation rather than readability of output.  (I generally like
> to prioritize readability of output; my static site generator cares
> more about the readability of the HTML than any sane person probably
> should.)
> 
> The biggest loss there is that I always use font escapes (with
> elaborate workarounds for font strangeness in both Solaris nroff and
> in groff) rather than what any sane human would do, which is use .B,
> .BI, .BR, etc.  The specific problem that I have is that I was trying
> to avoid doing whole-tree transformations on the POD parse tree, so
> the transformation is done locally.  In other words, in something like
> B<< bold I<italic> >>, I first get a function invocation of cmd_i with
> text "italic", and then an invocation of cmd_b with text "bold
> <whatever I turned italic into>".  It's a bit tricky to turn that into
> "\fBbold \f(BIitalic\fR" but it doesn't require any state tracking.
> But if I transform I<italic> into ".I italic", life felt rather
> complicated and I wasn't sure if I was going to be able to figure out
> where to go from there,

That's fair, and it isn't the first time I've heard capable people
express the opinion that having a document translator produce idiomatic
man(7) font alternation macro calls rather than chains of font selection
escape sequences was Just Too Damned Hard.  If I could show people how
to do it, I might do so with a swagger, but I confess I can't cash that
check at present.
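
As a small illustration of the two styles under discussion (a sketch
with a made-up option name, not output from Pod::Man or any other
translator), here is the same text written first with font selection
escape sequences and then with a man(7) font alternation macro.  Both
should set "--output=" in bold and "file" in italics; the macro form
needs no trailing \fR because the macro resets the font itself.

```roff
.\" Hypothetical option name, for illustration only.
.\" Style 1: font selection escape sequences.
\fB\-\-output=\fIfile\fR
.br
.\" Style 2: font alternation macro (bold, then italic, no space between).
.BI \-\-output= file
```

This simple case maps cleanly because the fonts merely alternate.  The
nested B<< bold I<italic> >> case quoted above is harder, since man(7)
has no macro that sets bold-italic text.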

> particularly because there's a bunch of weird complexity about quote
> escaping required to use the macros.

Here, I know your pain.  I took it upon myself to document this shit.

I like to have parity between groff's Texinfo manual and its man pages
for the "specificationy" bits; that is, I omit gentler tutorial material
and examples from the latter.

But here, when it came time to talk about how to put backslashes and
double quotes in macro arguments in an AT&T-compatible way, I punted.

groff.7.man:
  If even that is not feasible,
  .\" Nope nope nope--if you're this much of a masochist, go read Texinfo.
  see the \[lq]Calling Macros\[rq] section of the
  .I groff
  Texinfo manual for the complex macro argument quoting rules of AT&T
  .IR troff . \" AT&T
  .\" END Keep (roughly) parallel with groff.texi node "Calling Macros".

Why?

$ info ./build/doc/groff.info | sed -n '/The foregoing raises the question/,$p' | less

   The foregoing raises the question of how to embed neutral double
quotes or backslashes in macro arguments when _those_ characters are
desired as literals.  In GNU 'troff', the special character escape
sequence '\[rs]' produces a backslash and '\[dq]' a neutral double
quote.

   In GNU 'troff''s AT&T compatibility mode, these characters remain
available as '\(rs' and '\(dq', respectively.  AT&T 'troff' did not
define these special characters, but any of its descendants can be made
to support them.  *Note Device and Font Description Files::.

   If even that is not feasible, options remain.  To obtain a literal
escape character in a macro argument, you can simply type it if you
change or disable the escape character first.  *Note Using Escape
Sequences::.  Otherwise, you must escape the escape character repeatedly
to a context-dependent extent.  *Note Copy Mode::.

   For the (neutral) double quote, you have recourse to an obscure
syntactical feature of AT&T 'troff'.  Because a double quote can begin a
macro argument, the formatter keeps track of whether the current
argument was started thus, and doesn't require a space after the double
quote that ends it.(2)  (*note Calling Macros-Footnote-2::) In the
argument list to a macro, a double quote that _isn't_ preceded by a
space _doesn't_ start a macro argument.  If not preceded by a double
quote that began an argument, this double quote becomes part of the
argument.  Furthermore, within a quoted argument, a pair of adjacent
double quotes becomes a literal double quote.

     .de eq
     .  tm arg1:\\$1 arg2:\\$2 arg3:\\$3
     .  tm arg4:\\$4 arg5:\\$5 arg6:\\$6
     .. \" 4 backslashes on the next line
     .eq a" "b c" "de"f\\\\g" h""i "j""k"
         error-> arg1:a" arg2:b c arg3:de
         error-> arg4:f\g" arg5:h""i arg6:j"k

   Apart from the complexity of the rules, this traditional solution has
the disadvantage that double quotes don't survive repeated argument
expansion in AT&T 'troff' or GNU 'troff''s compatibility mode.  This can
frustrate efforts to pass such arguments intact through multiple macro
calls.

     .cp 1
     .de eq
     .  tm arg1:\\$1 arg2:\\$2 arg3:\\$3
     .  tm arg4:\\$4 arg5:\\$5 arg6:\\$6
     ..
     .de xe
     .  eq \\$1 \\$2 \\$3 \\$4 \\$5 \\$6
     .. \" 8 backslashes on the next line
     .xe a" "b c" "de"f\\\\\\\\g" h""i "j""k"
         error-> arg1:a" arg2:b arg3:c
         error-> arg4:de arg5:f\g" arg6:h""i

   Outside of compatibility mode, GNU 'troff' doesn't exhibit this
problem because it tracks the nesting depth of interpolations.  *Note
Implementation Differences::.

:-|
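
For a generator that can assume these special characters exist (GNU
troff, or a descendant of AT&T troff whose font description files
define them, per the excerpt above), the simple route might look like
the following sketch.  The command line shown is made up for
illustration.

```roff
.\" Hypothetical man(7) fragment, for illustration only.
.\" \(dq = neutral double quote, \(rs = backslash, \(aq = neutral apostrophe.
.\" The outer double quotes delimit one macro argument; the special
.\" character escape sequences inside it are not treated as delimiters.
.B "printf \(dq%s\(rsn\(dq \(aqhello\(aq"
```

The two-character \(xx form is used rather than \[xx] because the
excerpt says it remains available in AT&T compatibility mode.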

I sure hope this was done the way it was because any more
accessible approach ran the PDP-11 out of memory.  Murray Hill's
agonizingly slow adoption of 'aq' and 'dq' special character identifiers
I find difficult to explain given that they bought and paid for a font
that included these glyphs on their very first typesetting device.

And the Autologic APS-5 had them, too (though the ASCII 39 apostrophe
quote was "wrongly" slanted instead of upright by modern standards).

Whither this antipathy for the neutral apostrophe?

> I'm also trying to stay very portable and for a long time I knew there
> were a bunch of proprietary implementations out there that did random
> things (never mind Solaris, what about HP-UX which does some other
> weird things).  So for example I don't use .EE/.EX and instead roll my
> own, which is kind of sad, let alone stuff like .TQ, .UR, or .SY.

I've recently revised groff's an-ext.tmac, which is, and always has
been, permissively licensed, to get most of the groffisms out of it,
even as carefully protected as they were by portability guards.  The
file is now much smaller and, I hope, easy for a novice *roff macro
programmer to understand.

https://git.savannah.gnu.org/cgit/groff.git/tree/tmac/an-ext.tmac

With the last proprietary Unixes finally retiring to their coffins or at
least throwing in the towel on any delusions of troff maintenance, maybe
people will take up some of these conveniences at last.

Regards,
Branden
