Re: Using tbl(1) for structure definitions


From: G. Branden Robinson
Subject: Re: Using tbl(1) for structure definitions
Date: Thu, 11 Aug 2022 16:46:12 -0500

At 2022-08-11T15:47:38+0200, Ingo Schwarze wrote:
> Alejandro Colomar wrote on Tue, Jul 26, 2022 at 10:09:44PM +0200:
> > I must say that the source code is really ugly (ugly as in,
> > someone reading it will probably have a hard time modifying it,
> > without reading tbl(1)).
> 
> Completely true, but that's not the worst aspect of it.

I disagree with Ingo's priorities here.  The readability of the source
is more important for document maintainability.  As we shall see, tbl(1)
need not discard as much semantic information as Ingo suggests, and even
if it does (at present), I don't perceive quite the semantic damage he
does.

> In a nutshell, you are making it impossible to decently render
> the manual page to HTML or to convert it to other formats in
> any sensible way.
> 
> If esr@ (of doclifter fame) were still around, he would be screaming
> in pain and disgust.

We have an open bug report requesting a feature to have tbl emit HTML.

https://savannah.gnu.org/bugs/index.php?60052

Maybe someone would like to work on this.  The "troffcvt" suite already
did this many years ago.  I further note that GNU eqn(1) has already for
many years supported MathML output, and Brian Kernighan reportedly
altered pic(1) to produce SVG.  This undertaking was characterized as
"not difficult".  I imagine it was GNU pic he extended in this way,
given his statements about groff in his most recent book _Unix: A
History and Memoir_, and because GNU pic was written from the start to
support different output formats (supplanting tpic for production of TeX
pictures from pic input).

If we could knock those two projects out, summer-of-code style, we could
eliminate a lot of grohtml.  Possibly the entire pre-grohtml
preprocessor could be disposed of.

> > But at the same time, the result is beautiful,
> 
> Only in PDF and PostScript output.

These output formats are _how typesetting is done_ in the modern era.

I know mandoc doesn't want to dirty its hands with such matters, but
your militance about the unimportance of typesetting blinkers your
perspective.

groff cannot share that perspective.

"As the most widely deployed implementation of troff in use today, groff
holds an important place in the Unix universe.  Frequently and
erroneously dismissed as a legacy program for formatting Unix manuals
(manpages), groff is in fact a sophisticated system for producing
high-quality typeset material, from business correspondence to complex
technical reports and plate-ready books." -- groff Mission Statement,
2014

https://www.gnu.org/software/groff/groff-mission-statement.html

> Did we really learn nothing collectively from the "do not abuse
> tables for layout purposes" drama that raged for decades among HTML
> authors and HTML standard developers?  Exactly the same applies to
> manual pages.

Tables were abused in HTML for layout for much the same reason Alex is
tempted to use tbl here: the suggested "portable" dialect you and I both
prescribe for man(7) pages does not admit any other way for him to
achieve what he wants.  In HTML, at first CSS didn't even exist, and its
progress toward permitting the simulation of full-color typography with
pixel-precise element placements in Web pages was difficult and fitful.
I stopped messing with Web stuff before that journey was completed, if
indeed it was.  I have heard it said that the preferred way to satisfy
such demands these days is to use a "canvas" element and run JavaScript
on the client side to draw it.

Anyway, what Alex wants isn't stupid or gratuitous.

> In fact, the same arguments so very familiar from HTML apply to manual
> pages even more. HTML tables can at least be imbued with some semantic
> capabilities by using CSS, whereas tbl(1) tables are so deeply
> entrenched in the "markup is presentational markup only" camp that
> they can never hope to convey any semantic function at all.

tbl could be extended to support semantic tagging easily.

We could add a column modifier, say "g", which would take a mandatory
parenthesized argument with the semantic tag to be applied to the
column.

.TS
tab(@);
Lg(type) Lg(identifier) Lg(descriptive-comment).
int@nflag;@/* ??? */
.TE

How to carry this semantic information into the output is the more
interesting question.

Yet I would hasten to point out that a synopsis that presents something
that is nowhere discussed later in the man page makes the document
deficient.  So if you have semantic markup of all relevant content
_after_ the synopsis, of which a well-written mdoc(7) page will surely
boast, then little or nothing is lost in the domains of searchability
and discoverability.

> There is no need to use bold or italic fonts in the structure
> display.  Making all the C code bold merely makes the whole
> display look heavy and ugly and provides no additional
> information.  Making the comments use mixed font looks even
> more ugly and is also redundant because the constants are
> hopefully already more fully documented elsewhere.

I'm going to invite you again to consult your bookshelf of programming
texts.  All of the practices you deprecate above are common in typeset
works in the field.

> I don't think you need to worry that the alignment might vary on
> different output devices.  If you worry anyway, you can use an
> explicit roff(7) .ta request before the display and reset it with .DT
> after the display.
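
A minimal sketch of the .ta/.DT approach Ingo describes, as it might
appear in a man(7) page.  The structure name, the second member, and the
tab stop positions are illustrative assumptions, not taken from any real
page; the fields in each member line are separated by literal tab
characters.

```roff
.\" Hypothetical synopsis fragment; the names and stops are illustrative.
.nf
.\" Set tab stops at 4n, 12n, and 24n from the left margin.
.ta 4n 12n 24n
struct example {
	int	nflag;	/* ... */
	char	*name;	/* ... */
};
.\" Restore the man(7) default tab stops, then resume filling.
.DT
.fi
```

The stops have to be chosen by hand to clear the longest type and
identifier, which is the maintenance cost of this approach.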

"If esr@ (of doclifter fame) were still around, he would be screaming in
pain and disgust."

  commit 26e827e36a4d98e9a9403bcc73b4afb116495407
  Author:     Eric S. Raymond <esr@thyrsus.com>
  AuthorDate: Sat Feb 3 11:54:41 2007 +0000

      Added portability advice.

  diff --git a/tmac/groff_man.man b/tmac/groff_man.man
  index e314cd8f6..057f9f9d4 100644
  --- a/tmac/groff_man.man
  +++ b/tmac/groff_man.man
[...]
  +.\" -----------------------------------------------------------------
  +.
  +.SH "PORTABILITY AND TROFF REQUESTS"
  +.
  +Since the
  +.B man
  +macros consist of groups of
  +.I groff
  +requests, one can, in principle, supplement the functionality of the
  +.B man
  +macros with individual
  +.I groff
  +requests where necessary.  See the
  +.I groff
  +info pages for a complete reference of all requests.
  +.LP
  +Note, however, that using raw troff requests is likely to make your page
  +render poorly on the (increasingly common) class of viewers that
  +render it to HTML.  Troff requests make implicit assumptions about
  +things like character and page sizes that may break in an HTML
  +environment; also, many of these viewers don't interpret the full
  +troff vocabulary, a problem which can lead to portions of your
  +text being silently dropped.
  +.LP
  +For portability to modern viewers, it is best to write your page
  +entirely in the requests described on this page. Further, it is best
  +to completely avoid those we have described as 'presentation-level'
  +.RB ( HP ,
  +.BR PD ,
  +and
  +.BR DT ).

The above has undergone some recasting but the basic thrust is intact.
We can dial back its stridency in a considered way, if you like.  Shall
we discuss it?

Alternatively, earlier in the thread (which started on the "linux-man"
list) I proposed extending the `SY` macro to accept additional
arguments.  The ones past the first would not be formatted, but measured
to set tab stops corresponding to their width (plus a small horizontal
space).  `YS` would restore the default tab stops.  I did also discuss
some drawbacks to this approach, and proposed a few others.

  https://lore.kernel.org/linux-man/20220722033353.ap7aqxh6uhghdcxo@illithid/
  https://lists.gnu.org/archive/html/groff/2022-07/msg00120.html
  https://lists.gnu.org/archive/html/groff/2022-07/msg00261.html
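
The `SY` extension described above is only a proposal.  As a rough
sketch of the measuring it would automate, the existing `\w` escape
sequence can already derive tab stops from the widths of sample strings
by hand.  The structure and member names below are illustrative
assumptions, and the fields are separated by literal tab characters.

```roff
.\" Sketch only: this is plain roff, not the proposed SY extension.
.nf
.\" First stop at 4n; each later stop lies beyond the previous one by
.\" the width of a sample string plus 2n of space.
.ta 4n +\w'size_t'u+2n +\w'nflag;'u+2n
struct example {
	int	nflag;	/* ... */
	size_t	len;	/* ... */
};
.\" Restore the man(7) default tab stops, then resume filling.
.DT
.fi
```

Like any use of raw requests, this runs against the portability advice
quoted above.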

It seems like all three of us have some reservations about employing
tbl(1) to this end, but we have not aligned on the best alternative.

> Formatters that don't support .ta will just ignore it, so it causes no
> harm, and groff and mandoc do support it.

Exactly what I had in mind for `KS` and `KE`, so that we can have keeps
in man pages.  :D

Regards,
Branden
