groff

Re: Escaping hyphens ("real" minus signs in groff)


From: G. Branden Robinson
Subject: Re: Escaping hyphens ("real" minus signs in groff)
Date: Fri, 22 Jan 2021 14:56:00 +1100
User-agent: NeoMutt/20180716

Hi Michael!

At 2021-01-21T12:03:13+0100, Michael Kerrisk (man-pages) wrote:
> I appreciate your long answer *very* much. But, I'm glad you started
> with the short answer :-).

Cool!  But beware, from such pressures is the practice of top-replying
born...  ;-)

> > Another issue to consider is that as PDF rendering technology has
> > improved on Linux, it has become possible to copy and paste from PDF
> > documents into a terminal window.  In my opinion we should make this
> > work as well as we can.  Expert Linux users may not ever do this,
> > wondering why anyone would ever try; new Linux users will quite
> > reasonably expect to be able to do it.
[...]
> > And I mean copy-and-paste not just from PDF but from a terminal
> > window.
> 
> Yes, but I have a question: "\-1" renders in PDF as a long dash 
> followed by a "1". This looks okay in PDF, but if I copy and paste
> into a terminal, I don't get an ASCII 45. This seems to contradict
> what you are saying about cut-and-paste above. What am I missing?

The gap between aspiration and implementation.  I don't think the
"copy-and-paste from PDF to terminal window" matter is completely sorted
out yet.
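
One way to see what a paste actually delivered is to look at the code
points.  Here is a minimal sketch in Python (not part of groff).  The
helper name dash_lookalikes is made up for illustration, and it flags
only U+2212 plus the characters in Unicode's dash-punctuation category.
The input strings are samples, not captured output from any PDF viewer.

```python
import unicodedata


def dash_lookalikes(text):
    """List (index, code point, name) for dash-like characters other than ASCII 45."""
    found = []
    for i, ch in enumerate(text):
        if ch == "-":
            continue  # ASCII 45 (HYPHEN-MINUS) is what a shell expects
        # "Pd" is Unicode's dash-punctuation category; U+2212 MINUS SIGN is
        # a math symbol ("Sm"), so it is checked separately.
        if unicodedata.category(ch) == "Pd" or ch == "\u2212":
            found.append((i, f"U+{ord(ch):04X}", unicodedata.name(ch)))
    return found


# Sample strings: one with U+2212 in place of the option dash, one with ASCII 45.
print(dash_lookalikes("ls \u22121"))
print(dash_lookalikes("ls -1"))
```

An empty list means every dash-like character in the pasted text is
already ASCII 45.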

I'm a strident prescriptivist about preserving the distinction between
"-" and "\-" in roff documents, notably including man pages, in part
because it affords us more room to design around this problem.

ASCII and ISO 8859 unified the hyphen and minus characters.  AT&T troff
and all of its descendants distinguished them.  Unicode also
distinguishes them.  But Unix has a habit of calling ASCII 055 (45
decimal) a "dash", and moreover, to much software, only the numerical
value of the code point is important.
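
For reference, the three characters in play can be listed with their
numeric values.  This is a small Python sketch using the standard
unicodedata module; the function name describe is my own, chosen only
for illustration.

```python
import unicodedata

# ASCII hyphen-minus, Unicode hyphen, Unicode minus sign.
CANDIDATES = ["\u002d", "\u2010", "\u2212"]


def describe(ch):
    """Return (code point, decimal value, Unicode name) for one character."""
    return (f"U+{ord(ch):04X}", ord(ch), unicodedata.name(ch))


for ch in CANDIDATES:
    print(*describe(ch))
```

Only the first of these has the value 45 (octal 055); software that
compares raw code points will not treat the other two as equivalent.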

It's quite possible that for man(7) documents rendering to PDF, we
should perform the following mapping (in the man macros).

.if '\*[.T]'pdf' \
.  char \- \N'45'

This didn't come up in my argument with (mostly?) BSD people because (1)
the immediate issue that raised concern had to do with the grave accent
and apostrophe instead and (2) everybody in that camp who spoke up on
the matter said they seldom, if ever, render man pages to PostScript or
PDF.  By that token, the above 2-liner may not be a controversial matter
to the people I was arguing with.  :)

Consider what would happen to the appearance of PDF-rendered man pages
if we encouraged all \- escaped hyphens to be rewritten as plain hyphens
in the source first, and did the following to mandate uniformity.

.if '\*[.T]'pdf' \{\
.  char \- \N'45'
.  char - \N'45'
.\}

...just as is currently done for the 'utf8' output driver, whose second
line I want to kill off.

I feel that responsible stewardship of the groff man macro
implementation means considering the needs of diverse audiences.

> I don't really have any other questions, but I have tried to distill
> the above into some text in man-pages(7) to remind myself for the
> future:
> 
> [[
> .PP
> The use of real minus signs serves the following purposes:
> .IP * 3
> To provide better renderings on various targets other than
> ASCII terminals,
> notably in PDF and on Unicode/UTF\-8-capable terminals.
> .IP *
> To generate glyphs that when copied from rendered pages will
> produce real minus signs when pasted into a terminal.
> ]]
> 
> Seem okay?

What a "real minus sign" is is a fraught issue[1], but if for the
purposes of man-pages(7) it means the ASCII/ISO hyphen-minus, then yes,
I think it's good enough.

Regards,
Branden

[1] especially in light of the \[mi] special character escape and the
    existence of U+2212 :-/
