bug-groff

[bug #57618] man/groff_char.7.man: page needs an overhaul


From: G. Branden Robinson
Subject: [bug #57618] man/groff_char.7.man: page needs an overhaul
Date: Thu, 8 Oct 2020 09:37:52 -0400 (EDT)
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0

Update of bug #57618 (project groff):

                Category:                    None => Core                   

    _______________________________________________________

Follow-up Comment #8:

[comment #7:]
> It looks like most if not all the problems Branden identified in comment #1
have been addressed.  The page looks great now!

Thanks!  It was quite a bit of work.
 
> But of course I still have complaints.
> 
> (I realize this man page is very much a work in progress, so some of the
below observations may be addressed in changes that have yet to be
committed.)

Nope, at the time you made this comment I was taking a break I felt I had
earned from the page.  :)

> *AGL*
> 
> I still advocate for removing the AGL (formerly PostScript) column.  In
addition to the reasons I previously gave: with this man page now having a
convenient link to the AGL, and with both documents including the Unicode
value for every entry, that value becomes an easy cross-reference for those
very few users who do need to look up an Adobe name.
> 
> If the column is retained, something on the page should tell users what they
can do with this information.
> 
> What are the arguments for moving this information to grops(1) (as proposed
in comment #5) instead of removing it altogether?

I have none.

> *hyphenation-inhibition overkill*
> 
> While no harm arises from putting \% in front of things like "Latin-1" and
"UTF-8," they accomplish nothing: groff already won't break a line at a hyphen
with only numbers on one side of it.

Good point.  I'll fix this.

> *intro text*
> 
> In most subsections, if there is text before the table, it gives an overview
of that section or the data tabulated in the following table.  But a couple of
sections (e.g., "Supplementary Latin letters," "Logical symbols") lead off
with minutiae that might be better placed below the table rather than above
it.

I considered this, but I prefer the consistency of not having trailing text
after _any_ table (in this page).

> *NFD*
> 
> (section "Special character escape forms"): "groff requires NFD
(Normalization Form D)"
> 
> In what sense is this true?

That's a good question.  I pretty much just took Werner's word on faith.  :)

https://lists.gnu.org/archive/html/bug-groff/2020-08/msg00016.html

> Groff doesn't take Unicode input directly at all, but it does take Latin-1,
whose character set consists largely of precomposed characters that in
Unicode would be considered NFC.
> 
> And for Unicode input passed through preconv, groff seems happy with NFC but
not with NFD:
> 
> 
> $ printf "co\xF6perate\n" | uconv -f latin1 -x Any-NFC | groff -Kutf8 > /dev/null
> $ printf "co\xF6perate\n" | uconv -f latin1 -x Any-NFD | groff -Kutf8 > /dev/null
> <standard input>:1: warning: can't find special character `u0308'
> $
> 
> 
> groff does handle composed characters given in \[uNNNN_NNNN] form:
> 
> 
> $ echo 'co\[u006F_0308]perate' | groff > /dev/null
> $ 
> 
> 
> but this is not what preconv emits.

This sounds like a bug in (1) preconv for not emitting composite Unicode glyph
escapes and (2) troff for failing to accept them--but that would require more
state tracking.  Maybe we could add a heuristic into troff with a hard-coded
table of combining characters so we can tell the user to put the input in
\[uXXXX_YYYY] form.  There's already a big table for detecting invalid UTF-8
sequences, so it could probably fit in there.  I'll see what I think after
actually looking again at the code.
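
To make the \[uXXXX_YYYY] form concrete, here is a rough Python sketch of
the kind of rewriting discussed above: decompose the input, group each base
character with the combining characters that follow it, and emit one
composite escape per group.  This is only an illustration under stated
assumptions, not what preconv or troff actually does; the function names
`clusters` and `to_groff_escapes` are hypothetical, and Python's
`unicodedata` combining classes stand in for the hard-coded table of
combining characters.

```python
import unicodedata


def clusters(text):
    """Yield each base character together with its trailing combining marks.

    The text is first put into NFD so that precomposed characters such as
    U+00F6 are split into a base (U+006F) and a mark (U+0308).
    """
    cluster = ""
    for ch in unicodedata.normalize("NFD", text):
        # A character with combining class 0 starts a new cluster.
        if cluster and unicodedata.combining(ch) == 0:
            yield cluster
            cluster = ""
        cluster += ch
    if cluster:
        yield cluster


def to_groff_escapes(text):
    """Rewrite non-ASCII clusters as \\[uXXXX] or \\[uXXXX_YYYY] escapes.

    Plain ASCII characters pass through unchanged (hypothetical helper; it
    does not treat backslashes or other troff syntax in the input specially).
    """
    parts = []
    for c in clusters(text):
        if len(c) == 1 and c.isascii():
            parts.append(c)
        else:
            codes = "_".join(f"{ord(ch):04X}" for ch in c)
            parts.append("\\[u" + codes + "]")
    return "".join(parts)


if __name__ == "__main__":
    print(to_groff_escapes("co\u00f6perate"))
```

Because the input is normalized before grouping, precomposed (NFC) and
decomposed (NFD) spellings of the same word produce the same escape
sequence, which is the form the echo example quoted above shows groff
accepting.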

I note regarding point (1) that if we implement my proposal in bug#58796, the
frequency with which such things will have to be emitted will drop off
precipitously.

And (3) I reckon we need to clarify what we mean by "NFD".  We mean
"decomposed and expressed as composite Unicode special character escapes"
(whew!).

Could I trouble you to file these?

> *non-printing characters*
> 
> The text refers to NBSP and SHY as "non-printing characters," but this
doesn't seem to be how they are classified (if ISO_8859-1(7) and Perl's
Unicode library are to be believed):
> 
> 
> $ perl -e '$_ = "\N{NO-BREAK SPACE}\N{SOFT HYPHEN}"; s/[[:print:]]/X/g; print "$_\n"'
> XX
> $ 
> 

This one's an easier fix.  I meant "non-printing" to _groff_, and I will
update the page accordingly.

> *verbiage*
> 
> (section "Rules and lines"): "Note that both the AGL and the Unicode-derived
names of these three glyphs are rough approximations."
> 
> * ...as opposed to precise approximations?
> * The phrase "Note that..." communicates nothing.  Just tell the reader the
thing you want her to note, and she will note it.

Yes.  You'll still find it frequently in my commit messages, though in a
different sense: "I added a note to the effect that", in the imperative mood
as befits commit messages and change logs.

But yeah, this is a fair point which applies to lots of groff documentation. 
I'll always think of it as "the Dave Kemper edit" from now on.

> *hyphens*
> 
> "Frequently-used glyphs" should have no hyphen.

Fair.

> "Playing card symbols" should arguably have one.

Disagree here.  Playing card symbols are like ice cream cones.

Thanks for your close review.  I felt shame regarding the old groff_char(7)
man page.  Now it almost looks like something we can be proud of.  :D

Here's the checklist for closing this ticket, then.  Any further problems can
go into a new report.

1. Remove AGL column.
2. Stop suppressing breaks in Latin-1, UTF-8, etc.
3. Note that maximal decomposition isn't necessary if a glyph name special
character escape is used.  Something like that.  You can't tell GNU troff to
"co\xF6perate", but you can tell it to co\[:o]perate or co\[o ad]perate.
4. Clarify context of non-printability WRT NBSP and SHY.
5. Un-"note that" and smooth approximations.
6. "Frequently used".

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?57618>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/



