Re: patch: set id attribute for part in DocBook
From: Per Bothner
Subject: Re: patch: set id attribute for part in DocBook
Date: Mon, 10 Nov 2014 10:29:21 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0

On 11/10/2014 02:41 AM, Patrice Dumas wrote:
> Indeed, id are added when there are nodes associated with sectioning
> commands and @part is never associated with nodes.
>
> The id from nodes obeys very strict constraints, explained in the HTML
> part of the manual, for example,
>   a "Th@'e" à.
> leads to
>   a-_0022Th_00e9_0022-_00c3_00a0_002e
>
> Is it ok? It would be consistent with other generated id. It is
> possible to have id slightly more readable, for instance what is used
> for file names in html, like
>   a-_0022The_0022-A-_002e
>
> Opinion?

I think these are 3 different questions:
(1) Should DocBook output contain id attributes for @part commands?
IMO, yes.
(2) Should those id attributes be "mangled" using the same algorithm that
id attributes from nodes are?
I don't know of any reason why not.
(3) Should we be less restrictive in what we allow in id attributes?
I think that would be reasonable - though it might break compatibility.
It seems wrong to mangle perfectly reasonable non-ASCII letters.
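
For reference, the node-style mangling asked about in (2) can be sketched roughly as follows. This is only an illustration in Python, not the makeinfo code: the function name strict_node_id is hypothetical, and the rules (spaces become '-', ASCII letters and digits are kept, everything else becomes '_' plus the code point in lowercase hex, and 'g_t' is prepended if the result does not start with a letter) are my reading of the HTML cross-reference section of the Texinfo manual.

```python
def strict_node_id(name: str) -> str:
    """Illustrative sketch of node-name mangling (assumed rules, see above)."""
    # Collapse runs of whitespace and trim both ends.
    name = " ".join(name.split())
    out = []
    for ch in name:
        if ch == " ":
            out.append("-")
        elif ch.isascii() and ch.isalnum():
            out.append(ch)
        else:
            cp = ord(ch)
            # Four hex digits for the BMP, six (after a double underscore) beyond it.
            out.append(f"_{cp:04x}" if cp <= 0xFFFF else f"__{cp:06x}")
    result = "".join(out)
    # Assumed rule: an id that does not start with an ASCII letter gets a prefix.
    if not (result[:1].isascii() and result[:1].isalpha()):
        result = "g_t" + result
    return result


print(strict_node_id('a "Th\u00e9" \u00e0.'))
```

With 'à' taken as the single character U+00E0, this sketch encodes it as _00e0. The quoted output shows _00c3_00a0 instead, which are the two UTF-8 bytes of 'à', so that value may reflect an encoding mix-up somewhere rather than the intended result. Either way, the letter is no longer recognizable in the id.
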
In principle the id attribute can be any valid XML Name. If we mangle
'à' we're both losing information and making the output uglier. If we
want to restrict filenames, that should be done by the DocBook processor
(or a transformation stage between makeinfo and DocBook). However, all
valid XML Names are valid filenames on modern desktop and server
systems, so such mangling is not needed. Likewise, web servers and
browsers can transparently mangle and demangle non-ASCII URLs,
so we can join the 21st century and not deal with it. (Maybe I'm
being overly optimistic ...)
Possible exceptions: XML Names allow '.' and ':' - it might be reasonable
to convert those to '_'.
My conclusion: The goal should be to generate the simplest and most
minimal mangling to produce a valid and human-readable XML Name.
I'm a big believer in "clean URLs". GNU should aim for that.
The same logic would apply to html and xml output, FWIW.
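
To make the "minimal mangling" idea concrete, here is a rough Python sketch. It is an assumption-laden illustration, not a proposed patch: the name minimal_id is hypothetical, the character ranges are the NameStartChar and NameChar productions from the XML 1.0 specification as I remember them, and dropping characters that cannot appear in a Name (such as quotes) is just one possible choice.

```python
import re

# NameStartChar ranges from XML 1.0, with ':' deliberately left out.
_START = (
    r"A-Z_a-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D"
    r"\u037F-\u1FFF\u200C\u200D\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    r"\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)
# Additional NameChar characters, with '.' deliberately left out.
_REST = r"\-0-9\u00B7\u0300-\u036F\u203F\u2040"

NAME_START = re.compile("[" + _START + "]")
NAME_CHAR = re.compile("[" + _START + _REST + "]")


def minimal_id(title: str) -> str:
    """Illustrative sketch: keep every character that is legal in an XML Name."""
    title = " ".join(title.split())  # collapse whitespace
    out = []
    for ch in title:
        if ch == " ":
            out.append("-")
        elif ch in ".:":
            out.append("_")  # the two possible exceptions mentioned above
        elif NAME_CHAR.fullmatch(ch):
            out.append(ch)  # includes non-ASCII letters such as 'à'
        # Assumption: any other character (quotes, etc.) is simply dropped.
    result = "".join(out)
    # A Name cannot start with a digit or '-', so prefix '_' when needed.
    if not result or not NAME_START.fullmatch(result[0]):
        result = "_" + result
    return result


print(minimal_id('a "Th\u00e9" \u00e0.'))
```

For the example title this prints a-Thé-à_. Since characters are dropped, two different titles can end up with the same id, so a real implementation would still need some way to disambiguate collisions.
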
--
--Per Bothner
address@hidden http://per.bothner.com/