bug-texinfo

Re: bad filenames when creating html/docbook using pretest


From: Per Bothner
Subject: Re: bad filenames when creating html/docbook using pretest
Date: Fri, 23 Nov 2012 00:35:00 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:16.0) Gecko/20121029 Thunderbird/16.0.2

On 11/22/2012 06:30 PM, Patrice Dumas wrote:
> On Thu, Nov 22, 2012 at 05:13:34PM -0800, Per Bothner wrote:
>> For myself, I run a sed script after using texi2any to generate
>> docbook, before generating html using the docbook style sheets.
>> So adding an extra rule -e 's|_002d|-|g' is a simple if not
>> 100% robust solution.
>
> There is a hook for the node file names, and a hook to
> modify node targets and id.  Function references are called
> node_target_name and node_file_name.  But the API is still
> experimental.

Thanks - I made a note in the Makefile - maybe I'll try that later.
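
For illustration only, here is a minimal Python sketch of what the quoted
sed rule -e 's|_002d|-|g' does.  The function name and the sample element
are hypothetical, not anything texi2any or the DocBook style sheets
provide, and the substitution is as naive as the sed rule itself: it
rewrites every literal "_002d" it finds, wherever it occurs.

```python
import re

# Mirrors the sed rule -e 's|_002d|-|g': each literal "_002d" becomes "-".
_HYPHEN_ESCAPE = re.compile(r"_002d")


def unescape_hyphens(text: str) -> str:
    """Replace every "_002d" escape in text with a plain hyphen."""
    return _HYPHEN_ESCAPE.sub("-", text)


if __name__ == "__main__":
    # Hypothetical DocBook fragment, used only to show the substitution.
    sample = '<sect1 id="Invoking-some_002dtool">'
    print(unescape_hyphens(sample))
```

Like the sed rule, this changes ids and the references to them in the same
way, so links stay consistent only if the whole file is filtered in one pass.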

>> More critically, anyone using texinfo for non-English text may
>> want non-ASCII letters in node names.  Perhaps allow any
>> "NameStartChar" (as in the XML specifications) without escaping.
>> This makes for much nicer file names and URLs for people using
>> these languages.  True, %-escaping comes into effect, but this
>> is handled automatically by browsers and servers.  Let's defer
>> this to the standard mechanism.
>
> File names, in the default case, have letters transliterated.  Are
> you saying that there may be some non-ASCII characters in a DocBook ID?
> Is it clearly stated somewhere?

I'm fairly sure the restrictions on an id attribute are the standard
restrictions on XML id attributes - i.e. they must follow the syntax of
NCName as specified in the XML specification(s).  (The syntax has changed
slightly across XML versions, but has never been restricted to ASCII.)  I haven't
found an explicit statement to that effect in the DocBook specification,
but it seems implied that the syntax is an NCName.
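
As a rough sketch of what "follows the syntax of NCName" means in practice,
the check below encodes the NameStartChar and NameChar ranges from XML 1.0
(Fifth Edition) with the colon removed, as Namespaces in XML requires for
NCName.  The function name is made up for this example, and earlier XML
editions define the name characters somewhat differently, so treat it as an
approximation rather than a validator.

```python
import re

# NameStartChar from XML 1.0 (Fifth Edition), without ":" (NCName has no colon).
_START = (
    "A-Z_a-z"
    "\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D"
    "\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    "\uF900-\uFDCF\uFDF0-\uFFFD"
    "\U00010000-\U000EFFFF"
)
# NameChar additionally allows "-", ".", digits, and a few combining ranges.
_REST = _START + r"\-.0-9" + "\u00B7\u0300-\u036F\u203F-\u2040"

_NCNAME = re.compile(f"[{_START}][{_REST}]*")


def is_ncname(value: str) -> bool:
    """Return True if value matches the NCName syntax sketched above."""
    return _NCNAME.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["Invoking-texi2any", "Überblick", "2nd-node", "a:b"]:
        print(candidate, is_ncname(candidate))
```

Non-ASCII letters such as "Ü" are accepted as a first character, while a
leading digit or a colon anywhere in the name is rejected.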

ID values in HTML4 are restricted to ASCII.  I don't believe that that
restriction applies to XHTML.  In HTML5 the id attribute can be any
non-empty string without spaces:

http://dev.w3.org/html5/html4-differences/#changed-attributes
  "The id global attribute is now allowed to have any value, as long as
  it is unique, is not the empty string, and does not contain space characters."
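
The per-value part of that HTML5 rule is small enough to sketch in Python.
The function name is hypothetical, the set of "space characters" is the one
HTML5 defines (space, tab, line feed, form feed, carriage return), and
uniqueness is not checked here because it is a property of the whole
document rather than of a single id.

```python
# The five characters HTML5 calls "space characters".
HTML5_SPACE_CHARS = frozenset(" \t\n\f\r")


def is_valid_html5_id(value: str) -> bool:
    """Check the per-value HTML5 rule: non-empty and free of space characters."""
    return value != "" and not any(ch in HTML5_SPACE_CHARS for ch in value)


if __name__ == "__main__":
    for candidate in ["Überblick", "2nd-node", "two words", ""]:
        print(repr(candidate), is_valid_html5_id(candidate))
```

Under this rule a non-ASCII id or one starting with a digit is acceptable,
which neither HTML4 nor the XML NCName syntax allows in full.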
--
        --Per Bothner
address@hidden   http://per.bothner.com/


