bug-texinfo

cross manual references in html manuals, second


From: Dumas Patrice
Subject: cross manual references in html manuals, second
Date: Fri, 30 May 2003 19:10:07 +0200
User-agent: Mutt/1.4i

Hi,

New proposal for cross references in html manuals.

Links are basically constructed using the pair (node name, manual).
A link consists of 4 components: a host name, a directories part, a
file name and a target part. The file name and the target are constructed
from the node name. The host and directories are constructed from the
manual name. The host, directories, file and target are then combined
into a url (like http://host/directories/file#target).
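
As an illustration of the url layout above, here is a minimal sketch in Python. The function name build_url and the example host and paths are made up for this example, they are not part of the proposal.

```python
def build_url(host, directories, file_name, target):
    """Combine the 4 components into http://host/directories/file#target."""
    # Drop empty parts and stray slashes so the path has single separators.
    parts = [p.strip("/") for p in (directories, file_name) if p.strip("/")]
    url = "http://" + host + "/" + "/".join(parts)
    # The target is optional: without it the link points at the whole file.
    if target:
        url += "#" + target
    return url


print(build_url("www.example.org", "manuals/texinfo", "overview.html", "Overview"))
# http://www.example.org/manuals/texinfo/overview.html#Overview
```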

I first describe how to map the pair (node name, manual) to these 4 components
in the general case, and then fill the gaps for the software
generating the cross reference (i.e. the software which refers to
another manual), hereafter called the local software, and for the software
generating the manual targeted by the cross reference, called the distant
software.

expansion of @ commands in node names
-------------------------------------

If the node name contains a @value or a user defined macro (defined with
@macro), they are expanded. Comments are also removed. @if* commands
are also assumed to be already expanded.

@ commands in node names are not supported in makeinfo or texi2html, thus
it is advised not to use them. However, as they are not ruled out by the
design of texinfo and as Karl said that they may be supported one day, they
are included in this proposal for the sake of completeness.

The following @ commands are not allowed (i.e. the resulting file name is
unspecified):

@math, @menu, @afourlatex, @afourpaper, @afourwide, @alias, @anchor,
@node, sectioning commands (@headings, @section, @appendix......), @bye,
@center, @centerchap, @?index, @printindex, @*table, @columnfractions,
@contents, @shortcontents, @summarycontents, @cropmarks, @defindex,
@defcodeindex, all the @deffn like commands, @example and the like,
@enumerate, @itemize, @definfoenclose, @dircategory, @direntry,
@document*, @titlepage, @exampleindent, @*footing, @*heading, @flush*,
@footnotestyle, @group, @include, @item, @itemx, @kbdinputstyle,
@raisesections, @lowersections, @macro, @*headings, @need, @pagesizes,
@settitle, @setfilename, @author, @cartouche, @set*contentsaftertitlepage,
@*titlepage, @this*, @title, @titlefont, @unmacro, @rmacro, @vskip,
@verbatiminclude, @copying, @insertcopying, @paragraphindent
and the corresponding @end commands.

It also seems to me that @verbatim, @verb, @tex and @html should not be
allowed either, but I am not certain about these (especially @verb
and @html).

Accented letters are transformed into their 8-bit equivalent character,
according to the iso latin 1 mapping.

The following @ commands are transformed into text. I write the command, and
then what it should be transformed to. 'NOTHING' has a special meaning: it
means that the command is removed. An 'OR' means that I am not certain which
alternative is the right one. If the command has braces and there is something
in the braces, the text in the braces is transformed, and then the substituted
command name is followed by a space and the transformed text. For example
'@bullet{a text}' leads to '* a text'. 'SPACE' means a space.

The following @ commands have no real reason to be used in node names, thus
it is recommended to use the plain text equivalent:

@(space) SPACE
@(tab) SPACE
@(newline) SPACE
@* SPACE
@! !
@? ?
@. .
@: NOTHING
@equiv equiv OR ==
@point point OR -!-
@result result OR =>
@expansion expansion OR ==>
@print print OR -|
@error error OR error-->
@exdent NOTHING
@noindent NOTHING
@page NOTHING
@refill NOTHING
@bullet *
@TeX TeX
@today today
@minus -
@copyright (C)
@dots ...
@enddots ....
@exclamdown ! OR 8-bit equivalent
@questiondown ? OR 8-bit equivalent
@pounds pounds OR 8-bit equivalent

The following cannot be avoided easily:
@@ @
@{ {
@} }
@- NOTHING

I don't remember what the following does ;-)
@tie SPACE
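
To make the substitution rule concrete ('@bullet{a text}' leads to '* a text'), here is a rough Python sketch. It only covers the braced commands of the table which have a single proposed replacement (no 'OR'). The names REPLACEMENTS and replace_symbol_commands are invented for the example, and the text in the braces is assumed to be already transformed and free of braces.

```python
import re

# Only the braced commands of the table with a single proposed replacement.
REPLACEMENTS = {
    "bullet": "*", "TeX": "TeX", "today": "today", "minus": "-",
    "copyright": "(C)", "dots": "...", "enddots": "....", "tie": " ",
}
_COMMAND = re.compile(r"@(" + "|".join(REPLACEMENTS) + r")\{([^{}]*)\}")


def replace_symbol_commands(text):
    """Replace e.g. '@bullet{a text}' by '* a text' and '@dots{}' by '...'."""
    def substitute(match):
        replacement = REPLACEMENTS[match.group(1)]
        argument = match.group(2)
        # A non-empty argument follows the replacement after one space.
        return replacement + " " + argument if argument else replacement
    return _COMMAND.sub(substitute, text)


print(replace_symbol_commands("@bullet{a text}"))  # * a text
```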

For the following @ commands, the @ command and braces are removed and
replaced with the text within argument which is recursively transformed:

@dotless, @acronym, @asis, @b, @command, @cite, @code, @dfn, @dmn, @emph,
@env, @file, @kbd, @key, @samp, @sc, @strong, @t, @var, @url, @w

For @sc letters are capitalized.
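
The recursive removal of these commands could look like the following Python sketch. It is only an illustration: the function name strip_style_commands is made up, and the sketch assumes that the arguments do not contain the escaped braces @{ and @}.

```python
import re

STYLE_COMMANDS = (
    "dotless", "acronym", "asis", "b", "command", "cite", "code", "dfn",
    "dmn", "emph", "env", "file", "kbd", "key", "samp", "sc", "strong",
    "t", "var", "url", "w",
)
# Matches a command whose argument contains no braces, i.e. an innermost one.
_INNERMOST = re.compile(r"@(" + "|".join(STYLE_COMMANDS) + r")\{([^{}]*)\}")


def strip_style_commands(text):
    """Replace each listed command by its argument, innermost first."""
    def substitute(match):
        name, argument = match.group(1), match.group(2)
        # For @sc the letters are capitalized.
        return argument.upper() if name == "sc" else argument

    previous = None
    while previous != text:  # repeat until nothing is left to unwrap
        previous = text
        text = _INNERMOST.sub(substitute, text)
    return text


print(strip_style_commands("@code{a @var{b}}"))  # a b
```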

The following @ commands shouldn't appear in node names, but, again for
completeness, they are considered:

@email is replaced by the text, or by the mail address if the text is not
present.
@uref is replaced by the third arg, or by the second if the third is not
present, or else by the first.
@image is replaced by the first arg.
@footnote and its argument are removed.
@sp and the number following it are removed.
@*ref is replaced by the first argument (the node name).

node name expansion
-------------------

@ commands are expanded as above.

Multiple spaces and tabs are transformed into just one space.

Letters, numbers, and '-' are not modified.

All the characters other than [A-Za-z0-9-] are transformed into _xx where
xx is the ascii code of the character in hexadecimal. _ itself is also
mapped (to _5f). The letters in the hexadecimal code should be in lower case.
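
A possible Python sketch of this expansion, assuming the @ commands are already expanded and the characters fit in 8 bits (so that the code has 2 hexadecimal digits). The name expand_node_name is only chosen for the example.

```python
import re
import string

# Characters that are kept as is: [A-Za-z0-9-].
UNCHANGED = set(string.ascii_letters + string.digits + "-")


def expand_node_name(name):
    """Escape every character outside [A-Za-z0-9-] as _xx (lower case hex)."""
    # Multiple spaces and tabs become just one space.
    name = re.sub(r"[ \t]+", " ", name)
    return "".join(c if c in UNCHANGED else "_%02x" % ord(c) for c in name)


print(expand_node_name("Hello  World_1"))  # Hello_20World_5f1
```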

file name and target generation
-------------------------------

To construct the file name, all letters of the expanded node name are
transformed into the corresponding lower case letters. '.html' is appended
to the resulting file name. The file name is in lower case because some
filesystems are case insensitive.

When the node name is any case combination of 'Top', index.html is used
(the local software may also skip the file name, as browsers (or servers?)
use index.html when no file is specified).

With this scheme it may happen that a file name is associated with 2 different
node names. This may happen for different reasons:

- if there are @ commands in node names, it may happen that after expansion
 2 nodes expand to the same name (for example @code{node} and @dfn{node} both
 lead to node).
- 2 nodes differing only by case will lead to the same file.
- a node might be called 'index', and thus share index.html with 'Top'.

When no node name is given no file is used (or index.html is assumed).

The target name is simply the expanded node name. The reason why only
[A-Za-z0-9-_] appear in expanded node names is that the targets are
in <a name=> or in id= attributes, and only those characters are allowed
in xhtml. It is still possible to have 2 different nodes expanding to
the same target name, but only because of expansion of @ commands.
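
The file name and target rules of this section can be summarized by a small Python sketch. The function name file_name_and_target is invented, and returning an empty file name when no node name is given is just one of the two options mentioned above.

```python
def file_name_and_target(expanded_name):
    """Return (file name, target) for an already expanded node name."""
    if not expanded_name:
        # No node name: no file is used (index.html could be assumed instead).
        return "", ""
    lowered = expanded_name.lower()
    # Any case combination of 'Top' maps to index.html.
    file_name = "index.html" if lowered == "top" else lowered + ".html"
    # The target keeps the case of the expanded node name.
    return file_name, expanded_name


print(file_name_and_target("Hello_20World"))  # ('hello_20world.html', 'Hello_20World')
print(file_name_and_target("TOP"))            # ('index.html', 'TOP')
```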

construction of host and directories from manual name
-----------------------------------------------------

The manual name should only contain the following characters:
[A-Za-z0-9-_/], / having a special meaning.
If the manual name is absolute, then it is assumed to be a local file.
Otherwise, the manual name is assumed to be a trailing directory component
of the path relative to a given base directory on a given host. This
base directory and this host cannot be further deduced from the manual
name in the general case.
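
A Python sketch of this distinction follows. The host and base directory are passed as parameters since they cannot be deduced from the manual name. The name locate_manual and the convention of returning None as host for a local file are assumptions of the example.

```python
import re

# Characters allowed in a manual name: [A-Za-z0-9-_/].
_ALLOWED = re.compile(r"[A-Za-z0-9_/-]+")


def locate_manual(manual, host, base_directory):
    """Return (host, directories); host is None for a local file."""
    if not _ALLOWED.fullmatch(manual):
        raise ValueError("unexpected character in manual name: %r" % manual)
    if manual.startswith("/"):
        # An absolute manual name is assumed to be a local file.
        return None, manual
    # Otherwise the manual name is appended to the base directory.
    return host, base_directory.rstrip("/") + "/" + manual


print(locate_manual("gnu/emacs", "www.example.org", "/manuals/"))
# ('www.example.org', '/manuals/gnu/emacs')
```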

generation of cross reference by the local software
---------------------------------------------------

Given a node name and a manual name what remains to be found is the
base directory and host name. I think that a recommendation could be made
to use a file mapping a manual to a host/directory, as Karl said. The
location, name and format of this file should be specified as precisely as
possible so that different applications can share the same file.
Otherwise the default host/directory could be ../ (i.e. the parent dir on the
localhost) as makeinfo already does. Of course, the software may override
this default and also what is specified in the file.
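
To illustrate, here is a Python sketch of such a lookup. The format of the mapping file is not specified in this proposal, so the format used here (one manual per line, the manual name followed by the base url, # for comments) is purely hypothetical, as are the function names.

```python
def parse_manual_map(text):
    """Parse lines of the hypothetical form '<manual> <base url>'."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        if len(parts) == 2:
            mapping[parts[0]] = parts[1].strip()
    return mapping


def cross_reference(manual, file_name, target, mapping, default="../"):
    """Build the link, falling back to ../ when the manual is not mapped."""
    base = mapping.get(manual, default)
    if not base.endswith("/"):
        base += "/"
    return base + manual + "/" + file_name + "#" + target


mapping = parse_manual_map("# manual  base\nemacs http://www.example.org/manuals\n")
print(cross_reference("emacs", "buffers.html", "Buffers", mapping))
# http://www.example.org/manuals/emacs/buffers.html#Buffers
print(cross_reference("elisp", "index.html", "Top", mapping))
# ../elisp/index.html#Top
```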

generation of reference targets by the distant software
-------------------------------------------------------

The software generating the distant manual should process all node
and anchor names and generate one file per file name constructed as above.
In the corresponding file each node and anchor should correspond to an
<a> html element with name="target" or an id="target" if there is more
than one node associated with that file. Each file should contain the
node or anchor at that place or redirect to another file (or url) containing
that node.

For the directory name, it is recommended to use the file name given in
@setfilename without .info as the directory name, but the software may
override this. That's because the manual name will be mapped to that
directory name by software generating cross references in local manuals.

In the case of multiple nodes with the same target name, the software should
warn the user, and it is only required that the file leads to one of
these nodes. Thus some nodes may not be reachable (but only when there
are @ commands in node names, leading to the same expansion of the node name,
see above).
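
The check for nodes sharing a target name could be sketched as follows in Python. The input is assumed to be a list of pairs (node name as written in the texinfo, expanded node name), and the function name find_duplicate_targets is made up for the example.

```python
from collections import defaultdict


def find_duplicate_targets(nodes):
    """Map each target name used by several nodes to those node names."""
    by_target = defaultdict(list)
    for original, expanded in nodes:
        by_target[expanded].append(original)
    return {t: names for t, names in by_target.items() if len(names) > 1}


nodes = [("@code{node}", "node"), ("@dfn{node}", "node"), ("Node", "Node")]
for target, names in find_duplicate_targets(nodes).items():
    print("warning: target %s is shared by %s" % (target, ", ".join(names)))
# warning: target node is shared by @code{node}, @dfn{node}
```

'Node' is not reported here: it shares the file node.html with the 2 other nodes, but its target name is different.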




This differs from the current makeinfo behavior for targets because _ is used
instead of % to escape characters. It also differs for file names, as
makeinfo uses a simpler scheme (with characters replaced by - and not by
their code, and . kept). And the @ commands are treated differently than
in makeinfo (in makeinfo the texinfo is transformed to html and the html
is processed). Also makeinfo keeps the case in file names.

Pat



