27 Nov 2014
RE: HTML output for @itemize and @enumerate commands
VERSION: makeinfo 5.2 (built from source on Fedora 20 x86_64)
BUG: I am not certain whether these are bugs or enhancement requests,
so you decide.
Following up on the previous discussion.
The HTML output for both @itemize and @enumerate is rudimentary.
Not only is some of the Texinfo source formatting lost, but the generated
markup also does not take advantage of the available HTML constructs.
These
problems are deep, and not easy to address:
@itemize
(<ul>)
-
@itemize correctly takes
advantage of the HTML defaults for:
@itemize (no argument)
@itemize @bullet
-
For arguments other than @bullet, the generated HTML looks like:
<ul class="no-bullet"><li> ... </li> ... </ul>
-
First, this means that
so long as the no-bullet class is defined, the browser
will render the list without bullets.
-
Second, the bullet
character (whatever was specified in the source) is
embedded inside the line item. I fully understand
this design decision, and I probably would have
done it the same way if I were in a hurry.
-
Third, because the
bullet character is embedded in the line item, second
and subsequent lines of the item appear to be vertically
mis-aligned.
-
For '@itemize @minus' lists, the info output uses the Unicode minus (U+2212)
for the bullet, while the HTML inserts a plain hyphen-minus '-' (U+002D).
I think it would be better to be consistent and output: &minus; (U+2212)
-
For '@itemize @w{}'
lists, the 'info' output generates a space character where
the bullet would have been and does not generate an extra
embedded space. This seems to be the correct
implementation.
-
Enhancement Possibility
I feel that full support for the HTML bullet types: [disc
| circle | square | none] is both necessary and
convenient. We could either hard-code a style in the
converter, OR reference a class definition for each type.
In order to maintain flexibility and to be consistent with
the current converter design, I recommend the class
callout. The actual names for the new classes are up to
you, but the following are the names used in the current
version of the CSS definition file. Parsing logic (most
likely first):
-
if ( argument == none
|| argument == @bullet )
generate: <ul class="disc-bullet"> ...
</ul> OR <ul> ... </ul>
(note: the default bullet type is 'disc')
-
else if ( argument ==
@w{} || (TABLE OF CONTENTS) )
generate: <ul class="no-bullet"> ... </ul>
-
else if ( argument ==
@textdegree || argument == @BCIRCLE(U+26AC) )
generate: <ul class="circle-bullet"> ...
</ul>
-
else
generate: <ul class="square-bullet"> ...
</ul>
(note: for bullet characters not supported by HTML,
default to the third type of HTML bullet)
-
Note that if you decide to hard-code the bullet style, you should use:
<ul style="list-style-type:xxx;"> because the <ul type="xxx"> construct is
deprecated in HTML4 and not supported by HTML5.
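The @itemize parsing logic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not converter code: the class names are the ones proposed in this message, while the function and table names are made up for the example. The table-of-contents case is left out because it is not clear how the converter would signal it, and '@BCIRCLE' is treated as the literal U+26AC character.

```python
# Hypothetical sketch: class names come from the proposal above; the
# function and table names are illustrative and not part of makeinfo.

# CSS list-style-type that each proposed class would presumably map to.
BULLET_CLASS_STYLES = {
    "disc-bullet": "disc",
    "no-bullet": "none",
    "circle-bullet": "circle",
    "square-bullet": "square",
}


def itemize_class(argument=None):
    """Return the proposed <ul> class for an @itemize argument.

    `argument` is the raw text after @itemize, or None when it is absent.
    """
    arg = (argument or "").strip()
    if arg in ("", "@bullet"):
        return "disc-bullet"        # HTML default bullet
    if arg == "@w{}":
        return "no-bullet"
    if arg in ("@textdegree", "\u26ac"):
        return "circle-bullet"
    # Any other mark has no HTML equivalent, so fall back to square.
    return "square-bullet"


def ul_open_tag(argument=None, hard_coded=False):
    """Build the opening <ul> tag as a class callout or an inline style."""
    cls = itemize_class(argument)
    if hard_coded:
        return '<ul style="list-style-type:%s;">' % BULLET_CLASS_STYLES[cls]
    return '<ul class="%s">' % cls


def css_rules():
    """Emit one CSS rule per proposed class (assumed stylesheet content)."""
    return "\n".join(
        "ul.%s { list-style-type: %s; }" % (cls, style)
        for cls, style in BULLET_CLASS_STYLES.items()
    )
```

With the class callout, ul_open_tag("@minus") gives <ul class="square-bullet">, and the look of the list stays in the hands of whoever writes the stylesheet. The hard_coded=True variant fixes the look in the markup instead.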
@enumerate
(<ol>)
-
'@enumerate' correctly
takes advantage of the HTML defaults for decimal (1, 2, 3,
...)
@enumerate (no argument)
@enumerate 1
-
For non-decimal
enumerators, the enumerator specified in the source is
lost.
-
HTML supports several enumeration types, but not all of them have Texinfo
equivalents.
-
I think it's important to
directly support at least the following in the converter:
-
@enumerate (default
<ol> is ok)
-
@enumerate 1 (default
<ol> is ok)
-
@enumerate A
class callout: <ol class="enum-upper-alpha">
hard-coded: <ol style="list-style-type:upper-alpha;">
-
@enumerate a
class callout: <ol class="enum-lower-alpha">
hard-coded: <ol style="list-style-type:lower-alpha;">
-
Additional enumeration
types that would be desirable:
-
My idea for supporting additional enumeration types in info and HTML output would look like this:
@enumerate @xxx{n} where 'xxx' is the name of the enumeration type,
and the optional 'n' would specify the starting value.
HTML expects a decimal start for all types, i.e.
<ol style="list-style-type:lower-roman;" start="4"> yields: iv.
Here are the types I would recommend:
-
@enumerate
@loweralpha{n}
-
@enumerate
@upperalpha{n}
-
@enumerate
@lowerroman{n}
-
@enumerate
@upperroman{n}
-
@enumerate
@lowergreek{n}
-
@enumerate
@enum_decimal{n} (for completeness)
-
@enumerate
@enum_none (this could
be handled by @itemize instead)
-
HTML supports additional types: decimal-leading-zero, lower-latin,
upper-latin, armenian, georgian. These may be too
much, but I have no data on how often these are used
in the real world.
-
The currently available Texinfo @enumerate syntax would remain unchanged,
but '@enumerate a' and '@enumerate A' would generate HTML as above.
The rendering of HTML without styling would therefore be unchanged, because
the class names would be undefined.
-
Enumeration
that begins at an arbitrary point in the sequence would
be difficult to encode in the HTML unless you hard-code
the style OR pass in a variable (which I'm not sure is
possible). For instance '@enumerate 7' is allowed in the
info output, but how would you pass the start value
through the HTML converter?
-
Note that if you decide to hard-code the enumeration type, you should use:
<ol style="list-style-type:xxx;"> OR
<ol style="list-style-type:xxx;" start="n"> (for starting mid-sequence)
-
Parsing logic (most
likely first)
-
if ( argument ==
(DECIMAL NUMBER) || argument == (NONE) )
info: 1, 2, 3, 4, 5, ... (or start at specified point)
HTML: <ol> ... </ol>
-
else if ( argument
>= 'a' && argument <= 'z' || argument ==
@loweralpha )
info: 'a'-'z' as currently implemented, @loweralpha as
if it were 'a',
or @loweralpha{n} where 'n' is the start point
HTML: <ol class="enum-lower-alpha"> ...
</ol>
-
else if ( argument
>= 'A' && argument <= 'Z' || argument == @upperalpha )
info: 'A'-'Z' as currently implemented, @upperalpha as
if it were 'A'
or @upperalpha{n} where 'n' is the start point
HTML: <ol class="enum-upper-alpha"> ...
</ol>
-
else if ( argument ==
@lowerroman )
info: i, ii, iii, iv, v, ...
or @lowerroman{n} where 'n' is the start point
HTML: <ol class="enum-lower-roman"> ...
</ol>
-
else if ( argument ==
@upperroman )
info: I, II, III, IV, V, ...
or @upperroman{n} where 'n' is the start point
HTML: <ol class="enum-upper-roman"> ...
</ol>
-
else if ( argument ==
@lowergreek )
info: α, β, γ, δ, ε, ...
or @lowergreek{n} where 'n' is the start point
HTML: <ol class="enum-lower-greek"> ...
</ol>
-
else // (default to
decimal)
info: 1, 2, 3, 4, 5, ...
HTML: <ol> ... </ol>
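The @enumerate parsing logic above, together with the start-value question, can be sketched as follows. This is a rough illustration only: the @loweralpha{n}-style commands are the ones proposed in this message and do not exist in Texinfo today, and the function and table names are hypothetical. Unlike the outline above, the sketch also emits the HTML start attribute for '@enumerate 7' and for letters other than 'a' or 'A', which is an assumption about how the start point might be passed through.

```python
import re

# Hypothetical sketch: the @loweralpha{n}-style commands are the proposal
# from this message, not existing Texinfo commands.

# Proposed command name -> CSS list-style-type (None means plain decimal).
ENUM_COMMANDS = {
    "loweralpha": "lower-alpha",
    "upperalpha": "upper-alpha",
    "lowerroman": "lower-roman",
    "upperroman": "upper-roman",
    "lowergreek": "lower-greek",
    "enum_decimal": None,
}

# Matches "@name" or "@name{n}", where n is an optional decimal start value.
_COMMAND_RE = re.compile(r"@(\w+)(?:\{(\d*)\})?$")


def ol_open_tag(argument=None, hard_coded=False):
    """Build the opening <ol> tag for an @enumerate argument."""
    arg = (argument or "").strip()
    kind, start = None, 1           # kind None means default decimal

    match = _COMMAND_RE.match(arg)
    if arg.isdecimal():
        start = int(arg)
    elif len(arg) == 1 and "a" <= arg <= "z":
        kind, start = "lower-alpha", ord(arg) - ord("a") + 1
    elif len(arg) == 1 and "A" <= arg <= "Z":
        kind, start = "upper-alpha", ord(arg) - ord("A") + 1
    elif match and match.group(1) in ENUM_COMMANDS:
        kind = ENUM_COMMANDS[match.group(1)]
        if match.group(2):
            start = int(match.group(2))
    # Anything else falls through to the decimal default.

    attrs = ""
    if kind and hard_coded:
        attrs += ' style="list-style-type:%s;"' % kind
    elif kind:
        attrs += ' class="enum-%s"' % kind
    if start != 1:
        # HTML expects a decimal start value for every enumeration type.
        attrs += ' start="%d"' % start
    return "<ol%s>" % attrs
```

For example, ol_open_tag("@lowerroman{4}") gives <ol class="enum-lower-roman" start="4">, and ol_open_tag("7") gives <ol start="7">, so the start point does not require hard-coding the style.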
All of the above is of course just a suggestion, but some of it seems
necessary and/or highly desirable for the future of @itemize and
@enumerate lists.
PS:
I will post an updated version of the CSS definition file to
my
website this weekend.
Cheers,
Mahlon