Re: @tieaccent{..} does not display the tie accent in HTML

From: Per Bothner
Subject: Re: @tieaccent{..} does not display the tie accent in HTML
Date: Mon, 30 Aug 2021 10:00:51 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0

On 8/27/21 1:16 PM, Patrice Dumas wrote:
> I think that the HTML produced with named entities is much more legible,
> and probably more portable.

It may be more legible, but I'm pretty sure it's not more portable.
In fact, it is probably less portable, unless you restrict yourself
to a very minimal set (maybe those from HTML 3.2?), since there may
be version issues - and then you have to be careful which ones
you use and how portable they are.  In which case, what is the point?

The named entities are more mnemonic of course, but looking up the
meaning may well be easier with a hex code than with a name.

And that is just when it comes to strict HTML, not XML/XHTML.
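
A minimal Python sketch of the XML side of this point. It assumes the
standard-library xml.etree.ElementTree parser as a stand-in for a generic
XML tool; the helper name parses_as_xml is made up for illustration.
Without a DTD, an XML parser knows only the five predefined entities, so
an HTML named entity is rejected while a numeric reference is accepted:

```python
import xml.etree.ElementTree as ET


def parses_as_xml(fragment: str) -> bool:
    """Return True if ElementTree accepts the fragment as well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# HTML named entity: not one of XML's predefined entities, and no DTD is given.
named = "<p>&copy; 2021</p>"
# Numeric character reference: valid in both HTML and XML.
numeric = "<p>&#xA9; 2021</p>"

named_ok = parses_as_xml(named)
numeric_ok = parses_as_xml(numeric)
print("named entity parses as XML:  ", named_ok)
print("numeric entity parses as XML:", numeric_ok)
```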

> However, having a customization variable to
> output only numerical entities would be ok to me, maybe something like


I think an "XML_COMPATIBLE" variable would be more valuable.
In addition to using numeric entities, it would guarantee that all tags
are closed.  E.g. instead of <br> it would emit <br/> - which also works
with most (all?) HTML parsers.  It could possibly cover other issues too.
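
A rough Python sketch of the <br> versus <br/> difference. The assumptions
are that html.parser.HTMLParser stands in for a lenient HTML parser and
xml.etree.ElementTree for an XML tool. The names TagCollector, html_tags
and is_well_formed_xml are hypothetical helpers, not part of texi2any:

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record the name of every start tag the HTML parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # Also reached for <br/>: the default handle_startendtag
        # forwards to handle_starttag and handle_endtag.
        self.tags.append(tag)


def html_tags(markup: str) -> list:
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def is_well_formed_xml(markup: str) -> bool:
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


html_style = "<p>one<br>two</p>"   # unclosed void element
xml_style = "<p>one<br/>two</p>"   # self-closed form

for markup in (html_style, xml_style):
    print(markup, html_tags(markup), is_well_formed_xml(markup))
```

In this sketch the HTML parser reports the same tags for both forms, while
the XML parser accepts only the self-closed one.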

What I'm looking for is:
(1) Be able to post-process HTML output with XML tools, such as XSLT.
(2) Generate valid EPUB 3 ebooks.

One might want more fine-grained control: Should an <?xml?> declaration
be emitted?  What doctype to emit?  What file extension to emit?
However, that level of control is less important as long as the
two goals above are met.

>> However, when it comes to decimal or hex numerical entities I think
>> hex is preferable, as that is much more common for Unicode values.
>> I.e. &#xA9; rather than &#169; for ©.

> I have no precise idea on that, but the change should only be done if

It's not "needed" - it's just that hex values are used almost universally
for Unicode, and decimal values are rarely used.
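
A small Python sketch of why the hex form lines up with Unicode notation.
The helper names hex_ref, dec_ref and code_point_label are invented for
this example. The hex reference uses the same digits as the usual U+XXXX
label, while the decimal form has to be converted first:

```python
import html


def hex_ref(ch: str) -> str:
    """Hexadecimal numeric character reference, e.g. &#xA9;"""
    return f"&#x{ord(ch):X};"


def dec_ref(ch: str) -> str:
    """Decimal numeric character reference, e.g. &#169;"""
    return f"&#{ord(ch)};"


def code_point_label(ch: str) -> str:
    """Conventional Unicode label, e.g. U+00A9"""
    return f"U+{ord(ch):04X}"


copyright_sign = "\u00a9"  # the © character

print(code_point_label(copyright_sign))
print(hex_ref(copyright_sign))
print(dec_ref(copyright_sign))

# Both forms decode to the same character.
print(html.unescape(hex_ref(copyright_sign)) == html.unescape(dec_ref(copyright_sign)))
```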
        --Per Bothner
per@bothner.com   http://per.bothner.com/
