From: Vincent Lefevre
Subject: Re: with HTML output, @minus{} is converted to a hyphen instead of a real minus character
Date: Thu, 13 Oct 2022 02:25:29 +0200
User-agent: Mutt/2.2.7+47 (8681885b) vl-149028 (2022-10-10)

On 2022-10-12 21:36:04 +0200, Patrice Dumas wrote:
> I remember some time ago, probably when latin1 was the default charset,
> that there could be some entities not formatted.  But it was probably
> some time ago.  Note that we want to output HTML that is ok on old
> browser, so we have to be conservative.  We are outputting entities out
> of the charset for some time, though, so this is probably not a concern
> we should have.

In any case, any browser that can be used to access web pages on the
Internet nowadays supports UTF-8 (the issue is not about the charset
itself, but about all the needed features that were implemented after
UTF-8 support). And I doubt that anyone would use a pre-Unicode
browser just to read manuals in HTML.

> > > Opinions?
> > 
> > I agree that UTF-8 should be the default encoding. Then the behavior
> > with old encodings would be less important.
> I was asking the opinions on using − instead of -, not on UTF-8
> being the default encoding, which is effective now...

OK. The question makes sense only when the output is in UTF-8, but
in such a case, I doubt that this really matters. Of course, if
UTF-8 is used, you need to declare it in the HTML file. Then the
only issue could be if the declared encoding is ignored. But I'm not
aware of any issue related to that.
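
To illustrate the point about declaring the encoding, here is a minimal
Python sketch (not texi2any's actual code; the helper name render is
hypothetical). It writes a real minus character (U+2212) into a small page,
declares the charset in the page itself, and falls back to a numeric
character reference when the charset cannot represent the character:

```python
MINUS = "\u2212"  # U+2212 MINUS SIGN, as opposed to the ASCII hyphen-minus

def render(charset):
    """Build a tiny page that declares `charset` and encode it accordingly.

    Characters outside the charset become numeric character references.
    """
    page = f'<meta charset="{charset}"><p>{MINUS}1</p>'
    return page.encode(charset, errors="xmlcharrefreplace")

utf8_bytes = render("utf-8")         # minus stored as the bytes E2 88 92
latin1_bytes = render("iso-8859-1")  # minus stored as &#8722;

# What a reader would see if the UTF-8 declaration were ignored and the
# bytes were decoded as Latin-1 instead: the minus sign is lost.
garbled = utf8_bytes.decode("iso-8859-1")
```

With UTF-8 declared and honored, the character is decoded as intended; the
last line shows the only failure mode mentioned above, where the declared
encoding is ignored.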

Named character references such as "&minus;" should be avoided in XHTML
files (in case they are supported as output) because the entity
definitions are not necessarily loaded by XML parsers.
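
A small Python sketch of this concern, using the standard library's
xml.etree.ElementTree as a stand-in for an XML parser that does not load
the XHTML DTD (an assumption; other parsers may be configured differently).
The helper name parse_text is hypothetical:

```python
import xml.etree.ElementTree as ET

def parse_text(fragment):
    """Return the text of a one-element XML fragment, or None if parsing fails."""
    try:
        return ET.fromstring(fragment).text
    except ET.ParseError:
        # Raised, among other cases, for an entity that was never declared.
        return None

named = parse_text("<p>&minus;1</p>")     # named reference, needs the DTD
numeric = parse_text("<p>&#8722;1</p>")   # numeric reference, always defined
literal = parse_text("<p>\u22121</p>")    # the character itself
```

In this setup only the named reference fails; the numeric reference and the
literal character both yield U+2212 followed by "1".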

On 2022-10-12 22:10:31 +0200, Patrice Dumas wrote:
> On Wed, Oct 12, 2022 at 01:59:12PM +0200, Vincent Lefevre wrote:
> > 
> > Is it? I cannot see any change in the NEWS file in master.
> It was actually a change introduced in the 6.7 release.

6.7 mentions input encoding, not output encoding:

  . UTF-8 is the default input encoding

Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
