bug-gnulib

Re: Unicode string literals


From: Paul Eggert
Subject: Re: Unicode string literals
Date: Fri, 1 May 2020 14:22:27 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.7.0

On 5/1/20 2:01 AM, Bruno Haible wrote:

> Did you mean (1) that the programmer shall define a macro, that indicates that
> their source code is UTF-8 encoded?
> 
> Or did you mean (2) that gnulib shall define a macro, that shall _assume_ that
> the source code is UTF-8 encoded, and then expand to u8"xyz" instead of "xyz"?

Yes, I meant (2).

> For (2): what's the point? Once you assume that the source code is UTF-8
> encoded, ISO C11 section 6.4.5 says that u8"xyz" and "xyz" are the same:
> literals of type 'char *'.

I was thinking of the case where one develops and normally builds on systems
that assume UTF-8 source code (perhaps because the build system is old and just
compiles the bytes unchecked), but where a builder might occasionally translate
all the source code to (say) EUC-JP for whatever reason, and then compile on a
newer platform that supports the u8 prefix.
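
For concreteness, here is a minimal sketch of what option (2) might look like.
The macro name UTF8_LITERAL and the helper e_acute_is_utf8 are hypothetical,
invented for this illustration; they are not existing gnulib identifiers. The
sketch assumes that a compiler advertising C11 or later accepts the u8 prefix,
and it relies on the rule that an unprefixed string literal adjacent to a
u8-prefixed one is treated as having the same prefix.

```c
#include <string.h>

/* Hypothetical macro, not part of gnulib: request UTF-8 encoding for a
   string literal when the compiler supports the u8 prefix, and fall back
   to a plain literal otherwise.  */
#if defined __STDC_VERSION__ && __STDC_VERSION__ >= 201112L
/* u8"" followed by S concatenates into a single u8-prefixed literal.  */
# define UTF8_LITERAL(s) u8"" s
#else
/* Older compilers: the bytes are whatever the compiler emits for S.  */
# define UTF8_LITERAL(s) s
#endif

/* Hypothetical helper: return 1 if the literal for U+00E9 was encoded as
   the two UTF-8 bytes C3 A9 plus the terminating NUL, 0 otherwise.  */
static int
e_acute_is_utf8 (void)
{
  static const unsigned char expected[] = { 0xC3, 0xA9, 0x00 };
  return (sizeof UTF8_LITERAL ("é") == sizeof expected
          && memcmp (UTF8_LITERAL ("é"), expected, sizeof expected) == 0);
}
```

With the u8 branch, the compiler has to produce UTF-8 bytes even after the
source file has been converted to another encoding, provided the compiler is
told the new source encoding. With the fallback branch, the result depends on
the compiler's execution character set.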

Admittedly the scenario is unlikely. I suppose we should wait until a real need
arises before worrying about it.

This all reminds me of trigraphs somehow
<https://en.wikipedia.org/wiki/Digraphs_and_trigraphs>. What a pain they were,
and still are.


