Re: Unicode string literals

bug-gnulib

From:	Paul Eggert
Subject:	Re: Unicode string literals
Date:	Fri, 1 May 2020 14:22:27 -0700
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.7.0

On 5/1/20 2:01 AM, Bruno Haible wrote:

> Did you mean (1) that the programmer shall define a macro, that indicates that
> their source code is UTF-8 encoded?
> 
> Or did you mean (2) that gnulib shall define a macro, that shall _assume_ that
> the source code is UTF-8 encoded, and then expand to u8"xyz" instead of "xyz"?

Yes, I meant (2).

> For (2): what's the point? Once you assume that the source code is UTF-8
> encoded, ISO C11 section 6.4.5 says that u8"xyz" and "xyz" are the same:
> literals of type 'char *'.

I was thinking about the case where one develops and normally builds on systems
that assume UTF-8 source code (perhaps because the build system is old and just
compiles the bytes unchecked), but where on occasion a builder might translate
all the source code to (say) EUC-JP for whatever reason, and then compile on a
newer platform that supports the u8 prefix.
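
To make option (2) concrete, here is a minimal sketch of what such a macro
might look like. The name UTF8_LITERAL is hypothetical (it is not an existing
gnulib macro), and the C11 version check is an assumed stand-in for whatever
feature test would really be used. The example writes the character as a
universal character name so that it does not itself depend on the encoding of
the source file.

```c
#include <stddef.h>
#include <string.h>

/* UTF8_LITERAL is a hypothetical name used only for illustration;
   it is not an existing gnulib macro.  */
#if defined __STDC_VERSION__ && __STDC_VERSION__ >= 201112L
/* C11 or later: ask the compiler for UTF-8 explicitly, whatever the
   source and execution character sets are.  The cast keeps the result
   a 'const char *' even where u8 literals have an unsigned element type.  */
# define UTF8_LITERAL(s) ((const char *) u8##s)
#else
/* Older compilers: assume the bytes of the literal are already UTF-8.  */
# define UTF8_LITERAL(s) ((const char *) s)
#endif

/* Return the number of bytes used to store U+00E9 (e with acute accent).
   In UTF-8 this character occupies two bytes, 0xC3 0xA9.  */
size_t
e_acute_length (void)
{
  return strlen (UTF8_LITERAL ("\u00e9"));
}
```

Because of the cast, this sketch cannot be used to initialize a char array or
be concatenated with an adjacent string literal; a real macro would have to
decide whether that trade-off is acceptable.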

Admittedly the scenario is unlikely. I suppose we should wait until a real need
arises before worrying about it.

This all reminds me of trigraphs somehow
<https://en.wikipedia.org/wiki/Digraphs_and_trigraphs>. What a pain that was,
and still is.
