Re: [PATCH] Use UTF-8 active code page for Windows host.
From: Eli Zaretskii
Subject: Re: [PATCH] Use UTF-8 active code page for Windows host.
Date: Sun, 19 Mar 2023 19:01:55 +0200
> From: Costas Argyris <costas.argyris@gmail.com>
> Date: Sun, 19 Mar 2023 16:34:54 +0000
> Cc: bug-make@gnu.org, psmith@gnu.org
>
> > OK, but how is the make.exe you produced built?
>
> I actually did what you suggested but was somewhat confused with the
> result. Usually I do this with 'ldd', but both msvcrt.dll and ucrtbase.dll
> show up in 'ldd make.exe' output, and I wasn't sure what to think of it.
>
> However, your approach with objdump gives fewer results and only
> lists msvcrt.dll, not ucrtbase.dll:
>
> C:\Users\cargyris\temp>objdump -p make.exe | grep "DLL Name:"
> DLL Name: ADVAPI32.dll
> DLL Name: KERNEL32.dll
> DLL Name: msvcrt.dll
> DLL Name: USER32.dll
>
> So I guess MSVCRT is enough, i.e. no need for UCRT.
Yes, thanks.
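For anyone who wants to repeat this check in a script, here is a minimal sketch in Python. It only parses text in the format that 'objdump -p' printed above; the function names are made up for illustration, and treating "api-ms-win-crt-*" imports as a sign of UCRT is my assumption about how UCRT-linked binaries usually look, not something established in this thread.

```python
import re

# Sample text copied from the objdump output quoted above.
SAMPLE_OBJDUMP_OUTPUT = """\
DLL Name: ADVAPI32.dll
DLL Name: KERNEL32.dll
DLL Name: msvcrt.dll
DLL Name: USER32.dll
"""


def imported_dlls(objdump_text):
    """Return the names found on 'DLL Name:' lines, in order of appearance."""
    pattern = r"^[ \t]*DLL Name:[ \t]*(\S+)[ \t]*$"
    return re.findall(pattern, objdump_text, flags=re.MULTILINE)


def c_runtime_in_use(dll_names):
    """Guess which C runtime(s) the import list points at (hypothetical helper)."""
    lowered = [name.lower() for name in dll_names]
    found = []
    if "msvcrt.dll" in lowered:
        found.append("MSVCRT")
    # Assumption: UCRT builds import ucrtbase.dll or api-ms-win-crt-* forwarders.
    if any(n == "ucrtbase.dll" or n.startswith("api-ms-win-crt-") for n in lowered):
        found.append("UCRT")
    return found


if __name__ == "__main__":
    dlls = imported_dlls(SAMPLE_OBJDUMP_OUTPUT)
    print(dlls)
    print(c_runtime_in_use(dlls))
```

To run it against a real binary, the text could come from something like subprocess.run(["objdump", "-p", "make.exe"], capture_output=True, text=True).stdout, assuming objdump is on PATH.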
> > If you try using, in a Makefile, file names with non-ASCII
> > characters outside of the current ANSI codepage, does Make succeed in
> > recognizing files mentioned in the Makefile whose letter-case is
> > different from what is seen in the file system?
>
> I think it does, here is the experiment:
>
> C:\Users\cargyris\temp>ls ❎
> src.c
>
> There is only src.c in that folder.
>
> Makefile utf8.mk is UTF-8 encoded and has this content that
> checks for the existence of:
>
> ❎\src.c
> ❎\src.C
> ❎\src.cs
>
> where ❎ is outside the ANSI codepage (1252).
That's not a good experiment, IMO: the only non-ASCII character here
is U+274E, which has no case variants. And the characters whose
letter-case you tried to change are all ASCII, so their case
conversions are unaffected by the locale.
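A better experiment needs a character that is both outside codepage 1252 and has distinct upper-case and lower-case forms. The following Python sketch is one way to pick such characters. It relies on Python's own Unicode case mappings and its cp1252 codec, which need not match what the Windows C runtime does, and the helper names and the candidate list are purely illustrative.

```python
def has_case_variants(ch):
    """True if Unicode defines a different upper- or lower-case form for ch."""
    return ch.upper() != ch or ch.lower() != ch


def in_codepage(ch, codepage="cp1252"):
    """True if ch can be encoded in the given codepage."""
    try:
        ch.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


def is_good_test_char(ch, codepage="cp1252"):
    """A useful test character has case variants and is outside the codepage."""
    return has_case_variants(ch) and not in_codepage(ch, codepage)


if __name__ == "__main__":
    # Candidates: U+274E from the experiment, ASCII 'c', Latin e-acute,
    # Greek small lambda, Cyrillic small zhe.
    for ch in ["\u274e", "c", "\u00e9", "\u03bb", "\u0436"]:
        print(
            f"U+{ord(ch):04X}"
            f" case_variants={has_case_variants(ch)}"
            f" in_cp1252={in_codepage(ch)}"
            f" good_test_char={is_good_test_char(ch)}"
        )
```

Of these candidates, only the Greek and Cyrillic letters pass both tests, so a file name such as one containing U+03BB, checked against its U+039B spelling, would exercise the case in question.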
> If I understand this correctly, both src.c and src.C should be found,
> but not src.cs (just to show a negative case as well).
In addition, I'm not sure Make actually compares file names anywhere;
I think it just calls 'stat', and that is of course case-insensitive
(because the filesystem itself is case-insensitive at the base level).
My guess would be that only characters within the locale, defined by
the ANSI codepage, are supported by locale-aware functions in the C
runtime. That's because this is what happens even if you use "wide"
Unicode APIs and/or functions like _wcsicmp that accept wchar_t
characters: they all support only the characters of the current locale
set by 'setlocale'. I don't expect that to change just because UTF-8
is used on the outside: internally, everything is converted to UTF-16,
i.e. to the Windows flavor of wchar_t.
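To make the conversion step concrete, here is a small Python sketch of the same file name in both encodings. It says nothing about what the Windows runtime does with case; it only shows that UTF-8 bytes on the outside and UTF-16 code units on the inside describe the same characters. The helper name is made up for illustration.

```python
def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


if __name__ == "__main__":
    name = "\u274e\\src.c"             # the file name from the experiment
    utf8_bytes = name.encode("utf-8")  # what a UTF-8 encoded Makefile contains
    units = utf16_code_units(name)     # what a wide-character API would receive

    print(len(utf8_bytes), utf8_bytes.hex(" "))
    print(len(units), [f"{u:04X}" for u in units])

    # The conversion is lossless in both directions.
    assert utf8_bytes.decode("utf-8") == name
```

U+274E takes three bytes in UTF-8 and a single code unit in UTF-16, so the encoding used on the outside does not change which character the locale-aware functions are eventually asked about.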
> > Btw, there's one aspect where Make on MS-Windows will probably fall
> > short of modern Posix systems: the display of non-ASCII characters on
> > the screen.
>
> Indeed, some thoughts on that:
>
> 1) As you know, this is only affecting the visual aspect of the logs, not the
> inner workings of Make. This could confuse users because they would
> be seeing "errors" on the screen, without there being any real errors.
> Perhaps a mention in the doc or release notes could remedy that.
>
> 2) To some extent (maybe even completely, I don't know) this can be
> mitigated with using PowerShell instead of the classic Command Prompt.
> This seems to be working in this case at least:
This could be just sheer luck: PowerShell uses a font that supports
that particular character. The basic problem here is that "Command
Prompt" windows don't allow configuring more than one font for
displaying characters, and a single font can never support more than a
few scripts. If PowerShell doesn't allow more than a single font in
its windows, it will suffer from the same problem.
> If anything, it could be worth a mention in the doc.
Yes, of course.