bug-ncurses

UTF-8 multi-byte characters are not displayed properly on Windows consoles


From: LIU Hao
Subject: UTF-8 multi-byte characters are not displayed properly on Windows consoles
Date: Thu, 12 Jan 2023 15:30:20 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2

Hello folks,

I'm a mingw-w64 developer and an MSYS2 contributor, and I maintain a GNU nano port to Windows [1]. First of all, thank you for the great work!

Since Windows 10, the Windows console has supported UTF-8, although this has to be enabled explicitly in the system control panel. Once UTF-8 support has been enabled and the UTF-8 code page has been selected with the `chcp 65001` command, all standard C ctype functions can work on UTF-8 strings.

However, when GNU nano attempts to display a UTF-8 string, the string is handled byte by byte and comes out as gibberish. I have created this test case as an example:

   ```
   #include <ncursesw/ncurses.h>

   int
   main(void)
     {
       initscr();
       addstr("»·");  // hex: C2 BB C2 B7
       refresh();
       getch();
     }
   ```

The string literal, whose bytes are listed in the comment, encodes two characters as four bytes. On Linux it is displayed properly, but on a Windows UTF-8 console I get `»·`. How should I fix it?


[1] https://github.com/lhmouse/nano-win


--
Best regards,
LIU Hao
