texinfo-commits

[no subject]


From: Patrice Dumas
Date: Sun, 14 Jan 2024 18:23:52 -0500 (EST)

branch: master
commit 3fd512eec08a49fc853d0dbb31b8e02e3016758e
Author: Patrice Dumas <pertusus@free.fr>
AuthorDate: Mon Jan 15 00:23:53 2024 +0100

    * tp/Texinfo/XS/parsetexi/end_line.c (end_line_misc_line): map gb2312
    to euc-cn to get the same output as with perl Encode mime_name.  Note
    that this mapping looks wrong, as GB2312 seems to be the preferred
    mime name in IANA encoding registry
    https://www.iana.org/assignments/character-sets/character-sets.xhtml
    but we still do it to match with the Perl output.
---
 ChangeLog                          | 9 +++++++++
 tp/Texinfo/XS/parsetexi/end_line.c | 9 +++++++++
 2 files changed, 18 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index a84729c96e..f2eb2a718f 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@
+2024-01-14  Patrice Dumas  <pertusus@free.fr>
+
+       * tp/Texinfo/XS/parsetexi/end_line.c (end_line_misc_line): map gb2312
+       to euc-cn to get the same output as with perl Encode mime_name.  Note
+       that this mapping looks wrong, as GB2312 seems to be the preferred
+       mime name in IANA encoding registry
+       https://www.iana.org/assignments/character-sets/character-sets.xhtml
+       but we still do it to match with the Perl output.
+
 2024-01-14  Patrice Dumas  <pertusus@free.fr>
 
        * contrib/mass_test/check_perlVSC.sh: do only the manual given in
diff --git a/tp/Texinfo/XS/parsetexi/end_line.c b/tp/Texinfo/XS/parsetexi/end_line.c
index ed61982df7..1361e6b92d 100644
--- a/tp/Texinfo/XS/parsetexi/end_line.c
+++ b/tp/Texinfo/XS/parsetexi/end_line.c
@@ -1378,6 +1378,15 @@ end_line_misc_line (ELEMENT *current)
                           "iso-8859-15", "iso-8859-15",
                           "koi8-r",      "koi8-r",
                           "koi8-u",      "koi8-u",
+             /* For some reason Encode mime_name() for GB2312, a simplified
+                chinese character set encoded as EUC-CN is EUC-CN, while in the
+                IANA character sets assignments, there is no EUC-CN and
+                the Preferred MIME Name of GB2312 is GB2312, see:
+      https://www.iana.org/assignments/character-sets/character-sets.xhtml
+                   Set it the same as Perl here, even though it looks wrong,
+                   just to have the same output.
+                    */
+                          "gb2312",      "euc-cn",
                     };
                     for (i = 0; i < sizeof map / sizeof *map; i++)
                       {
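
For readers who want to see the effect of the added entry outside the parser, here is a minimal standalone sketch of a from/to encoding-name table lookup. It is not the code in end_line.c: the struct encoding_pair type, the names_equal helper, the map_encoding_name function, and the ASCII case-insensitive comparison are all assumptions made for illustration. Only the four name pairs are taken from the hunk above.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical pair type; the real table in end_line.c may be laid out
   differently. */
struct encoding_pair {
  const char *from;   /* name as given in the document, lowercase */
  const char *to;     /* name to report in the output */
};

static const struct encoding_pair encoding_map[] = {
  { "iso-8859-15", "iso-8859-15" },
  { "koi8-r",      "koi8-r" },
  { "koi8-u",      "koi8-u" },
  /* Follows Perl Encode mime_name(), not the IANA Preferred MIME Name. */
  { "gb2312",      "euc-cn" },
};

/* ASCII case-insensitive string equality (an assumption of this sketch). */
static int
names_equal (const char *a, const char *b)
{
  while (*a && *b)
    {
      if (tolower ((unsigned char) *a) != tolower ((unsigned char) *b))
        return 0;
      a++;
      b++;
    }
  return *a == *b;
}

/* Return the mapped name, or NULL if the name is not in the table. */
const char *
map_encoding_name (const char *name)
{
  size_t i;

  for (i = 0; i < sizeof encoding_map / sizeof *encoding_map; i++)
    if (names_equal (name, encoding_map[i].from))
      return encoding_map[i].to;
  return NULL;
}
```

With this table, map_encoding_name ("gb2312") returns "euc-cn", the same name Perl Encode reports, while a name absent from the table yields NULL so the caller can decide how to handle it.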


