bug-gnulib

Re: [striconveh] Error handling and Unicode replacement character


From: Bruno Haible
Subject: Re: [striconveh] Error handling and Unicode replacement character
Date: Sun, 02 Jan 2022 19:58:57 +0100

> 2022-01-01  Bruno Haible  <bruno@clisp.org>
> 
>       striconveh: Support an error handler that produces a Unicode U+FFFD.

The unit test fails on musl libc. This patch fixes it.


2022-01-02  Bruno Haible  <bruno@clisp.org>

        striconveh: Make the last change also work on musl libc.
        * lib/striconveh.c (mem_cd_iconveh_internal): Make the U+FFFD conversion
        also work with non-GNU iconv() implementations.

diff --git a/lib/striconveh.c b/lib/striconveh.c
index 612c38c3e..736482842 100644
--- a/lib/striconveh.c
+++ b/lib/striconveh.c
@@ -847,14 +847,27 @@ mem_cd_iconveh_internal (const char *src, size_t srclen,
                         insize = scratchlen;
                         if (cd2 != (iconv_t)(-1))
                           {
+                            char *out2ptr_try = out2ptr;
+                            size_t out2size_try = out2size;
                             res = iconv (cd2,
                                          (ICONV_CONST char **) &inptr, &insize,
-                                         &out2ptr, &out2size);
+                                         &out2ptr_try, &out2size_try);
                             if (handler == iconveh_replacement_character
-                                && res == (size_t)(-1) && errno == EILSEQ)
+                                && ((res == (size_t)(-1) && errno == EILSEQ)
+                                    /* FreeBSD iconv(), NetBSD iconv(), and
+                                       Solaris 11 iconv() insert a '?' if they
+                                       cannot convert.  This is what we want.
+                                       But IRIX iconv() inserts a NUL byte if it
+                                       cannot convert.
+                                       And musl libc iconv() inserts a '*' if it
+                                       cannot convert.  */
+                                    || (res > 0
+                                        && !(out2ptr_try - out2ptr == 1
+                                             && *out2ptr == '?'))))
                               {
-                                 /* U+FFFD can't be converted to TO_CODESET.
-                                    Use '?' instead.  */
+                                /* The iconv() call failed.
+                                   U+FFFD can't be converted to TO_CODESET.
+                                   Use '?' instead.  */
                                 scratchbuf[0] = '?';
                                 scratchlen = 1;
                                 inptr = scratchbuf;
@@ -863,6 +876,12 @@ mem_cd_iconveh_internal (const char *src, size_t srclen,
                                             (ICONV_CONST char **) &inptr, &insize,
                                              &out2ptr, &out2size);
                               }
+                            else
+                              {
+                                /* Accept the results of the iconv() call.  */
+                                out2ptr = out2ptr_try;
+                                out2size = out2size_try;
+                              }
                           }
                         else
                           {
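
For readers who want to try the idea outside of striconveh.c, here is a
minimal standalone sketch of the same "try, then commit or fall back"
pattern. It is not gnulib code: put_replacement is a hypothetical helper
name, the input is assumed to be U+FFFD encoded as UTF-8, and iconv() is
assumed to take a plain `char **` input argument (as on glibc and musl),
so no ICONV_CONST cast is used. Unlike the patch, the sketch treats an
iconv() failure other than EILSEQ as an error instead of retrying.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper, not part of gnulib.
   Convert U+FFFD (given as UTF-8) through CD into OUT.  If the converter
   rejects it with EILSEQ, or reports an irreversible substitution that is
   not exactly one '?', discard that attempt and convert '?' instead.
   Return the number of bytes written, or (size_t)(-1) on failure.  */
static size_t
put_replacement (iconv_t cd, char *out, size_t outsize)
{
  char fffd[] = "\xEF\xBF\xBD";   /* U+FFFD encoded as UTF-8 */
  char qmark[] = "?";
  char *inptr = fffd;
  size_t insize = 3;

  /* Work on copies so that a rejected attempt does not move the
     caller-visible output position.  */
  char *out_try = out;
  size_t outsize_try = outsize;

  size_t res = iconv (cd, &inptr, &insize, &out_try, &outsize_try);

  if ((res == (size_t)(-1) && errno == EILSEQ)
      /* A positive count means a nonreversible substitution happened.
         Accept it only if the substitute is a single '?'.  */
      || (res != (size_t)(-1) && res > 0
          && !(out_try - out == 1 && *out == '?')))
    {
      /* Restart from the original output position with '?'.  */
      inptr = qmark;
      insize = 1;
      out_try = out;
      outsize_try = outsize;
      res = iconv (cd, &inptr, &insize, &out_try, &outsize_try);
    }

  if (res == (size_t)(-1))
    return (size_t)(-1);

  /* Commit: report how far the accepted attempt advanced.  */
  return (size_t) (out_try - out);
}
```

With a descriptor from iconv_open ("ASCII", "UTF-8"), the helper should
yield a single '?' both on implementations that fail with EILSEQ and on
those that substitute another character such as '*'. With a target
encoding that can represent U+FFFD, the first attempt is kept as is.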
