bug-gnulib

Re: [PATCH 2/2] regex: fix ignore-case Turkish bug


From: Bruno Haible
Subject: Re: [PATCH 2/2] regex: fix ignore-case Turkish bug
Date: Sat, 26 Sep 2020 15:57:55 +0200
User-agent: KMail/5.1.3 (Linux/4.4.0-189-generic; KDE/5.18.0; x86_64; ; )

Paul Eggert wrote:
> +  if (setlocale (LC_ALL, "tr_TR.UTF-8") && really_utf8 ())
> +    {
> +      re_set_syntax (RE_SYNTAX_GREP | RE_ICASE);
> +      if (re_compile_pattern ("i", 1, &regex))
> +        result |= 1;
> +      else
> +        {
> +          /* UTF-8 encoding of U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE.
> +             In Turkish, this is the upper-case equivalent of ASCII "i".
> +             Older versions of Gnulib failed to match "i" to U+0130 when
> +             ignoring case in Turkish <https://bugs.gnu.org/43577>.  */
> +          static char const data[] = "\xc4\xb0";
> +
> +          memset (&regs, 0, sizeof regs);
> +          if (re_search (&regex, data, sizeof data - 1, 0, sizeof data - 1,
> +                         &regs))
> +            result |= 1;
> +          regfree (&regex);
> +          free (regs.start);
> +          free (regs.end);
> +
> +          if (! setlocale (LC_ALL, "C"))
> +            return 1;
> +        }
> +    }

In this test code, the first setlocale() call can succeed while the second
one is never invoked, namely when really_utf8 () returns false or when
re_compile_pattern fails. The following tests would then run in the Turkish
locale, which may lead to very confusing failure reports.
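
To illustrate the pattern, here is a minimal sketch, not the code from the
patch. It assumes a hypothetical helper run_in_locale () and two placeholder
checks; none of these names exist in tests/test-regex.c. The point is only
that the revert to "C" is reached on every path once the first setlocale()
call has succeeded.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only.
   Run CHECK in locale NAME if that locale is available, and revert to
   the "C" locale afterwards regardless of what CHECK returned.
   Return 0 if the check passed or was skipped, with bit 1 set if the
   check failed and bit 2 set if the locale could not be reverted.  */
static int
run_in_locale (char const *name, int (*check) (void))
{
  int result = 0;

  assert (name != NULL && check != NULL);

  if (setlocale (LC_ALL, name))
    {
      if (check () != 0)
        result |= 1;

      /* Not nested inside any success branch of the check above.  */
      if (! setlocale (LC_ALL, "C"))
        result |= 2;
    }

  return result;
}

/* Placeholder checks standing in for the real regex tests.  */
static int sample_pass (void) { return 0; }
static int sample_fail (void) { return 1; }
```

A caller could then write something like
result |= run_in_locale ("tr_TR.UTF-8", check_turkish_i);
where check_turkish_i is again a hypothetical name. If bit 2 is set, the
caller should stop rather than run later tests in the wrong locale.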


2020-09-26  Bruno Haible  <bruno@clisp.org>

        regex-tests: Make test more robust.
        * tests/test-regex.c (main): Make sure to revert the locale to "C" after
        the test in "tr_TR.UTF-8" locale. Exit if we can't revert it.

(diff -w)
diff --git a/tests/test-regex.c b/tests/test-regex.c
index a54f643..3a3d8f1 100644
--- a/tests/test-regex.c
+++ b/tests/test-regex.c
@@ -139,11 +139,15 @@ main (void)
         }
 
       if (! setlocale (LC_ALL, "C"))
-        return 1;
+        {
+          report_error ("setlocale \"C\" failed");
+          return exit_status;
+        }
     }
 
-  if (setlocale (LC_ALL, "tr_TR.UTF-8") && really_utf8 ()
-      && towupper (L'i') == 0x0130 /* U+0130; see below.  */)
+  if (setlocale (LC_ALL, "tr_TR.UTF-8"))
+    {
+      if (really_utf8 () && towupper (L'i') == 0x0130 /* U+0130; see below.  */)
         {
           re_set_syntax (RE_SYNTAX_GREP | RE_ICASE);
           memset (&regex, 0, sizeof regex);
@@ -168,9 +172,13 @@ main (void)
               regfree (&regex);
               free (regs.start);
               free (regs.end);
+            }
+        }
 
       if (! setlocale (LC_ALL, "C"))
+        {
           report_error ("setlocale \"C\" failed");
+          return exit_status;
         }
     }
 



