bug-gnulib

Re: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters


From: Bruno Haible
Subject: Re: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters
Date: Sun, 11 Jul 2010 15:38:05 +0200
User-agent: KMail/1.9.9

Hi Pádraig,

> +2010-07-07  Pádraig Brady  <address@hidden>
> +
> +       * lib/unistr/u8-strchr.c (u8_strchr): Use strchr() as it's faster

Thanks for the patch. I've applied it as below, with minor changes:
  - Keep around the unoptimized code, for clarity.
  - Add the rationale for the change to the comments, not to the ChangeLog
    entry or git commit message.
  - In the other comment, mention strstr, not memmem, since - as you noticed
    yourself - memmem does not have the appropriate asymptotic behaviour for
    long haystack strings.
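
For readers following along, here is a minimal sketch of why the two code
paths are interchangeable for an ASCII argument. In UTF-8, bytes below 0x80
never occur inside a multibyte sequence, so a byte-wise search cannot produce
a false match. The function names are hypothetical and are not part of
gnulib; only strchr is a real library call.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: byte-by-byte search, mirroring the unoptimized loop.
   Checks for a match before checking for the terminator, so searching for
   0 returns a pointer to the terminating NUL.  */
static uint8_t *
ascii_strchr_loop (const uint8_t *s, uint8_t c0)
{
  for (;; s++)
    {
      if (*s == c0)
        return (uint8_t *) s;
      if (*s == 0)
        return NULL;
    }
}

/* Hypothetical helper: same result, delegating to the C library's strchr.
   strchr also treats the terminating NUL as part of the string.  */
static uint8_t *
ascii_strchr_libc (const uint8_t *s, uint8_t c0)
{
  return (uint8_t *) strchr ((const char *) s, c0);
}
```

Both helpers assume c0 < 0x80 and a NUL-terminated string, as in the ASCII
branch of u8_strchr.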

> gl_memmem       long        2               1           3.88
> pb_memmem       long        2               1           4.67
> u8_strchr       long        2               1           3.47
> gl_memmem       60          2               10000       1.97
> pb_memmem       60          2               10000       1.97
> u8_strchr       60          2               10000       1.96
> 
> gl_memmem       long        3               1           5.86
> pb_memmem       long        3               1           4.02
> u8_strchr       long        3               1           4.28
> gl_memmem       60          3               10000       1.97
> pb_memmem       60          3               10000       1.97
> u8_strchr       60          3               10000       1.98

I'm not surprised to see that a search loop that inlines and hardcodes
the iteration of the needle (of known length: 2, 3, or 4) is faster than
the more general strstr or memmem.
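
To illustrate what "inlines and hardcodes the iteration of the needle" means,
here is a rough sketch for a needle of length 2. It is written for this
message under the assumption of a NUL-terminated haystack; the name
find_2byte is hypothetical and this is not the code in u8-strchr.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: find the first occurrence of the two-byte sequence
   c0 c1 (both nonzero) in the NUL-terminated string s.  The needle bytes
   are compared directly, with no inner loop over the needle.  */
static uint8_t *
find_2byte (const uint8_t *s, uint8_t c0, uint8_t c1)
{
  if (*s == 0)
    return NULL;
  for (;; s++)
    {
      /* Fewer than two bytes remain: no match is possible.  */
      if (s[1] == 0)
        return NULL;
      if (s[0] == c0 && s[1] == c1)
        return (uint8_t *) s;
    }
}
```

A general strstr or memmem has to handle an arbitrary needle length at run
time, whereas here the compiler sees a fixed pair of byte comparisons per
position.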


2010-07-11  Pádraig Brady  <address@hidden>
            Bruno Haible  <address@hidden>

        unistr/u8-strchr: Optimize ASCII argument case.
        * lib/unistr/u8-strchr.c (u8_strchr): For ASCII arguments, use strchr.

--- lib/unistr/u8-strchr.c.orig Sun Jul 11 15:25:30 2010
+++ lib/unistr/u8-strchr.c      Sun Jul 11 15:21:21 2010
@@ -21,6 +21,8 @@
 /* Specification.  */
 #include "unistr.h"
 
+#include <string.h>
+
 uint8_t *
 u8_strchr (const uint8_t *s, ucs4_t uc)
 {
@@ -30,18 +32,31 @@
     {
       uint8_t c0 = uc;
 
-      for (;; s++)
+      if (false)
+        {
+          /* Unoptimized code.  */
+          for (;; s++)
+            {
+              if (*s == c0)
+                break;
+              if (*s == 0)
+                goto notfound;
+            }
+          return (uint8_t *) s;
+        }
+      else
         {
-          if (*s == c0)
-            break;
-          if (*s == 0)
-            goto notfound;
+          /* Optimized code.
+             strchr() is often so well optimized, that it's worth the
+             added function call.  */
+          return (uint8_t *) strchr ((const char *) s, c0);
         }
-      return (uint8_t *) s;
     }
   else
     switch (u8_uctomb_aux (c, uc, 6))
       {
+      /* Loops equivalent to strstr, optimized for a specific length (2, 3, 4)
+         of the needle.  */
       case 2:
         if (*s == 0)
           goto notfound;


