autoconf-patches
[PATCH] tests: XFAIL in the face of a MacOS X bug


From: Eric Blake
Subject: [PATCH] tests: XFAIL in the face of a MacOS X bug
Date: Tue, 21 Sep 2010 14:39:14 -0600

* doc/autoconf.texi (Limitations of Usual Tools) <sed>: Mention
the issue.
* tests/torture.at (Substitute and define special characters):
Detect if sed cannot process 8-bit bytes in the C locale.
* THANKS: Update.
Reported by Rochan.

Signed-off-by: Eric Blake <address@hidden>
---

Prior to the release of 2.68, I'm going with this conservative patch,
which marks this aspect of AC_SUBST as an expected failure (XFAIL) on
MacOS X.  I'm not yet ready to declare that AC_SUBST cannot portably be
used with 8-bit bytes.  For now, I'm hoping that at some point
post-release we can find a way to coerce some standard tool on that
platform to slice-and-dice arbitrary data in the manner necessary to
avoid this awkward regex behavior in the C locale, possibly by probing
alternate locale names known to be unibyte until we find a working
locale.  But waiting for such a fix would needlessly delay the already
long-overdue release of 2.68: most configure scripts don't (ab)use
AC_SUBST with 8-bit values, so most users won't notice the limitation.
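
For illustration only, the locale-probing idea might look roughly like
the sketch below.  It is not part of this patch.  The helper name
find_unibyte_locale and the candidate locale names are assumptions, and
a locale name the system does not know is usually treated like the C
locale, so the helper only reports observed sed behavior, not whether
the locale is installed.

```shell
# Hypothetical helper: print the first locale name under which sed's `.'
# matches the byte 0x80; return non-zero if none of the candidates works.
find_unibyte_locale () {
  for loc in "$@"; do
    count=`printf '\200\n' | LC_ALL=$loc sed -n '/./p' 2>/dev/null | wc -l`
    # Some wc implementations pad the count with blanks, so compare
    # numerically rather than as a string.
    if test "$count" -eq 1; then
      echo "$loc"
      return 0
    fi
  done
  return 1
}

# Candidate names are assumptions; availability varies by platform.
find_unibyte_locale C POSIX en_US.ISO8859-1 en_US.ISO-8859-1 ||
  echo "no working locale found" >&2
```

A caller would then have to run the affected sed invocations with LC_ALL
set to whatever name the helper prints.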

 ChangeLog         |   10 ++++++++++
 THANKS            |    1 +
 doc/autoconf.texi |   22 ++++++++++++++++++++++
 tests/torture.at  |    3 +++
 4 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 1b47a2c..311f41f 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,13 @@
+2010-09-21  Eric Blake  <address@hidden>
+
+       tests: XFAIL in the face of a MacOS X bug
+       * doc/autoconf.texi (Limitations of Usual Tools) <sed>: Mention
+       the issue.
+       * tests/torture.at (Substitute and define special characters):
+       Detect if sed cannot process 8-bit bytes in the C locale.
+       * THANKS: Update.
+       Reported by Rochan.
+
 2010-09-20  Eric Blake  <address@hidden>

        autom4te: don't filter out portions of location traces
diff --git a/THANKS b/THANKS
index cb1589b..4acb36f 100644
--- a/THANKS
+++ b/THANKS
diff --git a/doc/autoconf.texi b/doc/autoconf.texi
index 6424302..66d8a21 100644
--- a/doc/autoconf.texi
+++ b/doc/autoconf.texi
@@ -18700,6 +18700,28 @@ Limitations of Usual Tools
 not all @command{sed} implementations can handle embedded @code{NUL} or
 a missing trailing newline.

+Remember that ranges within a bracket expression of a regular expression
+are only well-defined in the @samp{C} (or @samp{POSIX}) locale.
+Meanwhile, support for character classes like @samp{[[:upper:]]} is not
+yet universal, so if you cannot guarantee the setting of @env{LC_ALL},
+it is better to spell out a range @samp{[ABCDEFGHIJKLMNOPQRSTUVWXYZ]}
+than to rely on @samp{[A-Z]}.
+
+Additionally, Posix states that regular expressions are only
+well-defined on characters.  Unfortunately, there exist platforms such
+as MacOS X 10.5 where not all 8-bit byte values are valid characters,
+even though that platform has a single-byte @samp{C} locale.  And Posix
+allows the existence of a multi-byte @samp{C} locale, although that does
+not yet appear to be a common implementation.  At any rate, it means
+that not all bytes will be matched by the regular expression @samp{.}:
+
+@example
+$ @kbd{printf '\200\n' | LC_ALL=C sed -n /./p | wc -l}
+0
+$ @kbd{printf '\200\n' | LC_ALL=en_US.ISO8859-1 sed -n /./p | wc -l}
+1
+@end example
+
 Portable @command{sed} regular expressions should use @samp{\} only to escape
 characters in the string @samp{$()address@hidden@}}.  For example,
 alternation, @samp{\|}, is common but Posix does not require its
diff --git a/tests/torture.at b/tests/torture.at
index 673c7a5..511834d 100644
--- a/tests/torture.at
+++ b/tests/torture.at
@@ -882,6 +882,9 @@ AT_CLEANUP
 AT_SETUP([Substitute and define special characters])
 AT_KEYWORDS([AC@&address@hidden AC@&address@hidden)

+AT_XFAIL_IF([byte=\\200s; dnl
+test `{ printf $byte; echo; } | sed -n '/^./p' | wc -l` = 0])
+
 AT_DATA([Foo.in], address@hidden@
 @bar@@notsubsted@@baz@ stray @ and more@@@baz@
 address@hidden@address@hidden
-- 
1.7.2.3
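
As a small illustration of the bracket-expression advice added to
doc/autoconf.texi above (a sketch, not part of the patch; the variable
names upper, spelled and ranged are made up for the example), the
spelled-out class and the range give the same result once the C locale
is pinned, but only the spelled-out form is independent of LC_ALL:

```shell
# Spelled-out class: lists each letter, so locale collation order
# does not affect which characters it matches.
upper=ABCDEFGHIJKLMNOPQRSTUVWXYZ
spelled=`echo 'Foo Bar' | sed "s/[$upper]/_/g"`

# Range: only well-defined because LC_ALL pins the C locale here.
ranged=`echo 'Foo Bar' | LC_ALL=C sed 's/[A-Z]/_/g'`

echo "$spelled"
echo "$ranged"
```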



