[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: non portable sed scripts

From: Ralf Wildenhues
Subject: Re: non portable sed scripts
Date: Mon, 22 May 2006 06:01:47 +0200
User-agent: Mutt/1.5.11+cvs20060403

Hi Paul, Tim,

* Paul Eggert wrote on Sun, May 21, 2006 at 09:46:32AM CEST:
> Ralf Wildenhues <address@hidden> writes:
> > So then the total limit of the script size I found on Solaris (described
> > in that other mail in this thread that was pending for some hours)
> > really is a new issue.
> If it's just Solaris, we should be able to work around it by using
> AC_PROG_SED, as it should check for that bug (it currently doesn't,
> but it should).

Well, my guess is still that (at least one sed on) UnixWare has similar
issues, but I have no idea whether this UnixWare has several sed
implementations, and one of them is usable, or whether it is common to
have GNU sed as add-on installed and in $PATH.

> > All my testing of seds a couple of months ago showed that labels do not
> > count as commands.
> Sorry, I didn't know that.  I guess we can undo my patch then,
> but add a comment.
> Here's a first cut to do this.  I haven't tested it and I assume
> it's not the full job, but it should give you an idea.

Changes w.r.t. to your initial patch (thanks BTW!):

- we need to use a script file to avoid overrunning the command line
  length limit on w32 systems (and also to avoid making the test
  realllly crawl there, even more than it does already),
- we may not use more than 99 commands in the script we test with:
  otherwise, on HP-UX, we won't find any suitable sed.
- remove the script file afterwards, and (try to) unset the long
- document that we may still end up with a limited sed, if we don't find
  any better, but that we error out if we don't find any that works,
- use ${SED-sed} in the config.status code.  Require AC_PROG_SED from
  AC_OUTPUT if needed.  (I don't like to expand it unconditionally, that
  would cause a significant testsuite runtime overhead.)

There are are some caveats to be aware of here: AC_PROG_SED originates
from Libtool.  And there is a history to the Libtool macro.  Libtool
actually plays well with the name space (LT_AC_PROG_SED in 1.5.22, or a
test of m4_defined AC_PROG_SED in CVS Libtool), but if someone ends up
to be smart enough with that to
  if $some_condition; then
  # ...

then you may end up with unset $SED.  This is one reason for using
${SED-sed}, and _not_ initializing $SED in config.status if we have
not called that macro, and not setting the variable in config.status
from within AC_PROG_SED.  Libtool-1.5.10 had a bug in its sed check,
but since we don't use its cache variable $lt_cv_path_SED, we don't
fall prey of that.  OTOH, we better make Libtool-2.0's test match with
what we come up here.

The patch below does not attempt to save us from the segfault of Solaris
/usr/xpg4/bin/sed.  I could try to mimic the script from the torture
test more so that it gets exposed, but I'm not sure it's useful.

Do you think this patch should be applied?  I'm not sure it's worth it.

I think we'd need a full more run of tests on various systems to ensure
this doesn't break down anywhere: it would leave the user with an
unusable package, even if the package in question does not come anywhere
close to the tested limits.  The fact that AC_PROG_SED errors out at
times makes this patch quite scary to me, in fear of trapping the GNU
sed package once it uses Autoconf-2.60.

So IMHO I wouldn't mind if only the bugfix and documentation parts of
the patch below were committed.

Tim, could you do us a favor and test this patch?  It should apply to
Autoconf-2.59c or CVS Autoconf, and the interesting test output would
be in tests/testsuite.log after

  env TESTSUITEFLAGS='-k config.status -k AC_PROG_SED' make -e check


2006-05-21  Paul Eggert  <address@hidden>
        and Ralf Wildenhues  <address@hidden>

        * lib/autoconf/programs.m4 (AC_PROG_SED): Catch script length
        limits in Solaris 8 /usr/ucb/sed by testing a long script.
        (_AC_FEATURE_CHECK_LENGTH): Do not remove `conftest.*', but only
        the files that this macro actually generates, to keep the file
        `conftest.sed' created by AC_PROG_SED.
        * lib/autoconf/status.m4 (_AC_OUTPUT_FILES_PREPARE): Use
        `${SED-sed}' for substitutions at config.status time.
        (_AC_OUTPUT_FILE, _AC_OUTPUT_HEADER): Likewise.
        (_AC_OUTPUT_REQUIRE_SED): New macro.
        (AC_OUTPUT): Call it.
        (_AC_OUTPUT_FILE): Fix typo in comment.
        (_AC_OUTPUT_CONFIG_STATUS): Initialize `$SED' in config.status
        if AC_PROG_SED has been called.
        * doc/autoconf.texi (Particular Programs): Update description of
        (Limitations of Usual Tools) <sed>: Mention script length
        limitations with some of the sed implementations on Solaris.
        * NEWS: Update.

2006-05-21  Paul Eggert  <address@hidden>

        Undo this change, and add comment about not counting labels:

        * lib/autoconf/status.m4 (_AC_OUTPUT_HEADER): Fix off-by-one bug
        that caused config.status to generate 100-command sed scripts;
        the portable limit is 99.

Index: NEWS
RCS file: /cvsroot/autoconf/autoconf/NEWS,v
retrieving revision 1.371
diff -u -r1.371 NEWS
--- NEWS        21 May 2006 00:19:42 -0000      1.371
+++ NEWS        21 May 2006 12:04:02 -0000
@@ -1,5 +1,9 @@
 * Major changes in Autoconf 2.59d
+** The config.status code may now cause AC_PROG_SED to be invoked to search
+  for a Sed with less limitations.  If you need AC_PROG_SED, you should not
+  rely on this, though, and continue to use it.
 ** ac_config_guess, ac_config_sub, ac_configure
   These never-documented variables have been marked with a comment
   saying that we intend to remove them in a future release.
Index: doc/autoconf.texi
RCS file: /cvsroot/autoconf/autoconf/doc/autoconf.texi,v
retrieving revision 1.1018
diff -u -r1.1018 autoconf.texi
--- doc/autoconf.texi   20 May 2006 05:39:03 -0000      1.1018
+++ doc/autoconf.texi   22 May 2006 03:45:47 -0000
@@ -3632,9 +3632,10 @@
 @defmac AC_PROG_SED
 @ovindex SED
-Set output variable @code{SED} to a Sed implementation on @env{PATH} that
-truncates as few characters as possible.  If @sc{gnu} Sed is found,
-use that instead.
+Set output variable @code{SED} to a Sed implementation on
address@hidden that conforms to Posix without arbitrary length limits,
+if possible.  If @sc{gnu} Sed is found, use that instead.  If no
+acceptable Sed is found, an error is reported.
 @end defmac
 @defmac AC_PROG_YACC
@@ -13207,9 +13208,14 @@
 Unicos 9 @command{sed} loops endlessly on patterns like @samp{.*\n.*}.
 Sed scripts should not use branch labels longer than 8 characters and
-should not contain comments.  HP-UX sed has a limit of 99 commands and
+should not contain comments.  HP-UX sed has a limit of 99 commands
+(not counting @samp{:} commands) and
 48 labels, which can not be circumvented by using more than one script
 file.  It can execute up to 19 reads with the @samp{r} command per cycle.
+Solaris @command{/usr/ucb/sed} has a limit of about 6000 bytes for the
+internal representation of the sed script, which can not be circumvented
+using more than one script file.  It fails with longer scripts, and
+and Solaris @command{/usr/xpg4/bin/sed} dumps core with long scripts.
 Avoid redundant @samp{;}, as some @command{sed} implementations, such as
 address@hidden 1.4.2's, incorrectly try to interpret the second
Index: lib/autoconf/programs.m4
RCS file: /cvsroot/autoconf/autoconf/lib/autoconf/programs.m4,v
retrieving revision 1.54
diff -u -r1.54 programs.m4
--- lib/autoconf/programs.m4    19 May 2006 08:11:27 -0000      1.54
+++ lib/autoconf/programs.m4    22 May 2006 03:35:48 -0000
@@ -812,11 +812,20 @@
 # as few characters as possible.  Prefer GNU sed if found.
 [AC_CACHE_CHECK([for a sed that does not truncate output], ac_cv_path_SED,
-    [_AC_PATH_PROG_FEATURE_CHECK(SED, [sed gsed],
+    [dnl ac_script should not contain more than 99 commands (for HP-UX sed),
+     dnl but more than about 7000 bytes, to catch a limit in Solaris 8 
+     for ac_i in 1 2 3 4 5 6 7; do
+       ac_script="$ac_script$as_nl$ac_script"
+     done
+     echo "$ac_script" | sed 99q >conftest.sed
+     $as_unset ac_script || ac_script= 
+     _AC_PATH_PROG_FEATURE_CHECK(SED, [sed gsed],
        [_AC_FEATURE_CHECK_LENGTH([ac_path_SED], [ac_cv_path_SED],
-               ["$ac_path_SED" -e 's/a$//'])])])
+               ["$ac_path_SED" -f conftest.sed])])])
+ AC_SUBST([SED])dnl
+ rm -f conftest.sed
Index: lib/autoconf/status.m4
RCS file: /cvsroot/autoconf/autoconf/lib/autoconf/status.m4,v
retrieving revision 1.103
diff -u -r1.103 status.m4
--- lib/autoconf/status.m4      19 May 2006 21:02:10 -0000      1.103
+++ lib/autoconf/status.m4      21 May 2006 11:46:41 -0000
@@ -303,7 +303,7 @@
 # -----------------
 # Evaluate to an m4 number equal to the maximum number of commands to put
-# in any single sed program.
+# in any single sed program, not counting ":" commands.
 # Some seds have small command number limits, like on Digital OSF/1 and HP-UX.
@@ -353,7 +353,7 @@
 m4_define([_AC_SED_FRAG_NUM], m4_incr(_AC_SED_FRAG_NUM))dnl
 dnl Record that this fragment will need to be used.
-  m4_defn([_AC_SED_CMDS])[| sed -f "$tmp/subs-]_AC_SED_FRAG_NUM[.sed" ])dnl
+  m4_defn([_AC_SED_CMDS])[| ${SED-sed} -f "$tmp/subs-]_AC_SED_FRAG_NUM[.sed" 
 cat >"\$tmp/subs-]_AC_SED_FRAG_NUM[.sed" <<\CEOF
@@ -384,7 +384,7 @@
 [m4_if(_AC_SED_DELIM_NUM, 0,
 [m4_if(_AC_Var, address@hidden@],
 [dnl The whole of the last fragment would be the final deletion of `|#_!!_#|'.
-m4_define([_AC_SED_CMDS], m4_defn([_AC_SED_CMDS])[| sed 's/|#_!!_#|//g' ])],
+m4_define([_AC_SED_CMDS], m4_defn([_AC_SED_CMDS])[| ${SED-sed} 's/|#_!!_#|//g' 
 ac_delim='%!_!# '
 for ac_last_try in false false false false false :; do
@@ -420,12 +420,12 @@
 m4_define([_AC_SED_FRAG_NUM], m4_incr(_AC_SED_FRAG_NUM))dnl
 dnl Record that this fragment will need to be used.
-m4_defn([_AC_SED_CMDS])[| sed -f "$tmp/subs-]_AC_SED_FRAG_NUM[.sed" ])dnl
+m4_defn([_AC_SED_CMDS])[| ${SED-sed} -f "$tmp/subs-]_AC_SED_FRAG_NUM[.sed" 
 cat >"\$tmp/subs-]_AC_SED_FRAG_NUM[.sed" <<\CEOF$ac_eof
-sed '
+${SED-sed} '
 s/[,\\&]/\\&/g; s/@/@|#_!!_#|/g
 s/^/s,@/; s/!/@,|#_!!_#|/
@@ -438,7 +438,7 @@
 ]m4_if(_AC_Var, address@hidden@],
 [m4_if(m4_eval(_AC_SED_CMD_NUM + 2 > _AC_SED_CMD_LIMIT), 1,
-[m4_define([_AC_SED_CMDS], m4_defn([_AC_SED_CMDS])[| sed 's/|#_!!_#|//g' ])],
+[m4_define([_AC_SED_CMDS], m4_defn([_AC_SED_CMDS])[| ${SED-sed} 
's/|#_!!_#|//g' ])],
@@ -480,7 +480,7 @@
 # Do the variable substitutions to create the Makefiles or whatever.
 # This macro is expanded inside a here document.  If the here document is
-# closed, it has to be reopen with "cat >>$CONFIG_STATUS <<\_ACEOF".
+# closed, it has to be reopened with "cat >>$CONFIG_STATUS <<\_ACEOF".
@@ -509,7 +509,7 @@
           [datadir, docdir, infodir, localedir, mandir])
-case `sed -n '/datarootdir/ {
+case `${SED-sed} -n '/datarootdir/ {
@@ -533,7 +533,7 @@
 # Shell code in might set extrasub.
 # FIXME: do we really want to maintain this feature?
-  sed "$ac_vpsub
+  \${SED-sed} "$ac_vpsub
@@ -660,7 +660,7 @@
 [s,^[   #]*u.*,/* & */,]' >>conftest.defines
 # Break up conftest.defines:
-ac_max_sed_lines=m4_eval(_AC_SED_CMD_LIMIT - 4)
+ac_max_sed_lines=m4_eval(_AC_SED_CMD_LIMIT - 3)
 # First sed command is:         sed -f defines.sed $ac_file_inputs >"$tmp/out1"
 # Second one is:        sed -f defines.sed "$tmp/out1" >"$tmp/out2"
@@ -682,7 +682,7 @@
 :def'] >>$CONFIG_STATUS
   sed ${ac_max_sed_lines}q conftest.defines >>$CONFIG_STATUS
   echo 'CEOF
-    sed -f "$tmp/defines.sed"' "$ac_in >$ac_out" >>$CONFIG_STATUS
+    ${SED-sed} -f "$tmp/defines.sed"' "$ac_in >$ac_out" >>$CONFIG_STATUS
   ac_in=$ac_out; ac_out=$ac_nxt; ac_nxt=$ac_in
   sed 1,${ac_max_sed_lines}d conftest.defines >conftest.tail
   grep . conftest.tail >/dev/null || break
@@ -1035,8 +1035,10 @@
 # The big finish.
 # Produce config.status, config.h, and links; and configure subdirs.
+# Be sure not to defun this macro, so that the order is kept.
-[dnl Dispatch the extra arguments to their native macros.
+dnl Dispatch the extra arguments to their native macros.
@@ -1089,6 +1091,19 @@
+# ----------------------
+# Helper macro so we can require AC_PROG_SED but do not need to
+# m4_defun/AC_DEFUN AC_OUTPUT, so that the order of any other
+# requirement within is kept and not propagated to the beginning
+# of AC_OUTPUT.
+  [m4_require([AC_PROG_SED])],
+  [m4_ifdef([_AC_SEEN_CONFIG(HEADERS)],
+    [m4_require([AC_PROG_SED])])])dnl
 # ------------------------
@@ -1219,6 +1234,9 @@

reply via email to

[Prev in Thread] Current Thread [Next in Thread]