autoconf-patches

Re: double-quoted command substitutions


From: Keith MARSHALL
Subject: Re: double-quoted command substitutions
Date: Wed, 22 Feb 2006 14:28:47 +0000

Hi Ralf,

> From: Ralf Wildenhues <address@hidden>
> Date: 2006/02/19 Sun PM 06:55:53 GMT
> To: Keith Marshall <address@hidden>
> CC: Julien Lecomte <address@hidden>, address@hidden
> Subject: Re: double-quoted command substitutions
> 
> [ moving to autoconf-patches ]

This nearly slipped under my radar.  I don't subscribe to the
autoconf-patches list, and during the working week I check my home
mailbox (to which my SourceForge address is redirected) less often
than my workplace mailbox.

> * Keith Marshall wrote on Sat, Feb 18, 2006 at 06:04:05PM CET:
> > On Saturday 18 February 2006 3:38 pm, Ralf Wildenhues wrote:
> > > I have tried the MSYS shell now.  I can reproduce the bug 
> > > mentioned in the URLs there, but not on Cygwin, neither with
> > > bash on any other system I could find, nor with /bin/sh on
> > > FreeBSD 5.4, OpenBSD 3.8, AIX 4.3.3, HP-UX 10.20, IRIX 6.5,
> > > Solaris 2.6, Tru64 4.0D.  I don't have access to older systems.
> > 
> > Judging from this, it would appear to be specific to the MSYS
> > bash implementation.  I'll raise a bug tracker for it, and ask
> > Earnie to look into it, when he has time.
> 
> Thanks.
> 
> OK to apply this patch to document the issue in Autoconf?  Should
> we maybe just apply the documentation part?

It may be better to hold off on all of it until we have a better
handle on the problem.  See the comments I have interposed below,
among the relevant fragments of your patch.

> Can we rely on the shell builtins to use one-byte newlines at the
> end of their output?  (I don't know how MSYS has fared in this
> regard.

With msys-1.0.10 on Win2K SP4:

  $ echo foo | od -cb
  0000000   f   o   o  \n
          146 157 157 012
  0000004

> I merely know that with Cygwin this is configurable, so it's
> probably better to play safe.)

I don't think any of the MSYS shell built-ins produce CRLF output,
with the possible exception of `printf', when you explicitly include
a `\r' in the format.  Otherwise, you would see CRLF only if you pipe
the output through a `unix2dos' type filter.
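
For anyone who wants to check a particular command rather than take
my word for it, a minimal sketch along the following lines should do.
The helper name `ends_with_crlf' is made up for illustration, and the
sketch assumes that `tail -c' and `od -An -b' are available:

```shell
# Hypothetical helper: succeed if the output of the given command
# ends in CR LF (octal 015 012), fail otherwise.
ends_with_crlf() {
  # Keep only the last two bytes, dump them in octal, strip whitespace.
  last=`"$@" | tail -c 2 | od -An -b | tr -d ' \t\n'`
  test "$last" = 015012
}

# Compare an LF-terminated and a CRLF-terminated printf format.
for fmt in 'foo\n' 'foo\r\n'; do
  if ends_with_crlf printf "$fmt"; then
    printf '%s\n' "printf '$fmt' ends in CRLF"
  else
    printf '%s\n' "printf '$fmt' does not end in CRLF"
  fi
done
```

The check reads the command's output through a pipe, not through a
command substitution, so it should not itself be affected by the
substitution problem discussed below.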

>   * doc/autoconf.texi (Shell Substitutions): Mention the MSYS
>   shell issue with double-quoted command substitutions of native
>   commands.
>
>   * lib/autoconf/c.m4 (AC_PROG_CC_C_O): Work around the bug as
>   a safety measure, we don't want to rely on a specific EOL
>   representation.
>   * lib/autoconf/programs.m4 (AC_PROG_MAKE_SET): Likewise.
>   * lib/autoconf/status.m4 (_AC_OUTPUT_SUBDIRS): Likewise.
>   * lib/autotest/general.m4 (AT_INIT): Likewise.
> 
> Index: doc/autoconf.texi
> ===================================================================
> RCS file: /cvsroot/autoconf/autoconf/doc/autoconf.texi,v
> retrieving revision 1.949
> diff -u -r1.949 autoconf.texi
> --- doc/autoconf.texi         15 Feb 2006 07:00:29 -0000      1.949
> +++ doc/autoconf.texi         19 Feb 2006 18:42:59 -0000
> @@ -10535,6 +10584,17 @@
>  @noindent
>  The result of @samp{foo=`exit 1`} is left as an exercise to the reader.
> 
> +The MSYS shell leaves a stray byte in the expansion of a double-quoted
> +command substitution of a native program, if the end of the substitution
> +is not aligned with the end of the double quote.  ...

This doesn't accurately describe the problem; it's based on an
observation of a *symptom*, but we need to better understand the
underlying *cause* of the problem, before trying to describe it.

> ...  This may be worked
> +around by inserting another pair of quotes:

And, as I come to better understand the *probable* cause, I have
some concern that the suggested workaround may not be sufficient
(although all my experiments so far suggest that it will be).

> +@example
> +$ @kbd{echo "`printf 'foo\r\n'` bar" > broken}
> +$ @kbd{echo "`printf 'foo\r\n'`"" bar" | cmp - broken}
> +- broken differ: char 4, line 1
> +@end example

However, with this example, you may just have provided the clue to the 
underlying cause of the problem. :)  Please consider:

  $ echo "`printf 'foo\n'` bar" | od -cb
  0000000   f   o   o       b   a   r  \n
          146 157 157 040 142 141 162 012
  0000010

  $ echo "`printf 'foo\r\nbar\r\n'`" | od -cb
  0000000   f   o   o  \r  \n   b   a   r  \n
          146 157 157 015 012 142 141 162 012
  0000011

  $ echo "`printf 'foo\r\nbar\r\n'` baz" | od -cb
  0000000   f   o   o  \r  \n   b   a   r 001       b   a   z  \n
          146 157 157 015 012 142 141 162 001 040 142 141 172 012
  0000016

  $ echo "`printf 'foo\r\nbar\n'`"" baz" | od -cb
  0000000   f   o   o  \r  \n   b   a   r       b   a   z  \n
          146 157 157 015 012 142 141 162 040 142 141 172 012
  0000015

  $ echo "`printf 'foo\r\nbar\n'` baz" | od -cb
  0000000   f   o   o  \r  \n   b   a   r       b   a   z  \n
          146 157 157 015 012 142 141 162 040 142 141 172 012
  0000015

Observe that:

1) The extraneous byte seems always to be SOH.

2) The extraneous byte appears *only* when the output from
   the substituted command is CRLF delimited text, *and* the
   double-quoted expression extends *beyond* the end of the
   command substitution.

3) It is only the terminal CRLF, on the final line of output
   from the substituted command, that participates in the
   generation of the extraneous byte; any other CRLF line
   terminators, appearing earlier in the output, remain
   unchanged in the substitution.

4) If the end of the command substitution is coincident with
   the end of the double-quoted expression, then *both* the
   terminal LF *and* its associated CR are dropped from the
   substituted command output;  OTOH, if the double-quoted
   expression extends beyond the end of the substitution,
   then only the terminal LF is dropped, and its associated
   CR is *replaced* by SOH.

Now, if I repeat these examples on a GNU/Linux box, I see exactly
the same results, *except* that CRs are preserved unchanged, in *all*
cases, e.g.

  $ echo "`printf 'foo\r\nbar\r\n'` baz" | od -cb
  0000000   f   o   o  \r  \n   b   a   r  \r       b   a   z  \n
          146 157 157 015 012 142 141 162 015 040 142 141 172 012
  0000016

From this, I would conclude that the problem lies specifically in the
mishandling of line terminators, when the output from the substituted
command is CRLF delimited text.  Even more specifically, it lies in
the inconsistent handling of the terminal CRLF, depending on whether
the end of the command substitution is coincident with the end of the
enclosing double-quoted expression, or that expression extends beyond
the end of the command substitution, i.e.

a) in the case of coincident ends of substitution and expression,
   the terminal CRLF is discarded, in its entirety.

b) in the case of the double-quoted expression extending beyond
   the end of the command substitution, only the terminal LF is
   discarded, and its associated CR is erroneously replaced by
   a single SOH.
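
Case (b) can be turned into a small probe.  This is only a sketch,
based on the byte patterns observed above; the function name
`probe_crlf_substitution' and the result words it prints are
invented for illustration, and I have not tried it beyond the shells
already mentioned:

```shell
# Hypothetical probe: classify how the running shell expands a
# CRLF-terminated command substitution that is followed by more
# text inside the same double-quoted expression.
probe_crlf_substitution() {
  probe="`printf 'foo\r\n'` bar"
  # Octal dump of the expansion, with all whitespace removed.
  dump=`printf '%s' "$probe" | od -An -b | tr -d ' \t\n'`
  case $dump in
    # f o o CR SP b a r: only the terminal LF was stripped.
    146157157015040142141162) echo cr-preserved ;;
    # f o o SOH SP b a r: the pattern reported above for MSYS.
    146157157001040142141162) echo cr-replaced-by-soh ;;
    # Anything else: show the raw dump for inspection.
    *) echo "other:$dump" ;;
  esac
}

probe_crlf_substitution
```

On a shell that behaves like the GNU/Linux example above, I would
expect this to report `cr-preserved'.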

> Index: lib/autoconf/c.m4
> ===================================================================
> RCS file: /cvsroot/autoconf/autoconf/lib/autoconf/c.m4,v
> retrieving revision 1.210
> diff -u -r1.210 c.m4
> --- lib/autoconf/c.m4         24 Jan 2006 00:20:15 -0000      1.210
> +++ lib/autoconf/c.m4         19 Feb 2006 11:00:44 -0000
> @@ -607,7 +607,7 @@
>  fi
>  rm -f conftest*
>  ])dnl
> -if eval "test \"`echo '$ac_cv_prog_cc_'${ac_cc}_c_o`\" = yes"; then
> +if eval "test \"`echo '$ac_cv_prog_cc_'${ac_cc}_c_o`""\" = yes"; then

In MSYS' shell, the built-in `echo' command doesn't produce CRLF delimited 
output, so I don't believe this will be susceptible to the bug.  But in 
any case, I think this would be better written as:

  if eval test \"`echo '$ac_cv_prog_cc_'${ac_cc}_c_o`\" = yes; then

which is clearer, avoids the issue entirely, and conforms better to
this recommendation, which you cited previously:
http://www.in-ulm.de/~mascheck/bourne/par_here_com.html
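
To illustrate that the two forms come to the same thing, here is a
small sketch.  The values `gcc' and `yes' are invented stand-ins for
whatever configure would have set, and the two function names are
hypothetical:

```shell
# Hypothetical values, standing in for what configure would set.
ac_cc=gcc
ac_cv_prog_cc_gcc_c_o=yes

# Existing form: the whole test command inside double quotes.
check_quoted() {
  eval "test \"`echo '$ac_cv_prog_cc_'${ac_cc}_c_o`\" = yes"
}

# Suggested form: no outer double quotes.
check_unquoted() {
  eval test \"`echo '$ac_cv_prog_cc_'${ac_cc}_c_o`\" = yes
}

# In both cases eval ends up running:
#   test "$ac_cv_prog_cc_gcc_c_o" = yes
check_quoted && echo "quoted form: yes"
check_unquoted && echo "unquoted form: yes"
```

Note that the unquoted form leaves the output of `echo' subject to
field splitting, so it relies on the generated variable name
containing no white space, which holds for a shell variable name.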

> Index: lib/autoconf/programs.m4
> ===================================================================
> RCS file: /cvsroot/autoconf/autoconf/lib/autoconf/programs.m4,v
> retrieving revision 1.48
> diff -u -r1.48 programs.m4
> --- lib/autoconf/programs.m4  31 Dec 2005 16:44:22 -0000      1.48
> +++ lib/autoconf/programs.m4  19 Feb 2006 11:00:44 -0000
> @@ -712,7 +712,7 @@
>    eval ac_cv_prog_make_${ac_make}_set=no
>  fi
>  rm -f conftest.make])dnl
> -if eval "test \"`echo '$ac_cv_prog_make_'${ac_make}_set`\" = yes"; then
> +if eval "test \"`echo '$ac_cv_prog_make_'${ac_make}_set`""\" = yes"; then

Again, no CRLF output, so no real need for this change, but again,
the outer double quotes are redundant, and could be safely removed,
resulting in a clearer expression:

  if eval test \"`echo '$ac_cv_prog_make_'${ac_make}_set`\" = yes; then

> Index: lib/autoconf/status.m4
> ===================================================================
> RCS file: /cvsroot/autoconf/autoconf/lib/autoconf/status.m4,v
> retrieving revision 1.84
> diff -u -r1.84 status.m4
> --- lib/autoconf/status.m4            6 Jan 2006 00:10:37 -0000 1.84
> +++ lib/autoconf/status.m4            19 Feb 2006 11:00:44 -0000
> @@ -928,7 +928,7 @@
>      # parts of a large source tree are present.
>      test -d $srcdir/$ac_dir || continue
> 
> -    ac_msg="=== configuring in $ac_dir (`pwd`/$ac_dir)"
> +    ac_msg="=== configuring in $ac_dir ("`pwd`"/$ac_dir)"

Again, the built-in `pwd' doesn't produce CRLF delimited output, but
here, I personally prefer the syntax introduced by this patch.
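
Moving `pwd` outside the double quotes is safe here because the
right-hand side of an assignment is not subject to field splitting.
A quick sketch, with `lib' as an invented directory name:

```shell
ac_dir=lib   # hypothetical subdirectory name

# Original form: command substitution inside the double quotes.
old="=== configuring in $ac_dir (`pwd`/$ac_dir)"

# Patched form: command substitution between two quoted pieces.
new="=== configuring in $ac_dir ("`pwd`"/$ac_dir)"

if test "$old" = "$new"; then
  echo "both forms agree"
else
  echo "forms differ"
fi
```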

> Index: lib/autotest/general.m4
> ===================================================================
> RCS file: /cvsroot/autoconf/autoconf/lib/autotest/general.m4,v
> retrieving revision 1.196
> diff -u -r1.196 general.m4
> --- lib/autotest/general.m4           11 Jan 2006 08:33:16 -0000 1.196
> +++ lib/autotest/general.m4           19 Feb 2006 11:00:44 -0000
> @@ -351,7 +351,7 @@
>                done
>                at_groups_selected=`echo "$at_groups_selected" | sed 's/;.*//'`
>                # Smash the newlines.
> -              at_groups="$at_groups`echo $at_groups_selected` "
> +              at_groups=$at_groups`echo $at_groups_selected`" "

And again, no CRLF issue, but I prefer the modified form.
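
For the record, the modified form still smashes the newlines as
intended.  A sketch with invented group numbers:

```shell
# Hypothetical starting values; the selected groups arrive one per line.
at_groups='1 '
at_groups_selected='2
3
4'

# The unquoted $at_groups_selected is split into words, and echo
# joins them with single spaces.  The assignment itself performs
# no field splitting, so the result is kept as one string.
at_groups=$at_groups`echo $at_groups_selected`" "

printf '[%s]\n' "$at_groups"
# prints: [1 2 3 4 ]
```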

Regards,
Keith.



