From: Eric Blake
Subject: Re: sed issue? [was: Subject: [GNU Autoconf 2.67] testsuite: 233 failed]
Date: Tue, 21 Sep 2010 15:50:27 -0600
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv: Gecko/20100907 Fedora/3.1.3-1.fc13 Mnenhy/0.8.3 Thunderbird/3.1.3

On 09/21/2010 01:49 AM, Paolo Bonzini wrote:
> On 09/21/2010 02:37 AM, Eric Blake wrote:
>>
>> Maybe the sed script in file.sed is non-portable? It's certainly more
>> complex than the normal run-of-the-mill sed script. Or maybe it is that
>> the regex '.' has problems matching non-characters, and the definition
>> of the various locales determines whether 8-bit bytes are characters or
>> not. Is there any portable way to guarantee a single-byte locale where
>> '.' matches all possible 8-bit bytes?
>>
>> More testing shows that 'LC_ALL=en_US.ISO8859-1 sed' on Darwin gives the
>> desired results, so the problem is definitely a matter of whether the C
>> locale treats all 256 byte values as potential matches to '.'.
>
> I think that's a (pretty serious) Darwin bug.

The bug is limited to GNU sed, which happened to be first in PATH on the machine where I reproduced the problem (and I'm guessing that the same thing happened to rochan):

$ printf '\200\n' | LC_ALL=C /usr/bin/sed -n /./p | wc -l
$ printf '\200\n' | LC_ALL=C sed -n /./p | wc -l
$ which sed
$ sed --version | head -n1
GNU sed version 4.2

It's nice that the system sed is immune, but I wonder what GNU sed is getting tripped up on. Maybe the Autoconf fix is a matter of doing a best-tool search for a sed that handles 8-bit bytes, which would reject this broken GNU sed build in favor of the system sed, even with its other limitations?
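
A minimal sketch of what such a best-tool search could look like in plain shell, reusing the probe from the transcript above. The function names sed_handles_8bit and find_good_sed are hypothetical, and this is not taken from Autoconf; it only assumes that a usable sed prints exactly one line when '.' is matched against byte 0x80 in the C locale.

```shell
# Hypothetical helper: succeeds if the given sed lets '.' match the byte
# 0x80 under LC_ALL=C (the same probe as in the transcript above).
sed_handles_8bit () {
  count=`printf '\200\n' | LC_ALL=C "$1" -n /./p 2>/dev/null | wc -l`
  # Unquoted on purpose: some wc implementations pad the count with spaces.
  test ${count:-0} -eq 1
}

# Hypothetical helper: walk PATH in order and print the first executable
# named sed that passes the probe; fail if none does.
find_good_sed () {
  save_IFS=$IFS
  IFS=:
  for dir in $PATH; do
    IFS=$save_IFS
    test -z "$dir" && dir=.
    if test -x "$dir/sed" && sed_handles_8bit "$dir/sed"; then
      echo "$dir/sed"
      return 0
    fi
  done
  IFS=$save_IFS
  return 1
}

SED=`find_good_sed` || echo "no sed in PATH handles 8-bit bytes" >&2
```

With a search like this, a GNU sed build that fails the probe would be skipped even when it comes first in PATH, and a later system sed that passes would be picked instead.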

Eric Blake   address@hidden    +1-801-349-2682
Libvirt virtualization library http://libvirt.org
