Re: make check fails if no en_US.iso88591 locale
From: Mike Gran
Subject: Re: make check fails if no en_US.iso88591 locale
Date: Thu, 10 Sep 2009 05:44:56 -0700
On Thu, 2009-09-10 at 12:27 +0200, Ludovic Courtès wrote:
> Hello!
>
> I built today’s ‘master’ on a ppc64 box and there are many
> regexp-related errors and a surprisingly high number of unresolved
> regexp-related tests:
>
> http://autobuild.josefsson.org/guile/log-200909100539539848000.txt
>
> This machine only has the following locales:
>
> C
> en_US.utf8
> POSIX
>
I'm not surprised to see the unresolved tests, since I'd wrapped a lot
of them to throw unresolved if a Latin-1 locale wasn't found. The
errors are a surprise: they indicate that my strategy for wrapping the
tests in a Latin-1 locale isn't correct.
The reason for declaring a Latin-1 locale was to allow
scm_to_locale_string and scm_from_locale_string to convert between a
Scheme string with character values from 0 to 255 and an 8-bit binary C
string. The regexp.test runs on arbitrary binary data, which wasn't a
problem in guile-1.8 since those functions did no real locale
conversion.
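
To illustrate why a Latin-1 locale was chosen, here is a minimal sketch
in Python (not Guile, and not the actual test code). It only shows the
encoding property being relied on: Latin-1 maps each byte value n to
code point n, so all 256 byte values round-trip, while ASCII rejects
128-255 and UTF-8 turns them into two bytes each. The helper name
encodable is made up for this illustration.

```python
# All 256 possible byte values, standing in for "arbitrary binary data".
raw = bytes(range(256))

# Latin-1 (ISO 8859-1) maps byte value n to code point n, so decoding
# and re-encoding is lossless for every byte.
text = raw.decode("latin-1")
assert [ord(ch) for ch in text] == list(range(256))
assert text.encode("latin-1") == raw


def encodable(s, encoding):
    """Hypothetical helper: True if every character of s fits the encoding."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# An ASCII-only conversion handles 0..127 but not 128..255.
ascii_low_ok = encodable(text[:128], "ascii")
ascii_high_ok = encodable(text[128:], "ascii")

# UTF-8 accepts every character, but 128..255 become two bytes each,
# so the result is no longer the original 8-bit string.
utf8_len = len(text.encode("utf-8"))
```

In a UTF-8-only environment such as the one in the build log, the same
string therefore cannot come out as the 256 single bytes the tests
expect.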
I could fix the test by testing only characters 0 to 127 in a C locale
if a Latin-1 locale can't be found. I could also fix it by using the
'setbinary' function, instead of calling 'setlocale', to force the
encodings on stdin and stdout to a default value that will pass binary
data through. The procedure 'setbinary' was always a hack, and I kind
of want to get rid of it, but this is why it was created.
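
The first option might be sketched roughly as follows, again in Python
rather than Guile. This is only an outline of the fallback logic under
stated assumptions: the candidate locale names other than
en_US.iso88591 are guesses (spellings vary between systems), and
pick_test_range is a made-up name. It probes LC_CTYPE and restores the
previous setting before returning.

```python
import locale

# Assumed candidate spellings of a Latin-1 locale; only the first one
# appears in this thread, the others are guesses.
LATIN1_CANDIDATES = ("en_US.iso88591", "en_US.ISO-8859-1", "en_US.ISO8859-1")


def pick_test_range(candidates=LATIN1_CANDIDATES):
    """Return (locale_name, character_values) for the regexp tests.

    If any candidate Latin-1 locale can be installed, test 0..255;
    otherwise fall back to the C locale and test only 0..127.
    """
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        for name in candidates:
            try:
                locale.setlocale(locale.LC_CTYPE, name)
                return name, range(256)
            except locale.Error:
                continue  # locale not installed, try the next spelling
        locale.setlocale(locale.LC_CTYPE, "C")
        return "C", range(128)
    finally:
        # Leave the process locale as we found it.
        locale.setlocale(locale.LC_CTYPE, saved)
```

On a machine with only C, en_US.utf8 and POSIX, this would select the
C locale and the reduced 0..127 range instead of marking the tests
unresolved.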
I looked in the POSIX spec on regular expressions for specific advice
on using byte values 128-255 in a regex in the C locale. I didn't see
anything offhand. The spec does spend a lot of time talking about the
interaction between the locale and regular expressions. I get the
impression from the spec that using regex on 128-255 in the C locale is
an unexpected use of regular expressions.
Thanks,
Mike