[PATCH] IBM z/OS + EBCDIC support
From: Daniel Richard G.
Subject: [PATCH] IBM z/OS + EBCDIC support
Date: Mon, 21 Sep 2015 22:28:47 -0400
Hello list,
The attached patch, against Git master, addresses numerous
incompatibilities in Gnulib with IBM z/OS (a mainframe operating system)
and the EBCDIC encoding.
With my changes, Gnulib builds successfully, and most of the tests
succeed. The remaining failures are as follows.
These appear to expose bugs in the system implementation and have been
reported to IBM (a few others have already received APAR fixes):
FAIL: test-fdopendir
FAIL: test-getopt
FAIL: test-mbsrtowcs1.sh
A number of floating-point tests appear to be in the same boat. These
failure modes have yet to be evaluated:
FAIL: test-fma2
FAIL: test-fmaf2
FAIL: test-fmodl-ieee
FAIL: test-isinf
FAIL: test-isnan
FAIL: test-isnanl-nolibm
FAIL: test-isnanl
FAIL: test-ldexpf
FAIL: test-remainderl-ieee
FAIL: test-truncl-ieee
These require more investigation and/or discussion on this list:
FAIL: test-perror.sh
FAIL: test-poll
FAIL: test-select-in.sh
FAIL: test-select-out.sh
FAIL: test-sigpipe.sh
FAIL: test-symlink
FAIL: test-symlinkat
One more issue for now: In order to build Gnulib on this system, it is
necessary to use a compiler wrapper script, due to the inexplicably
broken way xlc handles #include paths. I recently submitted some changes
to Gawk to work around this (look in the feature/zOS-try2 branch,
m4/arch.m4 file; search for "zos-cc"). It's possible that a similar
workaround will need to be bundled here.
In any event, below is a walk-through of my changes in the patch.
Comments and questions are welcome.
+++ lib/alloca.in.h
* z/OS has the alloca() definitions in stdlib.h.
+++ lib/c-ctype.c
* Implementing ctype functions that support EBCDIC from scratch is not
feasible, not least because there isn't even one specific EBCDIC
variant that should be targeted. So I just call through to the system
routines, while ensuring that the compile-time environment is set
correctly, and working around the system routines' input-range issues
with signed chars.
* In EBCDIC, normal chars like 'A' occur in the upper half of the 8-bit
range. This interferes with the idiom of using "switch (c)" and then
"case 'A':" et al., because c can have two distinct values (-63 when it
comes from a signed char, 193 when it comes from an unsigned one) that
should both match 'A'.
My fix, then, is a macro which converts the input codepoint to the
range that will match literal chars, when necessary. (Obviously, in
ASCII, it's a no-op.) Any takers on a better name for this macro than
CHAR_LITERAL()?
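To illustrate the two c-ctype.c points above, here is a minimal sketch, not the code from the patch. TO_LITERAL_RANGE(), is_hex_letter() and sys_isalpha() are hypothetical names, and the real CHAR_LITERAL() may well be defined differently. The sketch assumes that converting an out-of-range value to char wraps modulo 256, which is implementation-defined but holds on common compilers.

```c
#include <ctype.h>

/* Hypothetical stand-in for CHAR_LITERAL(): fold any int that carries a
   char value (sign-extended or not) into the range that plain character
   literals such as 'A' occupy on this platform.  For 7-bit ASCII input
   the value is unchanged.  */
#define TO_LITERAL_RANGE(c) ((int) (char) (unsigned char) (c))

/* The "switch (c) ... case 'A':" idiom, made safe against aliased input.  */
int
is_hex_letter (int c)
{
  switch (TO_LITERAL_RANGE (c))
    {
    case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
    case 'a': case 'b': case 'c': case 'd': case 'e': case 'f':
      return 1;
    default:
      return 0;
    }
}

/* Calling through to the system routine.  The <ctype.h> functions are
   only defined for EOF and values representable as unsigned char, so
   negative char values are folded first.  Callers must handle EOF
   themselves, because the cast turns it into an ordinary byte value.  */
int
sys_isalpha (int c)
{
  return isalpha ((unsigned char) c) != 0;
}
```

On an ASCII system, is_hex_letter ('A') and is_hex_letter ('A' - 256) both return 1; the second call simulates the kind of aliased value that shows up as -63 under EBCDIC.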
+++ lib/c-ctype.h
* Ensure that ASCII optimizations are applied only when building in
ASCII.
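A sketch of how such a guard might look. EXEC_CHARSET_IS_ASCII and sketch_isupper() are hypothetical names, not taken from the patch, and the probe only spot-checks a few characters where a real header would check them all. The point is that a range comparison is only valid in ASCII: in EBCDIC the uppercase letters sit in three separate runs (A-I, J-R, S-Z) with other codepoints in between.

```c
/* Hypothetical compile-time probe of the execution character set.
   Only a handful of characters are checked here.  */
#if ('0' == 48 && '9' == 57 && 'A' == 65 && 'Z' == 90 \
     && 'a' == 97 && 'z' == 122)
# define EXEC_CHARSET_IS_ASCII 1
#else
# define EXEC_CHARSET_IS_ASCII 0
#endif

int
sketch_isupper (int c)
{
#if EXEC_CHARSET_IS_ASCII
  /* ASCII optimization: the uppercase letters are contiguous.  */
  return c >= 'A' && c <= 'Z';
#else
  /* Portable fallback: makes no assumption about letter ordering.  */
  switch (c)
    {
    case 'A': case 'B': case 'C': case 'D': case 'E': case 'F': case 'G':
    case 'H': case 'I': case 'J': case 'K': case 'L': case 'M': case 'N':
    case 'O': case 'P': case 'Q': case 'R': case 'S': case 'T': case 'U':
    case 'V': case 'W': case 'X': case 'Y': case 'Z':
      return 1;
    default:
      return 0;
    }
#endif
}
```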
+++ lib/fnmatch.c
* Fixed an error from __GNUC__ not being defined.
+++ lib/get-rusage-as.c
* Added z/OS awareness.
+++ lib/glob.c
* Avoid this #define on z/OS, because...
$ grep alloca /usr/include/stdlib.h
#ifndef alloca
#define alloca(x) __alloca(x)
#pragma linkage(__alloca,builtin)
void *__alloca(unsigned int x);
+++ lib/glthread/thread.c
* Added z/OS awareness. pthread_t does not have a .p field on z/OS, but
this does otherwise seem to apply.
For what it's worth, this is pthread_t, from /usr/include/sys/types.h:
typedef struct {
char __[0x08];
} pthread_t;
+++ lib/glthread/thread.h
* Best guess at a gl_thread implementation for z/OS.
+++ lib/math.in.h
* The system defines these functions as macros, and the compiler did not
like seeing them redefined.
+++ lib/ptsname_r.c
* Likewise.
+++ lib/regex.h
* Ensure that "__string" does not expand to "1" when it is used as a
formal parameter name.
+++ lib/string.in.h
* Likewise.
+++ lib/strtod.c
* The system strtod() sets ERANGE for some reason when parsing "0x".
* It also returns a value of 0.0 for "nan()".
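For reference, the two strtod() quirks can be probed with something like the following. This is a hypothetical sketch, not code from the patch; the function names are made up. It encodes what C99 expects: for "0x" the parse stops after the "0" with errno untouched, and "nan()" yields a NaN with all five characters consumed.

```c
#include <errno.h>
#include <stdlib.h>

/* Returns 1 if strtod ("0x") yields 0, stops right after the '0',
   and leaves errno alone; returns 0 otherwise.  */
int
strtod_handles_bare_0x (void)
{
  const char *input = "0x";
  char *end;
  double value;

  errno = 0;
  value = strtod (input, &end);
  return value == 0.0 && end == input + 1 && errno == 0;
}

/* Returns 1 if strtod ("nan()") yields a NaN and consumes the whole
   string; returns 0 otherwise.  A NaN is the only value that compares
   unequal to itself.  */
int
strtod_handles_empty_nan (void)
{
  const char *input = "nan()";
  char *end;
  double value = strtod (input, &end);

  return value != value && end == input + 5;
}
```

Going by the description above, the z/OS strtod() would make both probes return 0.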
+++ m4/fclose.m4
* This system has a broken fclose(); without this bit, the test-fclose
test fails:
$ ./test-fclose
/path/to/gltests/test-fclose.c:74: assertion 'lseek (fd, 0, SEEK_CUR) == 3' failed
CEE5207E The signal SIGABRT was received.
ABORT instruction
However, the existing conditions didn't enable it, so I added a
host-platform check.
+++ m4/strstr.m4
* The IBM runtime sucks; signal delivery is delayed until strstr()
exits, so this test results in a hang that can only be SIGKILL'ed.
+++ m4/wchar_h.m4
* The linker on this system cares way too much about the object file's
original name.
Slightly longer explanation: In 64-bit builds, the toolchain uses the
XPLINK object format (as opposed to GOFF for 31-bit builds). XPLINK
has the notion of CSECTs, and these are named. By default, the main
code CSECT is named after the source-file basename. If the linker
encounters two CSECTs with the same name, it will consider them to be
duplicates, and discard one---even if they contain completely
orthogonal definitions.
This can be worked around by specifying the CSECT names explicitly
with -qcsect=foobaz (using different values of "foobaz" for the two
files), but IMO it is easier just to compile the two source files for
these tests from differently-named source files in the first place.
+++ tests/infinity.h
* xlc doesn't like constant div-by-zero expressions.
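One common way around this, sketched here under the assumption of IEEE arithmetic, is to hide the zero from the compiler's constant folder so the division happens at run time. runtime_infinity() is a hypothetical name and this is not necessarily what the patch does.

```c
#include <float.h>

/* Produce +Infinity without writing a constant "1.0 / 0.0" expression.
   The zero lives in a non-const static object, so the compiler is not
   looking at a constant division by zero.  */
double
runtime_infinity (void)
{
  static double zero = 0.0;
  return 1.0 / zero;
}
```

A caller can sanity-check the result with runtime_infinity () > DBL_MAX.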
+++ tests/nan.h
* z/OS, in addition to supporting IEEE floating-point, also supports an
older "hexadecimal" format that does not support NaN. Bomb out if this
is in use.
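A sketch of what such a guard could look like. It assumes that FLT_RADIX from <float.h> is 16 when the hexadecimal format is selected, which I have not verified against the z/OS headers; the patch may test something else entirely. runtime_nan() is a hypothetical name.

```c
#include <float.h>

/* Assumption: hexadecimal floating-point shows up as FLT_RADIX == 16.  */
#if FLT_RADIX == 16
# error "hexadecimal floating-point has no NaN; build with IEEE floating-point"
#endif

/* Produce a NaN at run time (0.0 / 0.0 under IEEE arithmetic), again
   without a constant division by zero in the source.  */
double
runtime_nan (void)
{
  static double zero = 0.0;
  return zero / zero;
}
```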
+++ tests/test-c-ctype.c
* We need the same CHAR_LITERAL() hack here as in c-ctype.c.
+++ tests/test-c-strcasecmp.c
* In EBCDIC-1047, the tests
ASSERT (c_strcasecmp ("turkish", "TURK\304\260SH") < 0);
ASSERT (c_strcasecmp ("TURK\304\260SH", "turkish") > 0);
are actually
ASSERT (c_strcasecmp ("turkish", "TURKD¬SH") < 0);
ASSERT (c_strcasecmp ("TURKD¬SH", "turkish") > 0);
which, of course, fail.
+++ tests/test-c-strncasecmp.c
* Likewise.
+++ tests/test-canonicalize-lgpl.c
* Addressed a strange z/OS corner case. This system has
DOUBLE_SLASH_IS_DISTINCT_ROOT, yet "/" and "//" report the same dev/ino
numbers.
+++ tests/test-iconv-utf.c
* When compiling in (normal) EBCDIC mode on z/OS, the compiler
translates char and string literals to EBCDIC. (Numerical escapes like
"\346" are not remapped.) This messes up the test, because the input
strings are supposed to have their literal characters represented in
ASCII. So I moved all the input strings to the top of the file, added
an appropriate compiler #pragma to change the conversion behavior, and
modified the tests to refer to these.
(Note that a #define would not work for the input strings, because the
text is converted at the point of use, not the point of definition.)
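Roughly, the arrangement looks like the sketch below. It assumes the pragma in question is xlc's "#pragma convert" with a matching "#pragma convert(pop)", guarded by __MVS__; the names LITERALS_CONVERTED, ascii_input and input_is_ascii_encoded() are made up for illustration. On other compilers the guarded lines are simply skipped.

```c
#ifdef __MVS__
/* Assumption: this makes xlc emit the literals below in ISO 8859-1
   (an ASCII superset) instead of EBCDIC.  */
# pragma convert("ISO8859-1")
# define LITERALS_CONVERTED 1
#endif

/* Input strings are defined once, here, inside the converted region.
   A #define would not help: the literal would be translated wherever
   the macro is expanded, not where it is defined.  */
static const char ascii_input[] = "Hello";

#ifdef LITERALS_CONVERTED
# pragma convert(pop)
#endif

/* Returns 1 if the bytes of ascii_input are the ASCII codes for
   'H' (0x48) and 'o' (0x6F), regardless of the execution charset.  */
int
input_is_ascii_encoded (void)
{
  return (unsigned char) ascii_input[0] == 0x48
         && (unsigned char) ascii_input[4] == 0x6F;
}
```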
+++ tests/test-iconv.c
* The system iconv implementation does not recognize "ISO-8859-1", but
it does recognize "ISO8859-1".
* Similar issue with converting input strings. (This leaves open the
possibility that any ASSERT() failures will be reported in ISO 8859-1,
not EBCDIC, thus resulting in gibberish on the user's terminal. But I
kept the changes to the minimum needed to get this test to pass. I can
do the full nine yards if desired.)
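The charset-name issue can also be handled by trying both spellings in turn. The following is a hypothetical sketch of that idea (open_latin1_to_utf8() is a made-up helper, and the patch may simply switch the name instead); it uses only the standard iconv_open() call.

```c
#include <iconv.h>
#include <stddef.h>

/* Open a Latin-1 to UTF-8 converter, trying the hyphenated name first
   and the z/OS-style spelling second.  Returns (iconv_t) -1 if neither
   name is recognized.  */
iconv_t
open_latin1_to_utf8 (void)
{
  static const char *const latin1_names[] = { "ISO-8859-1", "ISO8859-1" };
  size_t i;

  for (i = 0; i < sizeof latin1_names / sizeof latin1_names[0]; i++)
    {
      iconv_t cd = iconv_open ("UTF-8", latin1_names[i]);
      if (cd != (iconv_t) -1)
        return cd;
    }
  return (iconv_t) -1;
}
```

The caller is responsible for passing the returned descriptor to iconv_close() when done.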
+++ tests/test-nonblocking-pipe.h
* Added z/OS awareness. (I tested this and found that exact
boundary value; the test fails with 131072.)
+++ tests/test-nonblocking-reader.h
* Nonblocking read() returns EWOULDBLOCK on this system.
+++ tests/test-nonblocking-writer.h
* Nonblocking write() returns EWOULDBLOCK on this system.
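Since POSIX allows EAGAIN and EWOULDBLOCK to be distinct values, code that wants to be portable here has to accept either. A minimal sketch, with hypothetical helper names and a made-up -2 return convention (the patched tests may be structured differently):

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if the errno value means "nothing to read / no room to
   write right now" on a non-blocking descriptor.  */
int
errno_means_would_block (int err)
{
  return err == EAGAIN || err == EWOULDBLOCK;
}

/* read() wrapper for non-blocking descriptors.  Returns the byte count,
   0 at end of file, -1 on a real error, or -2 if the call would have
   blocked.  */
ssize_t
read_nonblocking (int fd, void *buf, size_t count)
{
  ssize_t n = read (fd, buf, count);

  if (n < 0 && errno_means_would_block (errno))
    return -2;
  return n;
}
```

The same errno check applies unchanged on the write() side.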
+++ tests/test-sigpipe.sh
* Fixed an apparent typo.
+++ tests/test-wcwidth.c
* Only run ASCII-specific tests in ASCII mode.
--Daniel
--
Daniel Richard G. || address@hidden
My ASCII-art .sig got a bad case of Times New Roman.
gnulib-zos-v1.patch
Description: Text Data