bug-gnulib

[PATCH] IBM z/OS + EBCDIC support


From: Daniel Richard G.
Subject: [PATCH] IBM z/OS + EBCDIC support
Date: Mon, 21 Sep 2015 22:28:47 -0400

Hello list,

The attached patch, against Git master, addresses numerous
incompatibilities in Gnulib with IBM z/OS (a mainframe operating system)
and the EBCDIC encoding.

With my changes, Gnulib builds successfully, and most of the tests
succeed. The remaining failures are as follows.

These appear to expose bugs in the system implementation and have been
reported to IBM (a few others have already received APAR fixes):

    FAIL: test-fdopendir
    FAIL: test-getopt
    FAIL: test-mbsrtowcs1.sh

A number of floating-point tests appear to be in the same boat. These
failure modes have yet to be evaluated:

    FAIL: test-fma2
    FAIL: test-fmaf2
    FAIL: test-fmodl-ieee
    FAIL: test-isinf
    FAIL: test-isnan
    FAIL: test-isnanl-nolibm
    FAIL: test-isnanl
    FAIL: test-ldexpf
    FAIL: test-remainderl-ieee
    FAIL: test-truncl-ieee

These require more investigation and/or discussion on this list:

    FAIL: test-perror.sh
    FAIL: test-poll
    FAIL: test-select-in.sh
    FAIL: test-select-out.sh
    FAIL: test-sigpipe.sh
    FAIL: test-symlink
    FAIL: test-symlinkat

One more issue for now: In order to build Gnulib on this system, it is
necessary to use a compiler wrapper script, due to the inexplicably
broken way xlc handles #include paths. I recently submitted some changes
to Gawk to work around this (look in the feature/zOS-try2 branch,
m4/arch.m4 file; search for "zos-cc"). It's possible that a similar
workaround will need to be bundled here.


In any event, below is a walk-through of my changes in the patch.
Comments and questions are welcome.


+++ lib/alloca.in.h

* z/OS has the alloca() definitions in stdlib.h.

+++ lib/c-ctype.c

* Implementing ctype functions that support EBCDIC from scratch is not
  feasible, not least because there isn't even one specific EBCDIC
  variant that should be targeted. So I just call through to the system
  routines, while ensuring that the compile-time environment is set
  correctly, and working around the system routines' input-range issues
  with signed chars.

* In EBCDIC, normal chars like 'A' occur in the upper half of the 8-bit
  range. This interferes with the idiom of using "switch (c)" and then
  "case 'A':" et al. because c can have two distinct values (-63 and
  193) that should match to 'A'.

  My fix, then, is a macro which converts the input codepoint to the
  range that will match literal chars, when necessary. (Obviously, in
  ASCII, it's a no-op.) Any takers on a better name for this macro than
  CHAR_LITERAL()?
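
  To illustrate the idea, here is a simplified sketch. It is not the
  macro from the patch; the helper names are made up, and it assumes
  8-bit chars whose values arrive either sign-extended or not.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper, for illustration only.  C may come from a
   signed or an unsigned 8-bit char; return the value that a plain
   char literal for the same character has on this platform.  */
static int
sketch_char_literal (int c)
{
  enum { NCHARS = UCHAR_MAX + 1 };

  assert (-NCHARS / 2 <= c && c < NCHARS);
  if (CHAR_MIN < 0)
    /* Plain char is signed: fold 128..255 down to -128..-1.  */
    return c > CHAR_MAX ? c - NCHARS : c;
  else
    /* Plain char is unsigned: fold -128..-1 up to 128..255.  */
    return c < 0 ? c + NCHARS : c;
}

/* The "switch (c)" idiom, now matching either representation.  */
static int
sketch_is_letter_a (int c)
{
  switch (sketch_char_literal (c))
    {
    case 'A':
    case 'a':
      return 1;
    default:
      return 0;
    }
}
```

  In ASCII every literal of interest is below 128, so neither branch
  changes anything and the helper is effectively a no-op.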

+++ lib/c-ctype.h

* Ensure that ASCII optimizations are applied only when building in
  ASCII.
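
  A rough sketch of what such a guard can look like. The macro and
  function names are made up for illustration and are not the ones in
  c-ctype.h, and the sketch assumes the preprocessor evaluates
  character constants in the execution character set.

```c
#include <assert.h>

/* Hypothetical compile-time test, for illustration only: treat the
   execution character set as ASCII if a few representative literals
   have their ASCII values.  In EBCDIC-1047, 'A' is 193 and '0' is
   240, so the check fails there.  */
#if 'A' == 65 && 'Z' == 90 && 'a' == 97 && 'z' == 122 \
    && '0' == 48 && ' ' == 32
# define SKETCH_IS_ASCII 1
#else
# define SKETCH_IS_ASCII 0
#endif

/* Upper-case test with an ASCII-only fast path and a fallback.
   C is assumed to be in the same range as character literals.  */
static int
sketch_isupper (int c)
{
#if SKETCH_IS_ASCII
  /* The letters are contiguous in ASCII: one range check suffices.  */
  return c >= 'A' && c <= 'Z';
#else
  /* In EBCDIC, A-I, J-R and S-Z form three separate runs.  */
  return ((c >= 'A' && c <= 'I')
          || (c >= 'J' && c <= 'R')
          || (c >= 'S' && c <= 'Z'));
#endif
}

/* Quick self-check of the sketch.  */
void
sketch_isupper_selfcheck (void)
{
  assert (sketch_isupper ('Q'));
  assert (!sketch_isupper ('q'));
}
```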

+++ lib/fnmatch.c

* Fixed an error from __GNUC__ not being defined.

+++ lib/get-rusage-as.c

* Added z/OS awareness.

+++ lib/glob.c

* Avoid this #define on z/OS, because...

    $ grep alloca /usr/include/stdlib.h
            #ifndef alloca
              #define alloca(x) __alloca(x)
                #pragma linkage(__alloca,builtin)
                void *__alloca(unsigned int x);

+++ lib/glthread/thread.c

* Added z/OS awareness. pthread_t does not have a .p field on z/OS, but
  this does otherwise seem to apply.

  For what it's worth, this is pthread_t, from /usr/include/sys/types.h:

          typedef struct {
                     char __[0x08];
          } pthread_t;

+++ lib/glthread/thread.h

* Best guess at a gl_thread implementation for z/OS.

+++ lib/math.in.h

* The system defines these functions as macros, and the compiler did not
  like seeing them redefined.

+++ lib/ptsname_r.c

* Likewise.

+++ lib/regex.h

* Ensure that "__string" does not expand to "1" when it is used as a
  formal parameter name.

+++ lib/string.in.h

* Likewise.

+++ lib/strtod.c

* The system strtod() sets ERANGE for some reason when parsing "0x".

* It also returns a value of 0.0 for "nan()".
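
  For reference, the two behaviours can be probed with something like
  the following sketch. The function names are made up for
  illustration and this is not the code in the patch; the expected
  results are those of a conforming strtod().

```c
#include <assert.h>
#include <errno.h>
#include <math.h>
#include <stdlib.h>

/* Returns 1 if strtod ("0x") parses just the "0", stops before the
   'x', and does not set errno.  */
static int
sketch_strtod_handles_bare_0x (void)
{
  const char *input = "0x";
  char *end;
  double value;

  errno = 0;
  value = strtod (input, &end);
  return value == 0.0 && end == input + 1 && errno == 0;
}

/* Returns 1 if strtod ("nan()") consumes the whole string and
   returns a NaN.  */
static int
sketch_strtod_handles_nan_parens (void)
{
  const char *input = "nan()";
  char *end;
  double value = strtod (input, &end);

  return isnan (value) && end == input + 5;
}

/* Abort if either probe fails.  */
void
sketch_require_conforming_strtod (void)
{
  assert (sketch_strtod_handles_bare_0x ());
  assert (sketch_strtod_handles_nan_parens ());
}
```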

+++ m4/fclose.m4

* This system has a broken fclose(); without this bit, the test-fclose
  test fails:

    $ ./test-fclose
    /path/to/gltests/test-fclose.c:74: assertion 'lseek (fd, 0, SEEK_CUR) == 3' failed
    CEE5207E The signal SIGABRT was received.
    ABORT instruction

  However, the existing conditions didn't enable it, so I added a
  host-platform check.

+++ m4/strstr.m4

* The IBM runtime sucks; signal delivery is delayed until strstr()
  exits, so this test results in a hang that can only be SIGKILL'ed.

+++ m4/wchar_h.m4

* The linker on this system cares way too much about the object file's
  original name.

  Slightly longer explanation: In 64-bit builds, the toolchain uses the
  XPLINK object format (as opposed to GOFF for 31-bit builds). XPLINK
  has the notion of CSECTs, and these are named. By default, the main
  code CSECT is named after the source-file basename. If the linker
  encounters two CSECTs with the same name, it will consider them to be
  duplicates, and discard one---even if they contain completely
  orthogonal definitions.

  This can be worked around by specifying the CSECT names explicitly
  with -qcsect=foobaz (using different values of "foobaz" for the two
  files), but IMO it is easier just to compile the two source files for
  these tests from differently-named source files in the first place.

+++ tests/infinity.h

* xlc doesn't like constant div-by-zero expressions.
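
  The usual way around this is to divide by a variable rather than by
  a literal zero. A minimal sketch, assuming IEEE arithmetic (the
  helper name is made up and this is not the exact code in the patch):

```c
#include <assert.h>
#include <float.h>

/* Hypothetical helper: +Infinity computed at run time.  ZERO is a
   volatile variable, so "1.0 / zero" is not a constant expression
   for the compiler to fold or complain about.  */
static double
sketch_infinity (void)
{
  static volatile double zero = 0.0;
  double inf = 1.0 / zero;

  assert (inf > DBL_MAX);
  return inf;
}
```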

+++ tests/nan.h

* z/OS, in addition to supporting IEEE floating-point, also supports an
  older "hexadecimal" format that does not support NaN. Bomb out if this
  is in use.
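
  A sketch of such a guard follows. It assumes that the hexadecimal
  format shows up as FLT_RADIX == 16 in <float.h>; the patch may well
  test a compiler-specific macro instead, and the helper name is made
  up.

```c
#include <assert.h>
#include <float.h>

/* Assumption: hexadecimal floating-point reports a radix of 16.  */
#if FLT_RADIX == 16
# error "NaN is not available in hexadecimal floating-point; please rebuild in IEEE mode."
#endif

/* Hypothetical helper: a quiet NaN computed at run time as 0.0 / 0.0,
   using a volatile zero so that nothing is folded at compile time.  */
static double
sketch_nan (void)
{
  static volatile double zero = 0.0;
  double nan_value = zero / zero;

  /* A NaN is the only value that compares unequal to itself.  */
  assert (nan_value != nan_value);
  return nan_value;
}
```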

+++ tests/test-c-ctype.c

* We need the same CHAR_LITERAL() hack here as in c-ctype.c.

+++ tests/test-c-strcasecmp.c

* In EBCDIC-1047, the tests

    ASSERT (c_strcasecmp ("turkish", "TURK\304\260SH") < 0);
    ASSERT (c_strcasecmp ("TURK\304\260SH", "turkish") > 0);

  are actually

    ASSERT (c_strcasecmp ("turkish", "TURKD¬SH") < 0);
    ASSERT (c_strcasecmp ("TURKD¬SH", "turkish") > 0);

  which, of course, fail.

+++ tests/test-c-strncasecmp.c

* Likewise.

+++ tests/test-canonicalize-lgpl.c

* Addressed a strange z/OS corner case. This system has
  DOUBLE_SLASH_IS_DISTINCT_ROOT, yet the dev/ino numbers are the same.

+++ tests/test-iconv-utf.c

* When compiling in (normal) EBCDIC mode on z/OS, the compiler
  translates char and string literals to EBCDIC. (Numerical escapes like
  "\346" are not remapped.) This messes up the test, because the input
  strings are supposed to have their literal characters represented in
  ASCII. So I moved all the input strings to the top of the file, added
  an appropriate compiler #pragma to change the conversion behavior, and
  modified the tests to refer to these.

  (Note that a #define would not work for the input strings, because the
  text is converted at the point of use, not the point of definition.)
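
  Roughly, the layout looks like the sketch below. The identifiers and
  strings are made up for illustration, and the detection via __MVS__
  and __IBMC__ and the exact #pragma convert spelling are assumptions
  to check against the compiler documentation.

```c
#include <assert.h>

/* Assumption: on z/OS XL C, #pragma convert selects the code page
   for the literals that follow, and convert(pop) restores the
   previous setting.  Other compilers never see the pragma.  */
#if defined __MVS__ && defined __IBMC__
# pragma convert("ISO8859-1")
# define SKETCH_CONVERT_ENABLED 1
#endif

/* Test inputs, kept together so that one pragma region covers them
   all.  Numeric escapes are raw bytes and are never remapped.  */
static const char sketch_input_ascii[] = "abc";
static const char sketch_input_latin1[] = "\304rger";

#ifdef SKETCH_CONVERT_ENABLED
# pragma convert(pop)
#endif

/* Returns 1 if the inputs are ASCII/Latin-1 encoded, whatever the
   default execution character set is.  */
static int
sketch_inputs_are_ascii (void)
{
  return (sketch_input_ascii[0] == 0x61
          && (unsigned char) sketch_input_latin1[0] == 0xC4
          && sketch_input_latin1[1] == 0x72);
}

/* Abort if the literals were translated after all.  */
void
sketch_check_inputs (void)
{
  assert (sketch_inputs_are_ascii ());
}
```

  The tests then refer to sketch_input_ascii and friends by name
  instead of spelling out the literal at the point of use.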

+++ tests/test-iconv.c

* The system iconv implementation does not recognize "ISO-8859-1", but
  it does recognize "ISO8859-1".

* Same input-string conversion issue as in test-iconv-utf.c. (This
  leaves open the possibility that any ASSERT() failures will be
  reported in ISO 8859-1, not EBCDIC, resulting in gibberish on the
  user's terminal. I kept the changes to the minimum needed to get this
  test to pass, but I can go the whole nine yards if desired.)
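
  On the encoding-name point, one way to cope with both spellings is
  to try them in turn. This is only a sketch with a made-up helper
  name, not necessarily what the patch does:

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: open a converter from Latin-1 to TOCODE,
   falling back to the spelling without the first hyphen if the
   usual one is rejected.  Returns (iconv_t) -1 if both fail.  */
static iconv_t
sketch_open_from_latin1 (const char *tocode)
{
  static const char *const names[] = { "ISO-8859-1", "ISO8859-1" };
  size_t i;

  assert (tocode != NULL);
  for (i = 0; i < sizeof names / sizeof names[0]; i++)
    {
      iconv_t cd = iconv_open (tocode, names[i]);
      if (cd != (iconv_t) -1)
        return cd;
    }
  return (iconv_t) -1;
}
```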

+++ tests/test-nonblocking-pipe.h

* Added z/OS awareness. (I tested this and found that exact
  boundary value; the test fails with 131072.)

+++ tests/test-nonblocking-reader.h

* Nonblocking read() returns EWOULDBLOCK on this system.

+++ tests/test-nonblocking-writer.h

* Nonblocking write() returns EWOULDBLOCK on this system.
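
  POSIX allows EAGAIN and EWOULDBLOCK to be either the same value or
  two different ones, so a check that works for both the reader and
  the writer accepts either. A minimal sketch (the helper name is made
  up, not taken from the patch):

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical helper: true if ERR means "the nonblocking call
   would have blocked", under either name.  */
static int
sketch_would_block (int err)
{
  return err == EAGAIN || err == EWOULDBLOCK;
}

/* Quick self-check of the sketch.  */
void
sketch_would_block_selfcheck (void)
{
  assert (sketch_would_block (EAGAIN));
  assert (sketch_would_block (EWOULDBLOCK));
}
```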

+++ tests/test-sigpipe.sh

* Fixed an apparent typo.

+++ tests/test-wcwidth.c

* Only run ASCII-specific tests in ASCII mode.


--Daniel


-- 
Daniel Richard G. || address@hidden
My ASCII-art .sig got a bad case of Times New Roman.

Attachment: gnulib-zos-v1.patch
Description: Text Data

