bug-gnulib

Re: [PATCH] IBM z/OS + EBCDIC support


From: Daniel Richard G.
Subject: Re: [PATCH] IBM z/OS + EBCDIC support
Date: Tue, 22 Sep 2015 19:44:54 -0400

On Tue, 2015 Sep 22 15:03-0700, Paul Eggert wrote:
> Thanks for explaining.  I still see a problem with the proposed patch,
> though, in that (if I'm understanding it correctly) it would cause
> c_isalpha (120) to succeed, even though EBCDIC 120 corresponds to
> U+00CC LATIN CAPITAL LETTER I WITH GRAVE, and that is not supposed to
> be an alphabetic character in the stripped-down C locale.  Code that
> uses c-ctype wants only ASCII letters, and departing from this would
> likely break things.

How would that match occur? c_isalpha() was/is using a "switch"
statement for EBCDIC.
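
For illustration, here is a minimal sketch of the switch approach. It is not the actual gnulib code, and the name sketch_isalpha is hypothetical. Because the case labels are character literals, the compiler translates them into the execution character set, so on an EBCDIC 1047 build the value 120 matches no label and falls through to the default.

```c
#include <stdbool.h>

/* Hypothetical sketch, not gnulib's c_isalpha(): accept only the 52
   basic Latin letters, whatever their numeric codes are in the
   execution character set the compiler targets.  */
static bool
sketch_isalpha (int c)
{
  switch (c)
    {
    case 'a': case 'b': case 'c': case 'd': case 'e': case 'f':
    case 'g': case 'h': case 'i': case 'j': case 'k': case 'l':
    case 'm': case 'n': case 'o': case 'p': case 'q': case 'r':
    case 's': case 't': case 'u': case 'v': case 'w': case 'x':
    case 'y': case 'z':
    case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
    case 'G': case 'H': case 'I': case 'J': case 'K': case 'L':
    case 'M': case 'N': case 'O': case 'P': case 'Q': case 'R':
    case 'S': case 'T': case 'U': case 'V': case 'W': case 'X':
    case 'Y': case 'Z':
      return true;
    default:
      /* Everything else, including accented letters, is rejected.  */
      return false;
    }
}
```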

> Worse, the C expression "c_ispunct ('[')" might return false, as the
> library may be in a locale that's incompatible with the mode the
> compiler was in when it compiled the '['.

If the user builds in one locale and runs in another, they're going to
have bigger problems (e.g. garbled program messages). As far as I've
seen, this is considered "out of bounds" in z/OS usage.

> Looking at the web page you mentioned, it appears that one approach is
> to assume EBCDIC 1047 (this seems to be the default and typical
> setting for C programs) at both compile-time and run-time.  We can
> check the compile-time assumption without any code overhead.  The
> proposed patch does that.  If someone really wants to use a different
> code page, either at compile-time or at run-time, more code will need
> to be written (most likely by the poor soul who actually needs that
> feature).

A different code page at run time, I think, is not feasible. But
international users will at least want a different code page at
compile time.

A simple program could generate tables for all the isxxxxx() functions
(see below) at compile time. Would you be inclined to do it that way?
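
A rough sketch of what such a generator could look like follows. The helper name emit_table and the table layout are assumptions made for illustration, not anything that exists in gnulib. It asks a predicate such as isalpha() about every value in [0, 255] and writes the answers out as a C array that could be compiled into the library.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: write a 256-entry lookup table named NAME to
   OUT.  Entry C is 1 if PRED (C) is nonzero and 0 otherwise.  PRED is
   any <ctype.h>-style predicate, e.g. isalpha.  */
static void
emit_table (FILE *out, const char *name, int (*pred) (int))
{
  fprintf (out, "static const unsigned char %s[256] = {", name);
  for (int c = 0; c < 256; c++)
    {
      /* Start a new row every 16 entries to keep the output readable.  */
      if (c % 16 == 0)
        fputs ("\n ", out);
      fprintf (out, " %d,", pred (c) != 0);
    }
  fputs ("\n};\n", out);
}
```

A build step would call this once per predicate, for example emit_table (stdout, "alpha_table", isalpha), on a system using the target code page, since the results reflect whatever the local isxxxxx() functions report.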

> > Yes, all control characters appear to be in [\x00-\x3F], but not
> > everything in that range is a control character. (I remember 0x04
> > was not.) I tried making c_iscntrl() a simple range check at first,
> > but that did not agree with the system iscntrl().
>
> Thanks, this should be fixed in the attached patch, which I've
> installed.
> Email had 1 attachment:
> + 0001-c-ctype-assume-EBCDIC-1047-for-c_iscntrl.patch
>   3k (text/x-patch)

I'll try that out. I wasn't expecting you to all but rewrite c-ctype!

Just to help inform the discussion, I've attached a small program that
shows the output of the various isxxxxx() functions for all values in
[0, 255], and its output on z/OS with EBCDIC-1047 and -D_ALL_SOURCE.
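
For readers without access to the attachment, a program of this kind could be sketched roughly as below. This is a hypothetical stand-in and not the attached ctype.c; the names format_row and dump_all and the output format are invented for illustration.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical stand-in for the attached program: format one line
   showing what each <ctype.h> predicate reports for the value C.  */
static void
format_row (int c, char *buf, size_t size)
{
  snprintf (buf, size,
            "0x%02X alnum=%d alpha=%d cntrl=%d digit=%d graph=%d "
            "lower=%d print=%d punct=%d space=%d upper=%d xdigit=%d",
            (unsigned int) c,
            isalnum (c) != 0, isalpha (c) != 0, iscntrl (c) != 0,
            isdigit (c) != 0, isgraph (c) != 0, islower (c) != 0,
            isprint (c) != 0, ispunct (c) != 0, isspace (c) != 0,
            isupper (c) != 0, isxdigit (c) != 0);
}

/* Print one row for every value in [0, 255].  */
static void
dump_all (FILE *out)
{
  char row[160];
  for (int c = 0; c < 256; c++)
    {
      format_row (c, row, sizeof row);
      fprintf (out, "%s\n", row);
    }
}
```

Calling dump_all (stdout) on an ASCII system and on z/OS, then comparing the two listings, shows where the system's classification differs from a simple range check such as the one tried for c_iscntrl().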

It goes to show: where mainframes are concerned, nothing's easy :]


--Daniel


-- 
Daniel Richard G. || address@hidden
My ASCII-art .sig got a bad case of Times New Roman.

Attachment: ctype.c
Description: Text Data

Attachment: ctype-ebcdic1047.txt
Description: Text document

