bug-bash

Re: Trailing null fields are discarded while leading ones are preserved


From: Stahlman Family
Subject: Re: Trailing null fields are discarded while leading ones are preserved
Date: Wed, 20 Dec 2006 07:03:51 -0600


----- Original Message ----- From: "Chet Ramey" <chet.ramey@case.edu>
To: "Stahlman Family" <brettstahlman@comcast.net>
Cc: <bug-bash@gnu.org>; <chet@case.edu>
Sent: Tuesday, December 19, 2006 9:00 AM
Subject: Re: Trailing null fields are discarded while leading ones are preserved


Stahlman Family wrote:

I guess the question is, what is meant by "delimits a field" in the
following excerpt from the Bash manual?

"Any character in IFS that is not IFS whitespace, along with any
adjacent IFS whitespace characters, delimits a field."

I suppose I'm interpreting it to mean "separates fields". It occurred to
me that perhaps it means "terminates a field". The problem with that
definition, however, is that if I add any single character after the
final '|' in the example above, string_extract_verbatim will extract a
final field, which is not terminated by anything in IFS, but simply by
the end of the string. In that case, the final IFS delimiter is
separating the final two fields. The bottom line is that since the Bash
manual does not appear to distinguish between the cases of leading and
trailing null fields, it appears that an arbitrary design choice
determines that leading null fields are kept, and trailing ones are not.
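The asymmetry described above can be checked with a minimal sketch. The
original example string is not part of this message, so the sample strings
here are made up, and count_fields is a hypothetical helper rather than
anything Bash provides. It assumes a single non-whitespace IFS character.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the fields produced when $1 is split on '|'.
# The function body is a subshell, so the IFS and "set -f" changes
# do not leak into the caller.
count_fields() (
    IFS='|'      # '|' is a non-whitespace IFS character
    set -f       # disable pathname expansion; only word splitting applies
    set -- $1    # unquoted expansion: the shell performs field splitting
    echo "$#"
)

count_fields '|a|b'   # leading null field is kept:     3
count_fields 'a|b|'   # trailing null field is dropped: 2
count_fields 'a|b||'  # second trailing '|' terminates an empty field: 3
```

The third case is worth noting: only the null field after the very last
delimiter disappears, which is consistent with each delimiter terminating
a field.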

The Posix committee has debated this issue several times.  In fact, there
is a standards interpretation (from 1995!) declaring that "delimiter"
must be used as "field terminator" (and the standard consistently uses
"delimiter").

Ok. I have found an IEEE interpretation for 1003.2-1992 3.6.5
(interpretation #98) on the web, and I see that the behavior is correct.
The one thing the interpretation does not quite settle is this: if IFS
serves only to terminate fields, then how is it that adding any non-IFS
character after the final field delimiter creates a final field that is
"delimited" not by anything in IFS, but by the end of the original
(unsplit) word?

The only satisfactory answer I could come up with is that the final
field in that case is not being *created* by word splitting, but merely
retained; i.e., it is all that is left of the original word as it
existed prior to word splitting. All previous fields were created as a
result of encountering an IFS delimiter. Thus, the fields are sliced off
the front of the original word one delimiter at a time, and you are left
either with nothing (if the final char in the original word was an IFS
delimiter) or with some remaining portion of the original word. Is this
the correct way to look at it?
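The "sliced off the front" reading can be written down as a small model
and compared with what the shell itself does. This is only a sketch of
that mental model, not of how Bash implements splitting internally.
model_split and shell_split are hypothetical names, and the model handles
just one non-whitespace delimiter character (no IFS whitespace).

```shell
#!/usr/bin/env bash
# Model: every delimiter terminates a field, which is sliced off the front.
# Whatever remains at the end is retained only if it is non-empty.
model_split() (
    rest=$1
    delim=$2
    out=
    while case $rest in *"$delim"*) true ;; *) false ;; esac; do
        field=${rest%%"$delim"*}   # text before the first delimiter
        out="$out<$field>"
        rest=${rest#*"$delim"}     # drop that field and its terminator
    done
    if [ -n "$rest" ]; then
        out="$out<$rest>"          # leftover of the original word
    fi
    printf '%s\n' "$out"
)

# Reference: let the shell split the same string with IFS set to delim.
shell_split() (
    IFS=$2
    set -f
    set -- $1
    out=
    for f in "$@"; do
        out="$out<$f>"
    done
    printf '%s\n' "$out"
)

for s in 'a|b|' 'a|b|c' '|a|b'; do
    printf '%-6s model=%s shell=%s\n' \
        "$s" "$(model_split "$s" '|')" "$(shell_split "$s" '|')"
done
```

For these three inputs the model and the shell should produce the same
fields: `<a><b>`, `<a><b><c>` and `<><a><b>` respectively.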

Thanks,
   Brett S.


The Posix rules are at

http://www.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_06_05

Bash follows them faithfully.  The language isn't perfect, but there is
practical consensus among shell implementations.

Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
       Live Strong.  No day but today.
Chet Ramey, ITS, CWRU    chet@case.edu    http://cnswww.cns.cwru.edu/~chet/





