bug-bash

Trailing null fields are discarded while leading ones are preserved


From: Stahlman Family
Subject: Trailing null fields are discarded while leading ones are preserved
Date: Mon, 18 Dec 2006 06:54:06 -0600

Although the section on word splitting in the Bash manual makes no distinction between leading and trailing null fields, Bash's word splitting algorithm will not preserve a trailing null field. The reason for this appears to be the following logic in list_string (much omitted):

for (result = (WORD_LIST *)NULL, sindex = 0; string[sindex]; )
{
   .
   .
   current_word = string_extract_verbatim (string, slen, &sindex, separators);
   .
   .
   ADVANCE_CHAR (string, slen, sindex);
   .
   .
}

If an IFS delimiter is the last character in string, string_extract_verbatim stops on it and ADVANCE_CHAR moves past it. At that point string[sindex] is the terminating NUL, so the loop condition fails and the for loop exits before string_extract_verbatim is called for the empty string between the final delimiter and the NUL. The following set of commands illustrates the problem:

$ IFS='|'

# Create a string that should split into 5 fields: a leading null,
# 3 non-null interior fields, and a trailing null
$ a='|a|b|c|'

$ set -- $a

$ echo $#
4

$ echo "$1,$2,$3,$4"
,a,b,c

After reading the section on word splitting, I expected $# to be 5 rather than 4, and I expected the final echo to print

,a,b,c,

As you can see, the leading null field is kept but the trailing one is not. Is this by design?
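
The asymmetry can also be checked in isolation with a short script. This is only a sketch: count_fields is a hypothetical helper written for this illustration (it is not part of Bash), and it sets IFS locally so the rest of the session is not affected.

```shell
# count_fields: hypothetical helper, not a Bash builtin.
# Splits its argument on '|' and prints the number of resulting fields.
count_fields() {
    local IFS='|'
    set -- $1      # left unquoted on purpose so that word splitting happens
    echo "$#"
}

leading=$(count_fields '|a')       # delimiter at the start only
trailing=$(count_fields 'a|')      # delimiter at the end only
both=$(count_fields '|a|b|c|')     # the string from the report above

echo "leading=$leading trailing=$trailing both=$both"
# prints: leading=2 trailing=1 both=4
```

The leading delimiter produces an extra (null) field, while the trailing delimiter does not.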

I guess the question is, what is meant by "delimits a field" in the following 
excerpt from the Bash manual?

"Any character in IFS that is not IFS whitespace, along with any adjacent IFS 
whitespace characters, delimits a field."

I have been interpreting it to mean "separates fields". It occurred to me that it might instead mean "terminates a field". The problem with that reading is that if I add any single character after the final '|' in the example above, string_extract_verbatim extracts a final field that is terminated not by anything in IFS but simply by the end of the string. In that case, the final IFS delimiter is separating the last two fields.

The bottom line is that, since the Bash manual does not appear to distinguish between leading and trailing null fields, the choice to keep leading null fields and discard trailing ones looks like an arbitrary design decision.
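
The "separates" versus "terminates" question can be probed with two variations on the original string. Again this is a sketch, not a statement about how list_string is meant to behave: count_fields is a hypothetical helper, and the second variation (a doubled trailing delimiter) is an addition that does not appear in the example above.

```shell
# count_fields: hypothetical helper, not a Bash builtin.
# Splits its argument on '|' and prints the number of resulting fields.
count_fields() {
    local IFS='|'
    set -- $1      # left unquoted on purpose so that word splitting happens
    echo "$#"
}

plain=$(count_fields '|a|b|c|')         # the original string
extra_char=$(count_fields '|a|b|c|d')   # one character after the final '|'
extra_delim=$(count_fields '|a|b|c||')  # a second trailing '|'

echo "plain=$plain extra_char=$extra_char extra_delim=$extra_delim"
# prints: plain=4 extra_char=5 extra_delim=5
```

In the extra_char case the fifth field ends at the end of the string rather than at a delimiter. In the extra_delim case a trailing null field does appear, but only because a further delimiter follows it.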

Thanks,
Brett S.



