bug-bash

Re: IFS field splitting doesn't conform with POSIX


From: Emanuele Torre
Subject: Re: IFS field splitting doesn't conform with POSIX
Date: Thu, 30 Mar 2023 17:52:39 +0200
User-agent: Mutt/2.2.10 (2023-03-25)

On Thu, Mar 30, 2023 at 07:51:59AM -0600, Felipe Contreras wrote:
> But you can't replicate 'a,b' that way, because b does not have a
> terminator. Obviously we'll want 'b' as a field, therefore one has to
> assume either 1) the end of the string is considered an implicit
> terminator, or 2) the terminator in the last field is optional.
> Neither of these two things is specified in POSIX.
> 
> If we consider 1) the end of the string is considered an implicit
> terminator, then 'a' contains a valid field, but then 'a,' contains
> *two* fields. Making these terminators indistinguishable from
> separators.

I repeatedly disputed this interpretation on IRC by saying that your
reasoning to come to this conclusion is that "',' can terminate a field,
and the end of the string can terminate a field, so ',' at the end is
two terminators".

If we extend that reasoning, 'a , b' with IFS=' ,' should be split into
four fields, because individually ' ', ',', ' ', and the end of the
string could all terminate a field.

That is obviously not the case, because POSIX clearly says that a field
is terminated by the longest match for either a single
non-IFS-whitespace character in IFS together with all the
IFS-whitespace characters in IFS around it, if any; or a
non-zero-length sequence of IFS-whitespace characters in IFS. So ' , '
is a single terminator.
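
To make that concrete, here is a minimal sketch that counts the fields
produced by an unquoted expansion. count_fields is a hypothetical
helper written for this message, not part of any shell; the count in
the comment is what bash and dash produce.

```sh
# count_fields STRING IFS_VALUE   (hypothetical helper)
# Prints how many fields the unquoted expansion of STRING yields.
# The body runs in a subshell so the caller's IFS and options are
# left untouched.
count_fields() (
    IFS=$2
    set -f        # disable pathname expansion; only splitting matters
    set -- $1     # unquoted on purpose: this is where splitting happens
    echo "$#"
)

count_fields 'a , b' ' ,'    # prints 2, not 4
```

The whole run ' , ' is consumed as one terminator, so only 'a' and 'b'
come out.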

You refuse to acknowledge that it does not make sense to claim that a
comma at the end of the string MUST yield an empty last field just
because a ',' and the "end of string" terminator can each terminate a
field individually.

The correct interpretation is that a field is implicitly terminated by
the end of the string if it is not explicitly terminated by a
terminator.
Even though this interpretation has been repeatedly proposed to you,
you do not even mention it here as a possible interpretation of the
specification. You still insist that the specification can only possibly
be interpreted in the two ways you mentioned.
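
The interpretation I am describing can be checked with a short sketch.
show_fields is a hypothetical helper written for this message; the
outputs in the comments are what bash and dash produce.

```sh
# show_fields STRING IFS_VALUE   (hypothetical helper)
# Prints each field of the unquoted expansion of STRING wrapped in <>.
# The body runs in a subshell so the caller's IFS and options are
# left untouched.
show_fields() (
    IFS=$2
    set -f        # disable pathname expansion; only splitting matters
    set -- $1     # unquoted on purpose: this is where splitting happens
    for f in "$@"; do
        printf '<%s>' "$f"
    done
    echo
)

show_fields 'a,b' ','    # <a><b>  'b' is terminated by the end of the string
show_fields 'a,'  ','    # <a>     ',' terminates 'a'; nothing is left to terminate
show_fields 'a,,' ','    # <a><>   the second ',' terminates an empty field
```

In the 'a,' case the end of the string terminates nothing, because no
unterminated field remains after the comma.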

How can you say that the current implementation that bash, dash, etc.
use is not compliant with the POSIX specification?

And why do you not acknowledge that the logic on which you base your
claim is flawed? That logic is: "',' can terminate a field individually
and end-of-string can terminate a field individually, so two of them in
a row must have an empty field between them, and this negates the
possibility that ',' at the end of the string can be considered a
single terminator".

If that is not what you are claiming, on what basis do you think that
bash's implementation of field splitting is incompatible with the POSIX
definition, given that you did not mention it as a possible
interpretation?

 emanuele6


