Re: Spaces in args, escapes, and command substitution

bug-bash

From:	Bob Proulx
Subject:	Re: Spaces in args, escapes, and command substitution
Date:	Sun, 29 Oct 2006 10:20:00 -0700
User-agent:	Mutt/1.5.9i

bash@zacglen.com wrote:
> >This is probably not an optimal solution because this is late night
> >time for me but this works:
> >
> >  eval vi $(grep -l PATTERN * | sed 's/ /\\ /')
> 
> Yes, that works.
> But surely such a grotesque syntax is not really in the spirit
> of concise unix expressions.

Actually I think it is in the spirit of Unix expressions.  Start with
small components.  Build up more complicated behaviors based upon
smaller behaviors.  I think that would be a very typical example.

> >  grep --null -l PATTERN * | xargs -r0 sed --in-place 's/foo/bar/'
> 
> Again, one really wants to be able to use normal and concise
> expressions.

Again, the basic problem is that the editor you chose to edit those
files is not argument safe.  It does not support zero terminated
strings for filenames read from a file.  That is the root of your
current issue.  If you want to fix something at the root cause then
the editor needs to be fixed.

If the 'seven' editor were an improved version of the 'six' editor
then it might have this.

  vii --null --files=<(grep --null -l PATTERN *)

But to the best of my knowledge no editor currently supports that type
of capability.  Fortunately 'find' can add it to any command.  That is
very much the Unix philosophy: build up more complex behavior using
available building blocks.
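
As a rough sketch of the 'find' approach, and not a tested recipe for
every system: the second -exec below only runs for files where the
grep succeeded, and '{} +' hands the names over as separate arguments
so spaces survive.  The function name edit_matching is made up for
illustration, and unlike 'grep -l PATTERN *' this descends into
subdirectories.

```shell
# edit_matching PATTERN DIR COMMAND [ARGS...]
# Hypothetical helper: run COMMAND on the files under DIR that
# contain PATTERN, passing each file name as its own argument.
edit_matching() {
  pattern=$1
  dir=$2
  shift 2
  # The first -exec acts as a test (grep -q exits 0 on a match).
  # The second -exec collects the surviving names and appends them
  # to COMMAND, in as few invocations as possible.
  find "$dir" -type f -exec grep -q -e "$pattern" {} \; -exec "$@" {} +
}
```

Something like 'edit_matching PATTERN . vi' would then start the
editor on the matching files.  Since the names never pass through
shell word splitting, no quoting or eval is needed.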

It would be reasonable to create a wrapper script that provided this
capability around the existing 'vi' editor.  It is not functionality
that many people have needed.  If it were then I am sure that the
functionality would have been put into it long ago.
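
A minimal sketch of such a wrapper, assuming bash (it relies on
arrays and 'read -d').  The name edit0 and its calling convention are
invented here for illustration.  It reads zero terminated names from
a file given as the first argument, so that standard input stays
connected to the terminal for the editor.

```shell
# edit0 LISTFILE [COMMAND [ARGS...]]
# Hypothetical wrapper: read zero terminated file names from LISTFILE
# and run COMMAND (default: vi) with one argument per name.
edit0() {
  local list=$1 name
  local -a files=()
  shift
  # Default to vi when no command is given.
  if [ "$#" -eq 0 ]; then
    set -- vi
  fi
  # read -d '' stops at each zero byte, so spaces and newlines
  # inside a name are kept intact.
  while IFS= read -r -d '' name; do
    files+=("$name")
  done < "$list"
  # Nothing listed: do not start the editor with no files.
  if [ "${#files[@]}" -eq 0 ]; then
    return 0
  fi
  "$@" "${files[@]}"
}
```

With process substitution this could be used as:

  edit0 <(grep --null -l PATTERN *)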

> >Although having spaces in filenames may be common in some cultures it
> >is definitely not in the Unix culture.
> 
> Well, can I also say that unicode and grotesque multi-byte and
> multi-lingual character sets weren't in the unix culture either.
> But that is never an excuse.

I don't get your point.  Unicode et al are only partially supported at
this time.  Everything is moving to UTF-8 at a rapid rate and one day
it will be fully supported.  The big need is there to do so and that
is motivating a lot of people to do the work to make it happen.  But
so far there has not been a similar need to handle spaces and newlines
in filenames and so that has not had the same effort put into it.

> >  rename 's/ /_/g' *
> 
> No! Often filenames must match some scheme, and arbitrarily renaming is
> not a good idea.  For example, suppose they are target of html href.
> Suppose they are target of database reference.

Uhm...  That was just making a point.

The point was that filenames with spaces and tabs and newlines are not
a natural fit.  The old joke is that I tell the doctor it hurts when I
do something and the doctor tells me don't do that.  The original
authors thirty years ago did not use the system that way and they
wrote the system for their own use.  If you avoid those problem areas
then you won't have those problems either.

> Excuse me - but really somebody should sit down and think very hard
> about this problem and come up with better working system, rather
> than just excuses.

Don't let me stop you from solving the problem.  If your solution is
better than anything else available then people will use it over the
alternatives.  So far I have not seen any solution that is better than
the existing ones.

> Why is it that word splitting never makes a distinction between
> newlines and space?  Because the output of grep -l, and ls, etc are
> clearly newline delimited.

The underlying filesystem allows every character in a filename except
the zero byte, which makes the zero byte the only safe filename
delimiter.  If zero bytes had been allowed too, for example by storing
names length encoded, then we would have a completely different
problem today.
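
To illustrate why newline is not a safe delimiter, here is a small
sketch.  It assumes a 'find' that supports -print0 (GNU and BSD find
do), and the two function names are made up for this example.  A
single file whose name contains a newline is counted twice by the
line based version but only once by the zero byte version.

```shell
# Count regular files under a directory by counting newlines in
# find's output.  A newline inside a name inflates the count.
count_by_newline() {
  find "$1" -type f | wc -l | tr -d ' '
}

# Count the same files by counting zero bytes instead.  A zero byte
# cannot occur inside a name, so each file is counted exactly once.
count_by_nul() {
  find "$1" -type f -print0 | tr -cd '\000' | wc -c | tr -d ' '
}
```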

> It is bash (and others) which quite deliberately reduce available
> information by converting all newlines and whitespace into a single
> space.

As already mentioned it is configurable using IFS.  [I am not a
good IFS hacker (yet, I am still learning) and so am not the one to
furnish examples of using IFS, sorry.  But I know how I would solve
the problem in other ways and I have furnished examples.]
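
For readers who want to see the IFS approach, a minimal sketch
follows.  It assumes that no file name contains a newline, and the
function name run_per_line is invented for illustration.  Setting IFS
to a single newline makes word splitting break the list only at line
ends, and 'set -f' stops the shell from expanding glob characters in
the names.  The body runs in a subshell so neither setting leaks out.

```shell
# run_per_line LIST COMMAND [ARGS...]
# Hypothetical helper: split LIST at newlines only and pass each
# line to COMMAND as one argument.
run_per_line() (
  list=$1
  shift
  # Split on newline only, so spaces and tabs stay inside a name.
  IFS='
'
  # Disable pathname expansion so names like '*' are passed as-is.
  set -f
  # $list is deliberately unquoted: this is where the splitting
  # happens.  "$@" is not affected by IFS.
  "$@" $list
)
```

Usage would look something like:

  run_per_line "$(grep -l PATTERN *)" vi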

But as to why: the shell has always done it this way, going back many
years.  It would be really nice to have a time machine to be able to
go back in time and whisper into the authors' ears and tell them to
use Unicode and to think about arbitrary filenames.  But practically,
UTF-8 would not be invented for many years, and the whole idea of
their system surviving as long as it has, and being used by people who
now want spaces and tabs and newlines in filenames, would be a shock
to them.  And if I really had a time machine I would probably use it
for other more important things, such as whispering in an ear, "use
the Motorola chip." :-)

> Something simple like "vi $(^grep -l xx *)" would do.
> The ^ might work because it denotes line-orientated regex (and nobody
> uses it for pipes any more).
> 
> The ball is quite clearly in bash's court.

I disagree.  I am not opposed to bash adding enhanced syntax to make
this easier.  But why is this a bash problem?  Why not a ksh problem?
Or a zsh problem?  I think this can easily be solved outside of bash
and then it would be portable to every shell.  That is why I favor the
'find' solution.  It will work with shells and programs that have not
yet been invented.

Bob
