On 4/25/22 1:03 PM, Alexey wrote:
> There is one more problem with pipes -- they are extremely slow.

It's not pipes per se -- it's the semantics of the shell `read' builtin
and standard input. Profiling or a system call tracer would have
provided insight.

The shell is very careful not to `steal' input from processes it runs.
This means that

    seq 10 | { read one ; cat ; }

will `behead' one line -- and only one line -- from the standard input.

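A minimal sketch of that example, wrapped in a function so it can be run
and checked. The function name `behead' is only an illustration; nothing
in the shell is called that.

```shell
# `read' consumes exactly one line from the pipe; cat, a separate
# process, receives everything that follows it.
behead() {
    seq 10 | { read -r one ; cat ; }
}

behead    # prints 2 through 10, one number per line
```

The first line ends up in the variable `one' inside the group command,
and cat never sees it.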
`read' can read ahead, but only under specific conditions. The input
file descriptor has to be seekable, so the shell can undo any readahead
before invoking another process. With regular files, this is easy --
lseek works with a negative offset. With pipes, which are not seekable,
it is not.

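The same experiment with a regular file, as a sketch: the shell is free
to read a whole buffer here, but cat still has to see every line after
the first, which is what seeking back guarantees. The names `tmpfile'
and `behead_file' are made up for the example.

```shell
# Put the same ten lines in a regular (seekable) file.
tmpfile=$(mktemp)
trap 'rm -f "$tmpfile"' EXIT
seq 10 > "$tmpfile"

# `read' may buffer more than one line from the file, but the file
# offset must point just past line 1 by the time cat starts reading.
behead_file() {
    { read -r one ; cat ; } < "$tmpfile"
}

behead_file    # prints 2 through 10, same as the pipe version
```

The visible result is identical to the pipe case; only the number of
read system calls needed to get there differs.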
So the time it takes to read the command substitution output is
identical for each run. The time for a single write to a pipe or a temp
file is virtually identical as well.

The difference is the read builtin: when reading from the pipe, you
have to issue 65536 one-character reads. When reading from a file, since
it's seekable, you don't (bash-5.2 happens to use a 4K buffer, so it
issues 17 reads).

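A rough way to reproduce the comparison, assuming a single line of
65536 characters (this is not the original poster's test script; the
variable and function names are invented). With a 4K buffer, 65536 /
4096 = 16 full buffers, so 17 reads is about what one would expect once
the trailing newline is counted.

```shell
# Build one 65536-character line and store a copy in a temp file.
len=65536
line=$(printf '%*s' "$len" '' | tr ' ' x)
longfile=$(mktemp)
trap 'rm -f "$longfile"' EXIT
printf '%s\n' "$line" > "$longfile"

# Pipe: not seekable, so `read' has to go one character at a time.
read_from_pipe() {
    printf '%s\n' "$line" | { read -r l ; echo "${#l}" ; }
}

# Regular file: seekable, so `read' can fill a buffer per system call.
read_from_file() {
    read -r l < "$longfile"
    echo "${#l}"
}

read_from_pipe    # prints 65536
read_from_file    # prints 65536
```

Both functions return the same data. To see the difference, run each
one under bash's `time' keyword or under a system call tracer and
compare the number of read calls.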
The upshot is that this doesn't say all that much about pipes vs. temp
files in general, but it demonstrates that it makes a huge difference
when you read a single extremely long line using the `read' builtin.