bug-bash

Re: bash 5.1 heredoc pipes problematic, shopt needed


From: Alexey
Subject: Re: bash 5.1 heredoc pipes problematic, shopt needed
Date: Mon, 25 Apr 2022 21:03:31 +0400
User-agent: Mail UserAgent

On 2022-04-25 17:14, Chet Ramey wrote:
On 4/24/22 4:26 PM, Alexey via Bug reports for the GNU Bourne Again SHell wrote:

My pipe size is 4kb, but...
   ulimit -p
   8

  { file /proc/self/fd/0; } <<<"$(dd if=/dev/urandom bs=1 count=$((4096*16)))"
   /proc/self/fd/0: symbolic link to pipe:[1427240]

  { file /proc/self/fd/0; } <<<"$(dd if=/dev/urandom bs=1 count=$((4096*17)))"
   /proc/self/fd/0: symbolic link to /tmp/sh-thd.Npifok (deleted)


It only becomes a file from a size of 65 KB. Bash 5.1.16

It's a good question. There's no system call for the kernel to report the
`pipe size', which is a fluid concept. There are only a couple of shells
that synthesize a value for it: bash and ksh93. Bash uses the value of
PIPE_BUF, which is defined on POSIX systems as the maximum number of bytes
that can be written to a pipe atomically. That's the 4096 number. The pipe
capacity is the number of bytes that can be written to the pipe -- by any
process -- without being read, before writes block or fail. That's 64K on
Linux.
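
The distinction between PIPE_BUF and pipe capacity can be checked from the shell. The following is a minimal sketch, assuming Linux (it relies on /proc/self/fd) and a bash new enough to support here-strings; the function names pipe_buf and herestring_backing are hypothetical helpers, not part of bash.

```shell
#!/usr/bin/env bash
# Linux-oriented sketch: herestring_backing relies on /proc/self/fd.

# Print PIPE_BUF, the POSIX limit for atomic writes to a pipe.
# This is not the pipe capacity.
pipe_buf() {
    getconf PIPE_BUF /
}

# Report what backs stdin for a here-string carrying N bytes of payload
# (the here-string itself appends one newline, so N+1 bytes are written).
# Prints "pipe", "file", or "unknown" when /proc is not available.
herestring_backing() {
    local n=$1 data target
    printf -v data '%*s' "$n" ''     # build a string of N spaces
    target=$(readlink /proc/self/fd/0 2>/dev/null <<<"$data")
    case $target in
        pipe:*) echo pipe ;;
        '')     echo unknown ;;
        *)      echo file ;;
    esac
}
```

Calling herestring_backing with sizes on either side of 64K (for example 65535 and 65536) should show where a given bash build switches from a pipe to a temporary file; the result depends on the bash version and the kernel.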

There is one more problem with pipes: they are extremely slow.

Examples:
GNU bash, version 5.1.16(1)-release (x86_64-pc-linux-gnu)

  1) Preparation: create two files with ASCII content, one that will be
     passed as a file and one that will be passed as a pipe in the
     redirection:
rm /tmp/testaP; for i in {1..65535}; do echo -n a >> /tmp/testaP; done
rm /tmp/testaF; for i in {1..65536}; do echo -n a >> /tmp/testaF; done

  2) read with here-string:
     a) from auto-pipe:
time for i in {1..100}; do read -r a <<<"$(cat /tmp/testaP)"; done
         real   0m1.224s
     b) from auto-file:
time for i in {1..100}; do read -r a <<<"$(cat /tmp/testaF)"; done
         real   0m0.403s

  3) read from process substitution (forced pipe):
         time for i in {1..100}; do read -r a < <(cat /tmp/testaF); done
         real   0m1.165s

The loop is only there to get more precise timings.

The examples above show that reading from a temporary file is approximately 3 times faster (example 2.b).
A very common optimization in my scripts is to replace process substitution
with a here-string wherever possible.
So if all data below 64K is passed through a pipe, it will be very slow.
The one exception is where it is important not to get the trailing "\n" that a here-string adds.
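
The trailing-newline difference mentioned above can be seen by counting bytes. This is a small sketch; the function names herestring_bytes and procsub_bytes are made up for illustration, and tr is only there to strip the padding some wc implementations print.

```shell
#!/usr/bin/env bash

# Count the bytes a command sees on stdin when fed through a here-string.
# Bash appends a newline to the here-string contents.
herestring_bytes() {
    wc -c <<<"$1" | tr -d '[:space:]'
}

# Count the bytes a command sees when fed through process substitution.
# printf '%s' writes the argument exactly, with no added newline.
procsub_bytes() {
    wc -c < <(printf '%s' "$1") | tr -d '[:space:]'
}

herestring_bytes abc; echo    # prints 4
procsub_bytes abc; echo       # prints 3
```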

Workaround: I can create the tmp file myself, but that increases code complexity:
time for i in {1..100}; do tmp=$(mktemp); cat /tmp/testaP > "$tmp"; read -r a < "$tmp"; rm "$tmp"; done
         real   0m0.463s
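
The extra complexity of the workaround can be contained in a helper. The following is a sketch only: read_via_tmpfile is a hypothetical name, and unlike the one-liner above it sets IFS to empty so that leading and trailing whitespace in the line is preserved.

```shell
#!/usr/bin/env bash

# read_via_tmpfile VAR command [args...]
# Run the command with its output sent to a temporary file, then read the
# first line of that file into VAR. Reading from a regular file lets bash
# seek, so it does not have to read one byte at a time.
read_via_tmpfile() {
    local __tmp __status
    __tmp=$(mktemp) || return
    "${@:2}" > "$__tmp"
    IFS= read -r "$1" < "$__tmp"
    __status=$?          # non-zero if the data had no trailing newline
    rm -f "$__tmp"
    return "$__status"
}

# Usage, with the test file from the preparation step:
#   read_via_tmpfile a cat /tmp/testaP
```

As with plain read, the variable is still assigned when the data ends without a newline; only the return status differs.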

IMHO, the solution to all of these pipe problems is to get rid of single-byte reads from pipes. This could be achieved by creating a buffer in bash that reads all available data (up to one pipe capacity: 64K) from the pipe at once and then scans that buffer byte by byte for the delimiter. Such a buffer would solve the problem of pipes not supporting seek.
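
The idea of reading everything first and then scanning for the delimiter can be imitated at script level. This is only an analogy under stated assumptions, not a description of bash internals or of any existing option: first_line_buffered is a hypothetical helper, it relies on command substitution to collect the whole output, and command substitution drops NUL bytes and trailing newlines.

```shell
#!/usr/bin/env bash

# first_line_buffered VAR command [args...]
# Collect the command's entire output into a buffer, then locate the first
# newline in memory and store everything before it in VAR.
first_line_buffered() {
    local __buf
    __buf=$("${@:2}")                         # read all of the output at once
    printf -v "$1" '%s' "${__buf%%$'\n'*}"    # keep text before the first newline
}

# Usage, with the test file from the preparation step:
#   first_line_buffered a cat /tmp/testaP
```

Unlike read, this consumes all of the command's output rather than stopping after the first line, which is the trade-off a buffer inside bash would also have to deal with on a non-seekable pipe.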


Regards,
Alexey.


