bug-bash
Re: [bug-bash] Named fifo's causing hanging bash scripts


From: Jonathan Hankins
Subject: Re: [bug-bash] Named fifo's causing hanging bash scripts
Date: Fri, 16 Jan 2015 10:58:16 -0600

Dr. Fink,

Have you tried getting rid of the stderr redirect on your find command to make sure find isn't showing any errors?

If you eliminate most of the inside of your while loop, does it still hang?  For example:

while IFS="|" read link link_dir link_dest; do
    echo "$link,$link_dir,$link_dest"
done < <(find . -type l -printf '%p|%h|%l\n' 2>/dev/null)
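
A standalone reproducer along these lines might also help narrow it down. This is only a sketch: the function names (list_links, make_scratch_tree, stress_links) and the scratch-tree layout are made up for illustration and are not taken from your brp-25-symlink script. It uses read -r, assumes GNU find for -printf, leaves find's stderr visible, and repeats the loop because the hang is reported as random.

```shell
#!/usr/bin/env bash
# Hypothetical reproducer; the names and the tree layout are illustrative only.

# list_links DIR: print "link,dir,target" for every symlink under DIR.
# find's stderr is deliberately NOT redirected, so any errors stay visible.
list_links() {
    local dir=$1 link link_dir link_dest
    while IFS="|" read -r link link_dir link_dest; do
        echo "$link,$link_dir,$link_dest"
    done < <(cd "$dir" && find . -type l -printf '%p|%h|%l\n')
}

# make_scratch_tree: create a temporary directory holding two symlinks
# (one relative, one dangling absolute) and print its path.
make_scratch_tree() {
    local tmp
    tmp=$(mktemp -d) || return 1
    mkdir -p "$tmp/a/b"
    ln -s ../target1 "$tmp/a/link1"
    ln -s /nonexistent "$tmp/a/b/link2"
    echo "$tmp"
}

# stress_links DIR [COUNT]: run the loop COUNT times (default 100).
# If bash hangs in wait4(), this function never prints its final line.
stress_links() {
    local dir=$1 count=${2:-100} i
    for ((i = 1; i <= count; i++)); do
        list_links "$dir" >/dev/null || return 1
    done
    echo "completed $count iterations"
}
```

Something like tree=$(make_scratch_tree); stress_links "$tree" 1000; rm -rf "$tree" would exercise it. If that never hangs while the full script does, the problem is more likely in the loop body than in the process substitution itself.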

-Jonathan Hankins


On Fri, Jan 16, 2015 at 9:46 AM, Chet Ramey <chet.ramey@case.edu> wrote:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 1/16/15 10:32 AM, Dr. Werner Fink wrote:
> On Fri, Jan 16, 2015 at 09:22:36AM -0500, Chet Ramey wrote:
>> On 1/13/15 4:29 AM, Dr. Werner Fink wrote:
>>
>>>>> Bash Version: 4.3
>>>>> Patch Level: 33
>>>>> Release Status: release
>>>>>
>>>>> Description:
>>>>>         Named FIFOs cause hangs: bash scripts like
>>>>>
>>>>>         while IFS="|" read a b c ; do
>>>>>           [shell code]
>>>>>         done < <(shell code)
>>>>>
>>>>>         can hang at random.  strace shows that bash
>>>>>         stays in wait4()
>>>>
>>>> And when you attach to one of the hanging bash processes using gdb, what
>>>> does the stack traceback look like?
>>>
>>> Yes (and sorry for the wrong email address as this was done on a clean virtual system)
>>>
>>> there are two hanging bash processes together with the find command:
>>>
>>> werner   19062  0.8  0.0  11864  2868 ttyS0    S+   10:21   0:00 bash -x /tmp/brp-25-symlink
>>> werner   19063  0.0  0.0  11860  1920 ttyS0    S+   10:21   0:00 bash -x /tmp/brp-25-symlink
>>> werner   19064  0.2  0.0  16684  2516 ttyS0    S+   10:21   0:00 find . -type l -printf %p|%h|%l n
>>>
>>> gdb -p 19062 and gdb -p 19063 both show
>>>
>>> (gdb) bt
>>> #0  0x00007f530818a65c in waitpid () from /lib64/libc.so.6
>>> #1  0x000000000042b233 in waitchld (block=block@entry=1, wpid=19175) at jobs.c:3235
>>> #2  0x000000000042c6da in wait_for (pid=pid@entry=19175) at jobs.c:2496
>>
>> What do ps and gdb tell you about pid 19175 (and the corresponding pid in
>> the call to waitchld in the other traceback)?  Running, terminated, reaped,
>> other?
>
>   d136:~ # ps 10942
>     PID TTY      STAT   TIME COMMAND
>   d136:~ #
>
> ... the process does not exist anymore. I guess that it could belong to
> the sed commands of the script.

This is why I need to be able to reproduce it.  If the process got reaped,
when would it have happened and why would the call to wait_for() have
found a valid CHILD struct for it?  The whole loop runs with SIGCHLD
blocked, so it's not as if the signal handler could have reaped the
child out from under it.  I have questions but no way to find answers.


- --
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    chet@case.edu    http://cnswww.cns.cwru.edu/~chet/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (Darwin)

iEYEARECAAYFAlS5MjoACgkQu1hp8GTqdKvN5ACeK9XEiIQ1glUHC4hEF3ZTKJjL
dUkAoI6nnxKypXP3MFns6/TyaOHNmHL5
=x3Ck
-----END PGP SIGNATURE-----




--
------------------------------------------------------------------------
Jonathan Hankins    Homewood City Schools

The simplest thought, like the concept of the number one,
has an elaborate logical underpinning. - Carl Sagan

jhankins@homewood.k12.al.us
------------------------------------------------------------------------

