
Re: Graceful Process / Script Termination


From: Nachtigall, Jens (I/EK-Z6)
Subject: Re: Graceful Process / Script Termination
Date: Thu, 17 Sep 2020 15:49:20 +0000

Dear all,

 

So after diving into this another time, I need to correct myself to some extent (yay to that).

Handling trapped signals works nicely both on the local host and remotely (my bad).

I guess I got caught up in the complexity of my own scripts.

 

During further research I came across this question by Ole Tange, who was facing a similar issue:
https://unix.stackexchange.com/questions/40023/get-ssh-to-forward-signals

 

It is solved as documented here:

https://www.gnu.org/software/parallel/parallel_design.html#The-remote-system-wrapper
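In essence (a heavily simplified sketch of the idea only, not GNU Parallel's actual wrapper code), the remote side babysits the job and tidies up once its parent, the shell spawned by sshd, goes away:

# Sketch only: run the (hypothetical) job in the background, then poll
# whether the parent shell is still alive. When the ssh connection dies,
# the parent disappears and the orphaned job gets terminated here.
./the_job &
jobpid=$!
while kill -0 "$PPID" 2>/dev/null; do
    sleep 1                 # parent still alive: keep waiting
done
kill -TERM "$jobpid"        # ssh link gone: clean up the job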

 

The only misleading thing is that GNU Parallel terminates and returns immediately regardless.

However, it installs a watchdog remotely to tidy up (thanks for the effort taken here!) but cannot return feedback to the user, as the ssh tunnel has already been killed as well.
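One way to see the remote watchdog at work (a minimal sketch; "server" and the log path are placeholders, and runner.sh must exist on the remote host) is to have the trap write to a file there, since stdout is gone once the ssh tunnel dies:

# In runner.sh, log the cleanup to a file in addition to stdout:
function _do_cleanup() {
    echo "Stopping swarm at $(date)" >> /tmp/runner-cleanup.log
    exit 0
}
trap _do_cleanup EXIT

$> parallel -S server ./runner.sh ::: 1
^C
$> ssh server cat /tmp/runner-cleanup.log    # confirms the cleanup ran remotely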

 

For your documentation.

 

All the best and keep up the good work,

Jens

 

---------------------------------------------------------------------------------

Dear all,

 

first of all, thank you for the inspiring work on GNU Parallel. I came across it only very recently and am trying to integrate the data processing of a public research project with it.

 

For various reasons, the processing of a single multi-GB file is heavily multithreaded, since it depends on a multitude of tools.

In this particular case we chose to encapsulate the toolchain as a Docker swarm, which we instantiate multiple times, once per file, as many as necessary.

To leverage the potential of multiple servers for processing, we thought of employing GNU Parallel to distribute these jobs.
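For context, the intended invocation is roughly the following (just a sketch; server names, input files and the wrapper name are placeholders):

$> parallel -S server1,server2 ./runner.sh {} ::: input1.raw input2.raw

One wrapper run per input file; each run brings up its own docker-compose stack on whichever server it lands on.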

 

However, when using GNU Parallel I am unable to drive the state machine of the swarms properly.

While calling 'docker-compose up' from the wrapper scripts works just fine, terminating through 'docker-compose down' is impossible.

Instead, GNU Parallel terminates the wrapper script immediately, and signals do not get trapped by the script.

 

To isolate the swarm instances from each other, we have extensive resource management wrapped around them.

Hence GNU Parallel cannot call docker-compose directly but depends on the wrapper script setting up the environment.

 

Judging from the extensive examples given in the manpage and the tutorial, I am aware that this is probably not the most typical use case for GNU Parallel.

Nonetheless, I’ve tried to replicate the issue with a very simple example below, which yields the same behaviour.

In no case is the signal handler called. This applies both to the local host ":" and to remote servers.

 

Any advice on how to have parallel send the INT/TERM signal and subsequently wait for the process to terminate gracefully?

I suspect this is an issue of routing signals. What am I missing here?

 

Version: 20200822

 

All the best,

Jens

 

========================runner.sh===============================

#!/bin/bash

function _do_cleanup() {
    echo "Stopping swarm"
    sleep 5                  # docker-compose down (and releasing exclusive resources)
    exit 0
}

trap _do_cleanup EXIT        # alternatively: SIGINT SIGTERM

echo "Running Docker swarm"
/bin/sleep 100 &             # docker-compose up ... (allocating exclusive resources)
wait                         # wait in the foreground so bash can actually receive the signal

exit 0

==============================================================
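As a sanity check outside GNU Parallel (expected behaviour with the corrected trap line; bash runs the EXIT trap on the way out after Ctrl-C):

$> ./runner.sh
Running Docker swarm
^C
Stopping swarm

For reference, the --termseq below reads as signal,milliseconds pairs: send TERM, wait 10 s; then INT, wait 10 s; then KILL, wait 25 ms.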

 

Option A) with bash

$> parallel --line-buffer --termseq TERM,10000,INT,10000,KILL,25 ./runner.sh ::: 1

Running Docker swarm

^C

 

Option B) replacing the foreground process via exec

$> parallel --line-buffer --termseq TERM,10000,INT,10000,KILL,25 exec ./runner.sh ::: 1

Running Docker swarm

^C

 

Option C) having a bash function exported, as the original use case does.

$> function do_runner() { ./runner.sh; }

$> export -f do_runner

$> parallel --env do_runner --line-buffer --termseq TERM,10000,INT,10000,KILL,25 do_runner ::: 1

Running Docker swarm

^C

