Re: Guix for Corporate "Batch Jobs"?
From: Phil
Subject: Re: Guix for Corporate "Batch Jobs"?
Date: Tue, 08 Mar 2022 23:18:50 +0000
User-agent: mu4e 1.4.15; emacs 27.2
Hi Yasu,
Yasuaki Kudo writes:
> Hi,
>
> In many so-called Application Support jobs in the enterprises, one of the
> core responsibilities is to see through the daily completion of "batch jobs"
> - those I/O heavy processes that take a long time to run, even with parallel
> processing.
>
> And at the core of it is to "re-run" the jobs, after due troubleshooting.
>
> In many workplaces I have seen, teams ended up writing their own job
> schedulers based on cron or used proprietary software such as Autosys (and in
> Japan, there are local brews such as A-Auto, if I remember the name
> correctly).
Not sure if this is exactly what you're looking for - but Guix in my
experience can sit at the centre of a tech-stack for providing software
on machines, and then batch-running that software in a very predictable way.
However, Guix is currently first and foremost a command-line tool, so I
find myself augmenting it with other standard offerings to produce
familiar front-ends for triggers, job processing, management, etc.
A few examples below.
I oversee the use of Guix in an enterprise environment. Initially it
was used to build/test our software and also provide deployments with
dependencies etc. We wrapped Guix builds in Jenkins, which in turn
integrates with our source control to trigger Guix using a standard
branch workflow developers are used to. Guix fetches and caches any
build dependencies making subsequent builds faster, and making artifacts
available via a Guix substitute server to servers across the enterprise.
More recently and probably more useful to you - I've been looking at
taking the build outputs and making them available as batch jobs using
Guix Workflow Language (https://guixwl.org) - which is a good fit if
your batches are compute jobs with well-defined inputs, numerous
dependent stages, and the requirement to reproduce identical numerical
output. GWL provides lots of cool features - it's somewhat like Autosys
in that it is declarative - defining dependencies (and thus an order)
between different workflow processes etc. I don't think GWL can memoize
different processes in a workflow though - so running a workflow several
times results in all workflow processes being run, as far as I know.
The point is you should be guaranteed the same result with the same
inputs, every time.
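
To illustrate what "declarative" buys you here: you state which process
depends on which, and the run order falls out of that. The sketch below
is plain Python, not GWL syntax or the GWL API, and the process names are
invented for the example. It also re-runs nothing selectively, in line
with my understanding that every process runs each time.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each process maps to the processes it depends on.
dependencies = {
    "fetch-inputs": set(),
    "clean-data": {"fetch-inputs"},
    "price-trades": {"clean-data"},
    "aggregate-risk": {"price-trades"},
    "report": {"aggregate-risk", "clean-data"},
}


def run_order(deps):
    """Return one valid execution order for the declared dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)
```

Nothing in the dictionary says "run this first"; the order is derived
from the dependencies alone, which is the property that makes re-runs
predictable.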
I tend to wrap the GWL scripts in Rundeck (job scheduler) to allow
less-technical staff to re-run batches through a web app or to construct
a daily schedule for overnight/regression tests etc, rather than use the
guix command line.
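
As a rough idea of what such a wrapper can look like: a scheduler job
step only needs to call something like the script below. This is a
sketch under assumptions, not what we actually run; the file name
nightly.w and the function names are hypothetical, and it assumes GWL is
installed so that "guix workflow run" is available.

```python
import shlex
import subprocess


def gwl_command(workflow_file):
    """Build the command line that runs a GWL workflow file."""
    return ["guix", "workflow", "run", workflow_file]


def rerun(workflow_file, dry_run=True):
    """Re-run a workflow; with dry_run=True only return the command string."""
    cmd = gwl_command(workflow_file)
    if dry_run:
        return shlex.join(cmd)
    # check=True raises CalledProcessError on a non-zero exit status,
    # so the scheduler sees the job as failed.
    return subprocess.run(cmd, check=True).returncode


# "nightly.w" is a made-up workflow file name.
print(rerun("nightly.w"))
```

Keeping the dry-run path as the default makes it easy to show operators
exactly what will be executed before they trigger a real re-run.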
Note GWL isn't designed to be used if the aim of your batch jobs is to
have a side-effect on the server you're running on. We only use it to
produce results from calculations. This is different to Autosys where
each job could be entirely made up of side-effects which change the
state of the server itself.
HTH,
Phil.