Re: Guix and openmpi in a container environment

From: Todor Kondić
Subject: Re: Guix and openmpi in a container environment
Date: Mon, 27 Jan 2020 10:54:59 +0000

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Sunday, 19 January 2020 11:25, Todor Kondić <address@hidden> wrote:

> I am getting mpirun errors when trying to execute a simple
> mpirun -np 1 program
> (where program is e.g. 'ls') command in a container environment.
> The error is usually:
> All nodes which are allocated for this job are already filled.
> which makes no sense, as I am trying this on my workstation (single socket, 
> four cores -- your off-the-shelf i5 cpu) and no scheduling system enabled.
> I set up the container with this command:
> guix environment -C -N --ad-hoc -m default.scm
> where default.scm:
> (use-modules (guix packages))
> (specifications->manifest
> `(;; Utilities
> "less"
> "bash"
> "make"
> "openssh"
> "guile"
> "nano"
> "glibc-locales"
> "gcc-toolchain@7.4.0"
> "gfortran-toolchain@7.4.0"
> "python"
> "openmpi"
> "fftw"
> "fftw-openmpi"
> ,@(map (lambda (x) (package-name x)) %base-packages)))
> Simply installing openmpi (guix package -i openmpi) in my usual Guix profile 
> just works out of the box. So, there has to be some quirk where the openmpi 
> container installation is blind to some settings within the usual environment.

For the environment above, if the mpirun invocation is changed to name the
host and its slot count explicitly,

mpirun --host $HOSTNAME:4 -np 4 ls

ls is executed in four processes, and the output is the contents of the
current directory repeated four times, as expected.
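
To avoid retyping the host and slot count, the workaround can be wrapped in a
small shell function. This is only a sketch: mpirun_host_args is a
hypothetical helper name of mine, not part of Open MPI, and it assumes the
same slot count is wanted for --host and -np.

```shell
# Hypothetical helper (not part of Open MPI): print the explicit host/slot
# arguments used in the workaround, assuming -np equals the slot count.
# Usage: mpirun_host_args HOST SLOTS
mpirun_host_args() {
    host=$1
    slots=$2
    printf '%s\n' "--host ${host}:${slots} -np ${slots}"
}

# Example invocation, left as a comment so nothing runs on sourcing:
#   mpirun $(mpirun_host_args "$HOSTNAME" 4) ls
```

The command substitution is left unquoted on purpose so that the shell splits
the result into separate arguments for mpirun.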

Of course, ls is not an MPI program. However, this elementary Fortran MPI
code,

program testrun2
  use mpi
  implicit none
  integer :: ierr

  call mpi_init(ierr)
  call mpi_finalize(ierr)

end program testrun2

fails with runtime errors for any number of processes.

The compilation line was:
mpif90 test2.f90 -o testrun2

The mpirun command:
mpirun --host $HOSTNAME:4 -np 4 ./testrun2

To reiterate: in the normal user environment there is no need to declare the
host and its maximum number of slots, and the runtime errors do not occur
there either.

Could it be that, for the particular case of a single-node machine (a normal
PC), the openmpi package needs a few other basic dependencies that are not
present in its package declaration?

Also, I noticed that gfortran/mpif90 ignores the "CPATH" and "LIBRARY_PATH"
environment variables. I had to pass these paths explicitly to the compiler
via -I and -L flags.
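
Rather than typing the flags by hand, they can be derived from the two
variables. The following is a minimal sketch, not a fix for the underlying
behaviour: path_to_flags is a hypothetical helper name, and it assumes the
directory names contain no spaces or glob characters.

```shell
# Hypothetical helper: turn a colon-separated search path into repeated
# compiler flags, e.g. prefix -I and "/a:/b" give "-I/a -I/b".
# Usage: path_to_flags PREFIX PATH_LIST
path_to_flags() {
    prefix=$1
    path_list=$2
    flags=""
    old_ifs=$IFS
    IFS=:
    for dir in $path_list; do
        # Skip empty entries produced by leading or doubled colons.
        if [ -n "$dir" ]; then
            flags="${flags:+$flags }${prefix}${dir}"
        fi
    done
    IFS=$old_ifs
    printf '%s\n' "$flags"
}

# Example invocation, left as a comment so nothing runs on sourcing:
#   mpif90 $(path_to_flags -I "$CPATH") $(path_to_flags -L "$LIBRARY_PATH") \
#       test2.f90 -o testrun2
```

The command substitutions are unquoted so that each flag reaches mpif90 as a
separate argument.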

