mpi4py.run

Added in version 3.0.0.

At import time, mpi4py initializes the MPI execution environment calling MPI_Init_thread() and installs an exit hook to automatically call MPI_Finalize() just before the Python process terminates. Additionally, mpi4py overrides the default ERRORS_ARE_FATAL error handler in favor of ERRORS_RETURN, which allows translating MPI errors in Python exceptions. These departures from standard MPI behavior may be controversial, but are quite convenient within the highly dynamic Python programming environment. Third-party code using mpi4py can just from mpi4py import MPI and perform MPI calls without the tedious initialization/finalization handling. MPI errors, once translated automatically to Python exceptions, can be dealt with the common try…except…finally clauses; unhandled MPI exceptions will print a traceback which helps in locating problems in source code.

Unfortunately, the interplay of automatic MPI finalization and unhandled exceptions may lead to deadlocks. In unattended runs, these deadlocks will drain the battery of your laptop, or burn precious allocation hours in your supercomputing facility.

Exceptions and deadlocks

Consider the following snippet of Python code. Assume this code is stored in a standard Python script file and run with mpiexec in two or more processes.

deadlock.py

from mpi4py import MPI
assert MPI.COMM_WORLD.Get_size() > 1
rank = MPI.COMM_WORLD.Get_rank()
if rank == 0:
    1/0
    MPI.COMM_WORLD.send(None, dest=1, tag=42)
elif rank == 1:
    MPI.COMM_WORLD.recv(source=0, tag=42)

Process 0 raises ZeroDivisionError exception before performing a send call to process 1. As the exception is not handled, the Python interpreter running in process 0 will proceed to exit with non-zero status. However, as mpi4py installed a finalizer hook to call MPI_Finalize() before exit, process 0 will block waiting for other processes to also enter the MPI_Finalize() call. Meanwhile, process 1 will block waiting for a message to arrive from process 0, thus never reaching to MPI_Finalize(). The whole MPI execution environment is irremediably in a deadlock state.

To alleviate this issue, mpi4py offers a simple, alternative command line execution mechanism based on using the -m flag and implemented with the runpy module. To use this features, Python code should be run passing -m mpi4py in the command line invoking the Python interpreter. In case of unhandled exceptions, the finalizer hook will call MPI_Abort() on the MPI_COMM_WORLD communicator, thus effectively aborting the MPI execution environment.

Warning

When a process is forced to abort, resources (e.g. open files) are not cleaned-up and any registered finalizers (either with the atexit module, the Python C/API function Py_AtExit(), or even the C standard library function atexit()) will not be executed. Thus, aborting execution is an extremely impolite way of ensuring process termination. However, MPI provides no other mechanism to recover from a deadlock state.

Command line

The use of -m mpi4py to execute Python code on the command line resembles that of the Python interpreter.

mpiexec -n numprocs python -m mpi4py pyfile [arg] ...
mpiexec -n numprocs python -m mpi4py -m mod [arg] ...
mpiexec -n numprocs python -m mpi4py -c cmd [arg] ...
mpiexec -n numprocs python -m mpi4py - [arg] ...

<pyfile>: Execute the Python code contained in pyfile, which must be a filesystem path referring to either a Python file, a directory containing a __main__.py file, or a zipfile containing a __main__.py file.

-m <mod>: Search sys.path for the named module mod and execute its contents.

-c <cmd>: Execute the Python code in the cmd string command.

-: Read commands from standard input (sys.stdin).