# Compile and Install UCX 1.15.0 and OpenMPI 5.0.0: A Comprehensive Guide
# 1. Environment Preparation
First, ensure that your system meets the following basic requirements:
- Operating system: Linux (such as Ubuntu 20.04 LTS) or another Unix-like operating system.
- Development toolkit: the necessary build tools and libraries, such as `build-essential`, `libnuma-dev`, and `pkg-config`.
- Kernel version: for optimal performance, a recent stable kernel is recommended.
- A hardware or virtual environment with RDMA support, if you plan to use the RDMA transports.

On Debian/Ubuntu systems, the build dependencies can be installed with:

```shell
sudo apt-get update
sudo apt-get install -y build-essential libnuma-dev pkg-config
```
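If you intend to use the RDMA transports, it is worth checking that a device is visible before building. A quick sketch (note: `ibv_devices` and `ibv_devinfo` come from the `ibverbs-utils`/`rdma-core` packages, which may need to be installed separately):

```shell
ibv_devices     # lists RDMA devices, e.g. mlx5_0
ibv_devinfo     # prints port state and capabilities for each device
```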
# 2. Compile and Install UCX 1.15.0
- Download and extract the UCX source package:

```shell
wget https://github.com/openucx/ucx/releases/download/v1.15.0/ucx-1.15.0.tar.gz
tar -xzvf ucx-1.15.0.tar.gz
cd ucx-1.15.0
```
- Configure UCX compile options:

```shell
mkdir build && cd build
../configure --prefix=/root/software/ucx/1.15.0
```

You can add more configuration options according to actual needs, such as targeting a specific network device type or enabling specific features.
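For example, a more fully featured configuration might look like the following. This is an illustrative sketch only; the choice of options is an assumption, so check `../configure --help` for what is available in your build:

```shell
# Illustrative example -- adjust the prefix and options to your system.
# --enable-optimizations: build with release optimizations
# --enable-mt:            enable thread-safety support
# --with-verbs:           build the InfiniBand Verbs transport
../configure --prefix=/root/software/ucx/1.15.0 \
    --enable-optimizations \
    --enable-mt \
    --with-verbs
```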
- Compile and install:

```shell
make -j 8
make install
```
- UCX Architecture Description
- The main components of UCX 1.15.0 are summarized in the table below:

| Component | Role | Description |
|---|---|---|
| UCP | Protocol | Implements high-level abstractions such as tag matching, streams, connection negotiation and establishment, multi-rail, and handling of different memory types. |
| UCT | Transport | Implements low-level communication primitives such as active messages, remote memory access, and atomic operations. |
| UCS | Services | A collection of general-purpose data structures, algorithms, and system utilities. |
| UCM | Memory | Intercepts memory allocation and release events, used by the memory registration cache. |
# 3. Compile and Install OpenMPI 5.0.0
- Download and extract the OpenMPI source package:

```shell
wget https://download.open-mpi.org/release/open-mpi/v5.0/openmpi-5.0.0.tar.gz
tar -xzvf openmpi-5.0.0.tar.gz
cd openmpi-5.0.0
```
- Configure OpenMPI compile options, specifying UCX as the transport layer:

```shell
mkdir build && cd build
../configure --without-hcoll \
    --enable-python-bindings \
    --enable-mpirun-prefix-by-default \
    --prefix=/root/software/openmpi/5.0.0-ucx-1.15.0 \
    --with-ucx=/root/software/ucx/1.15.0 \
    --enable-mca-no-build=btl-uct
```
Note
- For OpenMPI 4.0 and later, the `btl_uct` component may fail to compile. This component is not important for using UCX, so it can be disabled with `--enable-mca-no-build=btl-uct`.
- The `--enable-python-bindings` option enables the Python bindings.
- The `--enable-mpirun-prefix-by-default` option automatically adds the `--prefix` option when starting an MPI program with `mpirun`.
- The `--without-hcoll` option disables the HCOLL component. If it is not set during compilation, the build fails with errors such as `cannot find -lnuma` and `cannot find -ludev`.
- Compile and install:

```shell
make -j 8
make install
```
# 4. Verify Installation and Set Environment Variables
After installation, you can verify that UCX and OpenMPI have been successfully integrated by running a simple MPI program:

```shell
mpirun -np 2 --mca pml ucx --mca btl ^vader,tcp,openib,uct -x UCX_NET_DEVICES=mlx5_0:1 hostname
```

(If running as root, add the `--allow-run-as-root` option. If an RDMA device is present, you can select it with `-x UCX_NET_DEVICES`.)
Note
If you need to use it with Slurm, you can refer to "Launching with Slurm" in the OpenMPI documentation. One approach is to first allocate resources through `salloc`, and then run the `mpirun` command on the allocated resources. In that case, options such as `--hostfile`, `--host`, and `-n` do not need to be set, for example:

```shell
salloc -n 2
mpirun --mca pml ucx --mca btl ^vader,tcp,openib,uct -x UCX_NET_DEVICES=mlx5_0:1 hostname
```
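The same run can also be wrapped in a batch script. A minimal sketch, in which the job name and paths are illustrative assumptions to adapt to your cluster:

```shell
#!/bin/bash
#SBATCH -n 2              # number of MPI tasks
#SBATCH -J ucx-hostname   # job name (illustrative)
# Make the OpenMPI build visible to the job, then launch over UCX.
export PATH=/root/software/openmpi/5.0.0-ucx-1.15.0/bin:$PATH
mpirun --mca pml ucx --mca btl ^vader,tcp,openib,uct hostname
```

Submit it with `sbatch job.sh`.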
If everything is normal, you will see the two hostnames in the output. For convenience, you can add the OpenMPI `bin` directory and related paths to your environment by appending the following to `~/.bashrc` (e.g. with `vim ~/.bashrc`):

```shell
export PATH=/root/software/openmpi/5.0.0-ucx-1.15.0/bin:$PATH
export LD_LIBRARY_PATH=/root/software/openmpi/5.0.0-ucx-1.15.0/lib:$LD_LIBRARY_PATH
export CPATH=/root/software/openmpi/5.0.0-ucx-1.15.0/include:$CPATH
export MANPATH=/root/software/openmpi/5.0.0-ucx-1.15.0/share/man:$MANPATH
```

Then reload the shell configuration:

```shell
source ~/.bashrc
```
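Once the paths are set, a quick sanity check is to confirm that the tools resolve to the new prefixes and that OpenMPI was built against UCX. A sketch, assuming both installs are on `PATH`:

```shell
which mpirun            # should point into /root/software/openmpi/5.0.0-ucx-1.15.0/bin
mpirun --version        # should report Open MPI 5.0.0
ucx_info -v             # prints the UCX version and build configuration
ompi_info | grep -i ucx # confirms the UCX components were built in
```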
# 5. UCX Performance Testing
UCX ships with the `ucx_perftest` benchmark. Start the server side first, then connect from the client.

Server side:

```shell
ucx_perftest -c 0 -d mlx5_0:1
```

Client side (runs a tag-matching latency test against the server):

```shell
ucx_perftest -c 1 -d mlx5_0:1 <server_hostname> -t tag_lat
```
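The same tool supports other test types; for example, replacing `-t tag_lat` with `-t tag_bw` on the client side measures tag-matching bandwidth instead of latency (see `ucx_perftest -h` for the full list):

```shell
ucx_perftest -c 1 -d mlx5_0:1 <server_hostname> -t tag_bw
```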
In summary, through the above steps we have compiled and installed UCX 1.15.0 and OpenMPI 5.0.0 from source and integrated them into an efficient, stable high-performance computing environment. In practice, you can tune the configuration further to match your specific workload and hardware.