
# Compile and Install UCX 1.15.0 and OpenMPI 5.0.0: A Comprehensive Guide

Unified Communication X (UCX) and OpenMPI are two indispensable components in the field of high-performance computing. UCX provides an efficient set of low-level communication libraries that make optimal use of network and hardware resources, while OpenMPI is a widely used implementation of the Message Passing Interface (MPI) for parallel computing tasks. This article provides detailed guidance on compiling and installing UCX 1.15.0 from source, together with the compatible OpenMPI 5.0.0.


# 1. Environment Preparation

First, please ensure that your system meets the following basic requirements:

  1. Operating system: Linux (such as Ubuntu 20.04 LTS) or another Unix-like operating system.
  2. Development toolkit: the necessary build tools and libraries, such as build-essential, libnuma-dev, and pkg-config.
  3. Kernel version: for optimal performance, a recent stable kernel is recommended.
  4. An RDMA-capable hardware or virtual environment, if you intend to use the RDMA transports.

On Debian/Ubuntu, the prerequisites can be installed with:

sudo apt-get update
sudo apt-get install -y build-essential libnuma-dev pkg-config
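
If you plan to use the RDMA transports, you can verify that the devices are visible before building. This optional check assumes the rdma-core and ibverbs-utils packages, which provide the ibv_devinfo tool:

sudo apt-get install -y rdma-core ibverbs-utils
ibv_devinfo    # should list RDMA devices such as mlx5_0 and their port states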

# 2. Compile and Install UCX 1.15.0

  1. Download the UCX source package:
wget https://github.com/openucx/ucx/releases/download/v1.15.0/ucx-1.15.0.tar.gz
tar -xzvf ucx-1.15.0.tar.gz
cd ucx-1.15.0
  2. Configure UCX compile options:
mkdir build && cd build
../configure --prefix=/root/software/ucx/1.15.0

You can add further configuration options according to your actual needs, such as restricting the build to particular network devices or enabling optional features, as sketched below.
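
As an illustrative sketch, a fuller invocation might look like the following; the CUDA path is an assumption for systems with GPUs, and ../configure --help lists the options actually available in your build:

# build with compiler optimizations, the verbs transport, and optional CUDA support
../configure --prefix=/root/software/ucx/1.15.0 \
    --enable-optimizations \
    --with-verbs \
    --with-cuda=/usr/local/cuda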

  3. Compile and install:
make -j 8
make install
  4. UCX Architecture Description

The architecture of UCX 1.15.0 is summarized below (the original post includes an architecture diagram, Architecture-2024-02-03):

| Component | Role | Description |
| --- | --- | --- |
| UCP | Protocol | Implements high-level abstractions such as tag matching, streams, connection negotiation and establishment, multi-rail, and handling of different memory types. |
| UCT | Transport | Implements low-level communication primitives such as active messages, remote memory access, and atomic operations. |
| UCS | Services | A collection of common data structures, algorithms, and system utilities. |
| UCM | Memory | Intercepts memory allocation and release events, used by the memory registration cache. |
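
After installation (step 3 above), you can sanity-check the build with the ucx_info utility that ships with UCX; the paths below assume the install prefix used earlier:

/root/software/ucx/1.15.0/bin/ucx_info -v    # print the UCX version and build configuration
/root/software/ucx/1.15.0/bin/ucx_info -d    # list the transports and devices UCX detected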

# 3. Compile and Install OpenMPI 5.0.0

  1. Download the OpenMPI source package:
wget https://download.open-mpi.org/release/open-mpi/v5.0/openmpi-5.0.0.tar.gz
tar -xzvf openmpi-5.0.0.tar.gz
cd openmpi-5.0.0
  2. Configure OpenMPI compile options, specifying the use of UCX as the transport layer:
mkdir build && cd build
../configure --without-hcoll \
    --enable-python-bindings \
    --enable-mpirun-prefix-by-default \
    --prefix=/root/software/openmpi/5.0.0-ucx-1.15.0 \
    --with-ucx=/root/software/ucx/1.15.0 \
    --enable-mca-no-build=btl-uct

Note

  • For OpenMPI 4.0 and later versions, the btl_uct component may fail to compile. This component is not important for using UCX, so it can be disabled with --enable-mca-no-build=btl-uct.
  • The --enable-python-bindings option enables the Python bindings.
  • The --enable-mpirun-prefix-by-default option makes mpirun automatically add the --prefix option when launching an MPI program.
  • The --without-hcoll option disables the HCOLL component. If it is not set, the build may fail with linker errors such as "cannot find -lnuma" and "cannot find -ludev".

The final configuration options are as follows:

[Screenshot: final Open MPI configure summary (ompi-config-2024-02-03)]
  3. Compile and install:
make -j 8
make install
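
You can then confirm that the new build was linked against UCX by querying ompi_info; the path assumes the prefix used above, and the output should include a "pml: ucx" component line:

/root/software/openmpi/5.0.0-ucx-1.15.0/bin/ompi_info | grep -i ucx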

# 4. Verify Installation and Set Environment Variables

After installation, you can verify that UCX and OpenMPI have been successfully integrated by launching a simple command through mpirun:

mpirun -np 2 --mca pml ucx --mca btl ^vader,tcp,openib,uct -x UCX_NET_DEVICES=mlx5_0:1 hostname

(If running as root, you need to add the --allow-run-as-root option. If an RDMA device is present, you can select it with -x UCX_NET_DEVICES.)

Note

If you need to use it with Slurm, refer to the Open MPI documentation section Launching with Slurm.

One approach is to first allocate resources with salloc and then run the mpirun command on the allocated resources; in that case, options such as --hostfile, --host, and -n do not need to be set. For example:

salloc -n 2
mpirun --mca pml ucx --mca btl ^vader,tcp,openib,uct -x UCX_NET_DEVICES=mlx5_0:1 hostname

If everything is working, you will see two hostnames printed. For convenience, you can add the OpenMPI bin directory and related paths to your environment variables:

vim ~/.bashrc
export PATH=/root/software/openmpi/5.0.0-ucx-1.15.0/bin:$PATH
export LD_LIBRARY_PATH=/root/software/openmpi/5.0.0-ucx-1.15.0/lib:$LD_LIBRARY_PATH
export CPATH=/root/software/openmpi/5.0.0-ucx-1.15.0/include:$CPATH
export MANPATH=/root/software/openmpi/5.0.0-ucx-1.15.0/share/man:$MANPATH
source ~/.bashrc
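
With the environment variables in place, a small MPI program gives an end-to-end check of the toolchain. The following is a minimal sketch; the file name hello_mpi.c is arbitrary, and mpicc and mpirun are assumed to resolve to the new installation:

cat > hello_mpi.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size, len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);               /* initialize the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* this process's rank */
    MPI_Comm_size(MPI_COMM_WORLD, &size); /* total number of ranks */
    MPI_Get_processor_name(name, &len);   /* host this rank runs on */
    printf("Hello from rank %d of %d on %s\n", rank, size, name);
    MPI_Finalize();
    return 0;
}
EOF
mpicc hello_mpi.c -o hello_mpi
mpirun -np 2 --mca pml ucx ./hello_mpi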

# 5. UCX Performance Testing

Server (start first; it waits for the client to connect):

ucx_perftest -c 0 -d mlx5_0:1

Client (connects to the server and runs a tag-matching latency test; replace <server_hostname> with the server's hostname):

ucx_perftest -c 1 -d mlx5_0:1 <server_hostname> -t tag_lat
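
Other test types can be selected with -t. As a sketch, a tag-matching bandwidth test with 1 MB messages could look like the following; run ucx_perftest -h for the full list of tests and options:

ucx_perftest -c 1 -d mlx5_0:1 <server_hostname> -t tag_bw -s 1048576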

In summary, through the steps above we have compiled and installed UCX 1.15.0 and OpenMPI 5.0.0 from source and integrated them into an efficient, stable high-performance computing environment. In practical applications, you can further tune the configuration to your specific needs to achieve better performance.
