Python Example

nixl_doca_dma_proxy_example.py demonstrates Host GPU ↔ DPU file transfer via the NIXL Python API using the DOCA_DMA_PROXY backend.

Once both the DPU Agent and the NIXL plugin are built, running this script is the quickest way to verify that the whole pipeline works end to end.

Requirements

  • NIXL Python package installed (pip install nixl[cu12] or nixl[cu13])
  • PyTorch with CUDA support
  • DOCA runtime on the host
  • DPU agent (dpu_dma_copy) running on the BlueField DPU

Environment Setup

The NIXL Python bindings need to locate the DOCA_DMA_PROXY plugin shared library. Set the plugin directory before running the script:

export NIXL_PLUGIN_DIR=/opt/nvidia/nvda_nixl/lib/plugins

If the plugin is installed elsewhere, point NIXL_PLUGIN_DIR to the directory containing libplugin_DOCA_DMA_PROXY.so.
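Before running the example, it can help to confirm that the directory really contains the plugin library. The following is a minimal sketch using only the Python standard library; the helper name `find_plugin` is hypothetical and not part of NIXL, and it only checks that the file exists, not that NIXL can load it.

```python
import os
from pathlib import Path

# File name taken from the text above.
PLUGIN_FILE = "libplugin_DOCA_DMA_PROXY.so"


def find_plugin(plugin_dir=None):
    """Return the path to the plugin library, or None if it is not found.

    Falls back to the NIXL_PLUGIN_DIR environment variable when no
    directory is passed explicitly.
    """
    plugin_dir = plugin_dir or os.environ.get("NIXL_PLUGIN_DIR")
    if not plugin_dir:
        return None
    candidate = Path(plugin_dir) / PLUGIN_FILE
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    path = find_plugin()
    if path is None:
        print(f"{PLUGIN_FILE} not found; check NIXL_PLUGIN_DIR")
    else:
        print(f"found {path}")
```

Running this in the same shell as the example catches a missing or misspelled NIXL_PLUGIN_DIR before any DOCA or GPU code is involved.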

1. Start the DPU Agent

On the BlueField DPU, start dpu_dma_copy. The example below uses TCP control mode (-T) and the POSIX storage backend (-b posix):

./dpu-agent/dpu_dma_copy \
    -p 0000:03:00.0 \
    -m 2048 \
    -q 64 \
    -b posix \
    -w 1 \
    -s 64 \
    -T

Parameters:

  • -p: DPU-side DMA PCI address
  • -m: Staging buffer size in MiB
  • -q: Queue depth
  • -b: Storage backend (posix by default)
  • -w: Number of storage workers
  • -s: Storage chunk size in MiB
  • -T: Use TCP control channel instead of DOCA Comch
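When the agent is started with -T, a quick reachability check from the host can separate network problems from NIXL or DOCA problems. The sketch below assumes the agent accepts TCP connections on port 18517, which is the default of the Python example's -P option; whether the agent listens on that same port is an assumption, and `agent_reachable` is a hypothetical helper name. It only tests that the port accepts a connection and closes it immediately, so the agent may log a short-lived client.

```python
import socket


def agent_reachable(host, port=18517, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    port=18517 is assumed from the example's -P default; adjust it if the
    agent is configured differently.
    """
    try:
        # create_connection resolves the host and connects with a timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # IP address taken from the example commands in this README.
    print(agent_reachable("10.75.70.125"))
```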

2. Run the Python Example

PUSH (Host GPU → DPU)

export NIXL_PLUGIN_DIR=/opt/nvidia/nvda_nixl/lib/plugins

python3 nixl_doca_dma_proxy_example.py \
    -o push \
    -p 0000:ba:00.0 \
    -f /data/test_obj \
    -s 64 \
    -g 0 \
    -d 10.75.70.125 \
    -m tcp

PULL (DPU → Host GPU)

export NIXL_PLUGIN_DIR=/opt/nvidia/nvda_nixl/lib/plugins

python3 nixl_doca_dma_proxy_example.py \
    -o pull \
    -p 0000:ba:00.0 \
    -f /data/test_obj \
    -s 64 \
    -g 0 \
    -d 10.75.70.125 \
    -m tcp \
    -O /tmp/pulled.bin
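After a pull, one simple sanity check is to look at the size and checksum of the output file. The sketch below is a generic standard-library helper (`describe_file` is a hypothetical name); this README does not state what data the example pushes, so comparing the digest against a known source file is an assumption about your own workflow.

```python
import hashlib
from pathlib import Path


def describe_file(path, chunk_size=1 << 20):
    """Return (size_in_bytes, sha256_hex) for a file, reading it in chunks."""
    digest = hashlib.sha256()
    size = 0
    with Path(path).open("rb") as fh:
        # Read 1 MiB at a time so large pulled files do not fill host memory.
        while chunk := fh.read(chunk_size):
            digest.update(chunk)
            size += len(chunk)
    return size, digest.hexdigest()


if __name__ == "__main__":
    # Output path taken from the PULL command above.
    target = Path("/tmp/pulled.bin")
    if target.is_file():
        size, sha = describe_file(target)
        print(f"{target}: {size} bytes, sha256={sha}")
    else:
        print(f"{target} does not exist")
```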

Common Options

Option        Description
-o            Operation: push or pull
-p            Host-side BlueField PCI address
-f            DPU object/file path prefix
-s            Segment size in MiB
-g            GPU ID
-d            DPU IP (TCP mode)
-m            Control mode: comch, tcp, or auto
-P            TCP port (default: 18517)
-O            Output file for pull mode
--batch-size  Number of segments per transfer
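For readers wrapping the example in their own tooling, the options above could be mirrored with `argparse` roughly as follows. This is a sketch, not the script's actual parser: the `dest` names are hypothetical, and the only default taken from this README is the TCP port. All other defaults are left unset because the README does not state them.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Sketch of the DOCA_DMA_PROXY example options"
    )
    parser.add_argument("-o", dest="operation", choices=["push", "pull"],
                        required=True, help="Operation: push or pull")
    parser.add_argument("-p", dest="pci_addr",
                        help="Host-side BlueField PCI address")
    parser.add_argument("-f", dest="path_prefix",
                        help="DPU object/file path prefix")
    parser.add_argument("-s", dest="segment_mib", type=int,
                        help="Segment size in MiB")
    parser.add_argument("-g", dest="gpu_id", type=int, help="GPU ID")
    parser.add_argument("-d", dest="dpu_ip", help="DPU IP (TCP mode)")
    parser.add_argument("-m", dest="mode", choices=["comch", "tcp", "auto"],
                        help="Control mode")
    parser.add_argument("-P", dest="port", type=int, default=18517,
                        help="TCP port")
    parser.add_argument("-O", dest="output", help="Output file for pull mode")
    parser.add_argument("--batch-size", dest="batch_size", type=int,
                        help="Number of segments per transfer")
    return parser


if __name__ == "__main__":
    # Same arguments as the PUSH command in this README.
    args = build_parser().parse_args(
        ["-o", "push", "-p", "0000:ba:00.0", "-f", "/data/test_obj",
         "-s", "64", "-g", "0", "-d", "10.75.70.125", "-m", "tcp"]
    )
    print(args)
```

Note that -o/-O and -p/-P differ only by case, so they are easy to confuse on the command line.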

Notes

  • The example registers GPU memory as VRAM and DPU-resident objects as OBJ.
  • Make sure the memory types match what the DOCA_DMA_PROXY plugin supports.
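  • To use Comch mode, omit -d and -m tcp from the Python command, and start the DPU agent without -T (the flag that selects the TCP control channel instead of DOCA Comch).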