LLTFI (Low-Level Tensor Fault Injector) is a unified software-implemented fault injection (SWiFI) tool that supports fault injection into both C/C++ programs and ML applications written with high-level frameworks such as TensorFlow and PyTorch. Faults are injected at the LLVM IR level, which gives precise control over which instructions and registers are targeted.
LLTFI is built on top of LLFI and is fully backward compatible with it.
For a detailed description of the internal architecture — pass pipeline, selector class hierarchy, hardware/software/ML fault modes, runtime library, and the interface between the compile-time and runtime layers — see architecture.md.
Please refer to the following paper for background on LLTFI.

- `llvm_passes/`: LLVM pass plugin (`llfi-passes.so`), compile-time only
  - `core/`: Pass infrastructure and selector framework
  - `hardware_failures/`: Built-in hardware fault instruction selectors
  - `instruction_duplication/`: SID pass for ML soft-error detection (`SEDPasses.so`)
- `runtime_lib/`: C/C++ runtime library linked into instrumented binaries
- `bin/`: Python driver scripts: `instrument.py`, `profile.py`, `injectfault.py`
- `tools/`: Trace analysis, ML utilities
  - `GenerateMakefile/`: Test harness Makefile generator
- `docs/`
  - `tutorial_first_experiment.md`: end-to-end C/C++ walkthrough and output guide
  - `tutorial_ml_experiment.md`: end-to-end ML/ONNX walkthrough (layer targeting, multi-fault)
  - `adding_a_test.md`: how to add a regression test case
  - `input_yaml_guide.md`: user guide for writing `input.yaml`
  - `input_masterlist.yaml`: full reference schema for `input.yaml`
- `test_suite/`: Regression tests
- `sample_programs/`: Example C/C++ and ML programs with `input.yaml` files
- `architecture.md`: Internal architecture reference for developers
- `CODING_GUIDELINES.md`: C++ and Python style rules
- `CONTRIBUTING.md`: How to set up a dev environment and submit changes
- `migration.md`: LLVM 15 → 19.x upgrade log

- 64-bit Linux (Ubuntu 20.04 or later) or macOS
- CMake ≥ 3.15
- Python 3
- Python YAML library (PyYAML ≥ 5.4.1)
- Ninja ≥ 1.10.2
- Clang and LLVM at commit b270525f730b

To build LLVM from source (required if you also need MLIR for onnx-mlir):

    git clone https://github.com/llvm/llvm-project.git
    cd llvm-project && git checkout b270525f730b && cd ..
    mkdir llvm-project/build && cd llvm-project/build
    cmake -G Ninja ../llvm \
      -DLLVM_ENABLE_PROJECTS="clang;mlir" \
      -DLLVM_BUILD_TESTS=ON \
      -DLLVM_TARGETS_TO_BUILD="host" \
      -DLLVM_ENABLE_ASSERTIONS=ON \
      -DLLVM_ENABLE_RTTI=ON
    cmake --build . --target clang check-mlir mlir-translate opt llc lli \
      llvm-dis llvm-link -j$(nproc)

- For ML programs (all optional; tests skip gracefully when absent):

| Dependency | Install |
|---|---|
| TensorFlow ≥ 2.0 | `pip install tensorflow` |
| tensorflow-onnx | `pip install tf2onnx` |
| PyTorch | `pip install torch` |
| ONNX | `pip install onnx` |
| pygraphviz, pydot | `pip install pygraphviz pydot` |
| libprotoc ≥ 3.11 | build from source (see below) |
| ONNX-MLIR (LLTFI branch) | see below |

libprotoc:

    curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v3.17.2/protobuf-all-3.17.2.zip
    unzip protobuf-all-3.17.2.zip && cd protobuf-3.17.2
    ./configure && make -j$(nproc) && sudo make install && sudo ldconfig

ONNX-MLIR (LLTFI branch, requires an MLIR-enabled LLVM build):

    git clone --recursive https://github.com/DependableSystemsLab/onnx-mlir-lltfi.git
    mv onnx-mlir-lltfi onnx-mlir && cd onnx-mlir && git checkout LLTFI && cd ..
    MLIR_DIR=$(pwd)/llvm-project/build/lib/cmake/mlir
    mkdir onnx-mlir/build && cd onnx-mlir/build
    cmake -G Ninja -DCMAKE_CXX_COMPILER=/usr/bin/c++ -DMLIR_DIR=${MLIR_DIR} ..
    cmake --build . && ninja install

- GraphViz (for dependency graph visualisation)

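Before building, it can help to confirm that the listed prerequisites are visible from the current shell. The following is a minimal sketch, not part of LLTFI: the executable names in `REQUIRED_TOOLS` and the helper names are assumptions, and it reports the installed PyYAML version without comparing it against the minimum.

```python
import shutil
from importlib import metadata

# Executable names are assumptions based on the dependency list above;
# a distribution may install them under different names.
REQUIRED_TOOLS = ["cmake", "ninja", "python3", "clang", "opt"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def pyyaml_version():
    """Return the installed PyYAML version string, or None if it is absent."""
    try:
        return metadata.version("PyYAML")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("missing tools:", missing_tools() or "none")
    print("PyYAML version:", pyyaml_version() or "not installed")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the defaults are enough.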
Run ./setup --help for a full option list.

    ./setup -LLFI_BUILD_ROOT <build-dir> \
      -LLVM_SRC_ROOT <llvm-project-dir> \
      -LLVM_DST_ROOT <llvm-install-or-build-dir>

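If the setup step is driven from a script, the invocation above can be assembled programmatically. This is a hedged sketch: `setup_command` is a hypothetical helper, and the only behaviour it encodes is the three options shown above plus the requirement that the build root must not already exist.

```python
import os


def setup_command(build_root, llvm_src_root, llvm_dst_root):
    """Build the argument list for ./setup with the three options shown above."""
    if os.path.exists(build_root):
        # The build root must not already exist, so refuse early.
        raise FileExistsError(f"build root already exists: {build_root}")
    return [
        "./setup",
        "-LLFI_BUILD_ROOT", build_root,
        "-LLVM_SRC_ROOT", llvm_src_root,
        "-LLVM_DST_ROOT", llvm_dst_root,
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)` from the repository root.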
The build root must not already exist. Delete it first when rebuilding from scratch. To rebuild after source changes without re-running setup:

    cd /path/to/LLTFI-build && make

docker/Dockerfile builds and runs LLTFI in a container. Copy the Dockerfile outside the repository, then:

    docker build --tag lltfi .
    docker run -it lltfi

See docker/README.md for details.
Tests must be run from the build directory. Running all regression tests after installation is strongly recommended. Individual test categories can be run separately:

    cd <LLFI_BUILD_ROOT>/test_suite
    python3 SCRIPTS/llfi_test --all_cpp                  # 21 core tests (expected: 21/21 PASS)
    python3 SCRIPTS/llfi_test --all_hardware_faults      # hardware fault injection (8 tests)
    python3 SCRIPTS/llfi_test --all_trace_tools_tests    # trace analysis tools (3 tests)
    python3 SCRIPTS/llfi_test --all_makefile_generation  # Makefile generation (2 tests)

Error messages during fault injection runs are normal and expected.

To run the ML tests:

    python3 SCRIPTS/llfi_test --all_ml

Tests that require missing dependencies are reported as SKIP (not FAIL) and excluded from the pass/fail count.

| Group | Requirements |
|---|---|
| ML tool unit tests | pip install onnx pygraphviz pydot |
| Instruction duplication (synthetic IR) | LLTFI build only |
| Instruction duplication (real model IR) | model.ll from sample_programs/.../mnist/compile.sh |
| TensorFlow → ONNX | pip install tensorflow tf2onnx onnx |
| PyTorch → ONNX | pip install torch onnx |
| ONNX → LLVM IR | onnx-mlir binary (set ONNX_MLIR_BUILD) |
| Fault injection (ML) | LLTFI build + model.ll |
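
To predict which ML test groups are likely to be skipped, the requirements in the table can be checked up front. The sketch below is an illustration, not LLTFI's own skip logic: the group labels and helper names are hypothetical, import names are assumed to match the pip package names, and groups that depend on a pre-built `model.ll` are left out.

```python
import importlib.util
import os

# Python modules per test group, taken from the pip commands in the table.
# Import names are assumed to match the pip package names.
GROUP_MODULES = {
    "ML tool unit tests": ["onnx", "pygraphviz", "pydot"],
    "TensorFlow -> ONNX": ["tensorflow", "tf2onnx", "onnx"],
    "PyTorch -> ONNX": ["torch", "onnx"],
}


def has_module(name):
    """Return True if the module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


def runnable_groups(has=has_module, env=os.environ):
    """Map each group to True if its listed requirements appear to be met."""
    status = {
        group: all(has(mod) for mod in mods)
        for group, mods in GROUP_MODULES.items()
    }
    # The table says the onnx-mlir binary is located via ONNX_MLIR_BUILD.
    status["ONNX -> LLVM IR"] = bool(env.get("ONNX_MLIR_BUILD"))
    return status


if __name__ == "__main__":
    for group, ok in runnable_groups().items():
        print(f"{group}: {'available' if ok else 'likely SKIP'}")
```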
You can use test programs in the directory sample_programs/ or test_suite/PROGRAMS/ to test LLTFI. Programs in the sample_programs directory already contain a valid input.yaml file.
Example program: factorial:
- Copy the `sample_programs/cpp_sample_programs/factorial/` directory to your project directory.
- Set the `LLFI_BUILD_ROOT` environment variable: `export LLFI_BUILD_ROOT=/path/to/LLFI-build`
- Add the LLVM bin directory to your `PATH`: `export PATH=/path/to/llvm/bin:$PATH`
- Run: `bash compileAndRun.sh factorial 6`

For ML fault injection tests, model.ll must be pre-built by running
compile.sh in sample_programs/ml_sample_programs/vision_models/mnist/
(requires onnx-mlir).
Programs in sample_programs/ already contain a valid input.yaml.
Example — factorial:
- Copy the directory to your working location:

      cp -r sample_programs/cpp_sample_programs/factorial/ /tmp/factorial
      cd /tmp/factorial

- Set environment variables:

      export LLFI_BUILD_ROOT=/path/to/LLTFI-build
      export PATH=/path/to/llvm/bin:$PATH

- Compile and run:

      bash compileAndRun.sh 6

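
When the steps above are run from a Python script rather than an interactive shell, the two environment variables can be prepared without modifying the parent process. This is a minimal sketch; `lltfi_env` is a hypothetical helper and the paths passed to it are placeholders.

```python
import os


def lltfi_env(build_root, llvm_bin, base_env=None):
    """Return a copy of the environment with LLFI_BUILD_ROOT set
    and llvm_bin placed first on PATH."""
    env = dict(os.environ if base_env is None else base_env)
    env["LLFI_BUILD_ROOT"] = build_root
    old_path = env.get("PATH", "")
    # Prepend so the intended LLVM tools win over any system-wide install.
    env["PATH"] = llvm_bin + (os.pathsep + old_path if old_path else "")
    return env
```

The result can be passed as `env=` to `subprocess.run(["bash", "compileAndRun.sh", "6"], cwd="/tmp/factorial", check=True)`.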
Output from LLFI is written to the llfi/ directory. See
architecture.md §5.3 for a description of the output files.
After fault injection, output is in the llfi/ directory inside your program
folder. For a full description of each file see
architecture.md — Interface Between the Two Layers.

| Directory | Contents |
|---|---|
| `std_output/` | Piped stdout from each run |
| `llfi_stat_output/` | Fault injection statistics, profiling data, trace files |
| `error_output/` | Failure reports (crashes, hangs, SDCs) |
| `baseline/` | Golden output and profiling trace |
| `prog_output/` | Disk output from faulty runs |

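
For a quick sanity check after a campaign, the directories in the table can be inspected with a few lines of Python. The sketch below assumes only the directory names listed above; it counts files and does not interpret their contents, whose formats are described in architecture.md. `summarize_llfi_output` is a hypothetical helper name.

```python
from pathlib import Path

# Subdirectory names as listed in the table above.
OUTPUT_DIRS = [
    "std_output",
    "llfi_stat_output",
    "error_output",
    "baseline",
    "prog_output",
]


def summarize_llfi_output(llfi_dir):
    """Count regular files directly inside each known subdirectory.

    Missing subdirectories are reported as None rather than 0, so an
    absent directory is distinguishable from an empty one.
    """
    root = Path(llfi_dir)
    summary = {}
    for name in OUTPUT_DIRS:
        sub = root / name
        if sub.is_dir():
            summary[name] = sum(1 for p in sub.iterdir() if p.is_file())
        else:
            summary[name] = None
    return summary
```

Calling `summarize_llfi_output("llfi")` from the program folder returns a dictionary keyed by directory name.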
See the ISSRE'23 AE branch README.
- LLFI Paper
- LLFI Wiki
- LLTFI Wiki
- Udit Kumar Agarwal, Abraham Chan, Karthik Pattabiraman. LLTFI: Framework agnostic fault injection for machine learning applications. ISSRE 2022. PDF
- Udit Kumar Agarwal, Abraham Chan, Karthik Pattabiraman. Resilience Assessment of Large Language Models under Transient Hardware Faults. ISSRE 2023. PDF

    @article{Agarwal22LLTFI,
      title     = {LLTFI: Framework agnostic fault injection for machine learning
                   applications (Tools and Artifacts Track)},
      author    = {Agarwal, Udit and Chan, Abraham and Pattabiraman, Karthik},
      journal   = {International Symposium on Software Reliability Engineering (ISSRE)},
      year      = {2022},
      publisher = {IEEE}
    }

Read caveats.txt for known limitations and gotchas.
Read CODING_GUIDELINES.md for C++, C, and Python coding conventions.
Read architecture.md for a detailed description of the internal architecture.