gerwang/DiffReg-PBIR

Diffusion-Based Material Regularization for Physics-Based Inverse Rendering

Official code release for the ECCV 2026 paper "Diffusion-Based Material Regularization for Physics-Based Inverse Rendering."

Jingwang Ling¹, Lifan Wu², Feng Xu³, Shuang Zhao¹

¹ University of Illinois Urbana-Champaign    ² NVIDIA    ³ BNRist and School of Software, Tsinghua University

🌐 Project page

Given multi-view images of a static object under unknown illumination, our method reconstructs a renderer-ready PBR asset — shape, spatially varying material, and an environment map — that relights faithfully under novel illumination.

Physics-based inverse rendering gives an accurate image-formation model but is badly underconstrained: without strong priors, lighting bakes into materials and reconstructions generalize poorly to novel views and lighting. Diffusion models predict plausible materials, but their outputs rarely satisfy the rendering equation and aren't directly usable for physics-based rendering. We bridge the two rather than replacing either: the key idea is to treat a diffusion model's intrinsic G-buffer predictions not as target material values but as a similarity kernel — a regularizer that penalizes material deviations across surface regions where the predictions are near-constant, while leaving the optimization free to match the input images.
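To make the similarity-kernel idea concrete, here is a minimal NumPy sketch. It is not the released implementation: it assumes a Gaussian kernel over the diffusion predictions with a hypothetical bandwidth `sigma`, and it uses a dense pairwise penalty that is only practical for a handful of points. The name `kernel_regularizer` and all parameters are illustrative.

```python
import numpy as np


def kernel_regularizer(material, guide, sigma=0.1):
    """Penalize material differences between points whose guide values agree.

    material: (N, C) material values being optimized (e.g. base color).
    guide:    (N, D) diffusion G-buffer predictions at the same points.
    sigma:    hypothetical kernel bandwidth (an assumption of this sketch).
    """
    material = np.asarray(material, dtype=np.float64)
    guide = np.asarray(guide, dtype=np.float64)
    # Pairwise squared distances in guide space -> Gaussian similarity weights.
    guide_d2 = ((guide[:, None, :] - guide[None, :, :]) ** 2).sum(axis=-1)
    weights = np.exp(-guide_d2 / (2.0 * sigma**2))
    # Pairwise squared differences of the optimized material.
    mat_d2 = ((material[:, None, :] - material[None, :, :]) ** 2).sum(axis=-1)
    # Weighted mean: large only where the guide is near-constant but material varies.
    return float((weights * mat_d2).sum() / weights.sum())


# The guide defines two regions: points 0-1 and points 2-3.
guide = np.array([[0.0], [0.0], [1.0], [1.0]])
# Material whose values differ from the guide's but are constant per region.
consistent = np.array([[0.2], [0.2], [0.8], [0.8]])
# Material that varies inside the first region (e.g. baked-in shading).
baked = np.array([[0.2], [0.5], [0.8], [0.8]])

loss_consistent = kernel_regularizer(consistent, guide)
loss_baked = kernel_regularizer(baked, guide)
```

In this toy case the consistent material incurs almost no penalty even though its values (0.2, 0.8) differ from the guide's (0, 1), while the material that varies within a guide-constant region is penalized. That is the sense in which the predictions act as a kernel rather than as targets.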

Pipeline overview

The method runs in three stages:

  1. Preprocessing — diffusion G-buffer prediction. A conditional diffusion model (e.g. DiffusionRenderer / RGB↔X) predicts per-view intrinsic G-buffers [base_color, roughness, metallic, normal]; the helpers under scripts/preprocess/ convert its per-view output into the *_single G-buffer layout the next stages consume.
  2. Neural surface reconstruction. A voxel-grid SDF is reconstructed by neural volume rendering, supervised by the predicted normals, then meshed with Marching Cubes (neural_surface_recon/).
  3. Physics-based inverse rendering (PBIR). On the reconstructed geometry, spatially varying material and an environment map are jointly optimized with a differentiable Mitsuba 3 renderer, minimizing a photometric loss plus the diffusion-guided material regularizer (diffusion_reg/, method config diffreg).

Repository layout

diffusion_reg/         # Stage 3: differentiable-rendering PBIR (the released method)
  run.py                 #   entrypoint: python diffusion_reg/run.py <configroot> <ckptroot>
  opt.py                 #   gin-configured optimization loop
  configs/diffreg/       #   the released method's config (pipeline.gin + stage gin)
  datasets/              #   NeuralPBIRDataset (builds the irtk scene from Stage-1 outputs)
  models/                #   microfacet + envmap dictionary-field models, losses
  mitsuba_ext/           #   custom Mitsuba 3 BSDFs / AOV integrator
  deps/                  #   git submodules (see below)
neural_surface_recon/    # Stage 2: neural SDF surface reconstruction
scripts/                 # dataset launch + relighting + quantitative-metric scripts
  dtc/ mii/ stanford_orb/ preprocess/ relit/
envmap_mii/              # relighting environment maps used by the MII / DTC evaluation

Submodules (diffusion_reg/deps/)

| Submodule | Purpose |
| --- | --- |
| irtk | Scene / model / optimization toolkit; the Mitsuba connector |
| mitsuba3 | Differentiable path-tracing backend (custom fork) |
| factor_fields | Dictionary-field spatial encoding for material / envmap |
| permutohedral_lattice | CUDA bilateral filter for the joint-bilateral material regularizer |

Setup

# 1. Clone with submodules
git clone --recursive <repo-url> && cd <repo>
# (or, after a plain clone)
git submodule update --init --recursive

# 2. Create the conda environments
#   (a) diffusion-reg — Stage-3 PBIR + DTC (Mitsuba) relighting
conda env create -f diffusion_reg/environment.yml
conda activate diffusion-reg
pip install torch==2.9.1 torchvision==0.24.1 --index-url https://download.pytorch.org/whl/cu128
# mmcv 1.7.0's legacy setup.py imports pkg_resources at build time, which recent setuptools
# (>=81) no longer provides under build isolation; install an older setuptools and build mmcv
# without isolation. (mmcv 1.7.0 also cannot build under Python 3.13, hence python=3.12 in
# environment.yml.)
pip install "setuptools<80" wheel
pip install --no-build-isolation mmcv==1.7.0
pip install -r diffusion_reg/requirements.txt
#   (b) neural-pbir — Stage-2 surface reconstruction + Stanford-ORB/MII (Blender) relighting
conda create -n neural-pbir python=3.10 -y
conda activate neural-pbir
# pytorch3d, tiny-cuda-nn and the local neural_surface_recon/lib/cuda extension all compile
# from source, so a CUDA toolkit (nvcc) matching the PyTorch CUDA build must be available and a
# GPU visible. Install the toolkit into the env from conda-forge (this provides nvcc; no
# system-wide CUDA install is required):
conda install -c conda-forge cuda-toolkit=12.8 ninja -y
export CUDA_HOME="${CUDA_HOME:-$CONDA_PREFIX}"
# Install a CUDA-matched PyTorch FIRST. Pick a CUDA build that supports your GPU and matches the
# cuda-toolkit above: CUDA 12.8 (cu128) covers everything through Blackwell / RTX 50-series
# (sm_120). On pre-Blackwell GPUs an older stack (e.g. torch 2.4 / cu124) also works, as long as
# the conda cuda-toolkit version is changed to match.
pip install torch==2.9.0 torchvision==0.24.0 --index-url https://download.pytorch.org/whl/cu128

# Build tools for the from-source packages below. mmcv 1.7.0's legacy setup.py imports
# pkg_resources, which recent setuptools (>=81) no longer provides, so pin setuptools<80:
pip install "setuptools<80" wheel

# pytorch3d + tiny-cuda-nn build from source and import torch in their setup.py, so build them
# without build isolation (torch is already installed above). tiny-cuda-nn needs the target
# GPU's compute capability; detect it from the current GPU (e.g. "8.6" -> "86"). Override
# TCNN_CUDA_ARCHITECTURES to build for a different GPU (Blackwell/50x0=120, H100=90, 40x0=89,
# 30x0/A6000=86, A100=80, V100=70).
export TCNN_CUDA_ARCHITECTURES="${TCNN_CUDA_ARCHITECTURES:-$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -1 | tr -d '.')}"
FORCE_CUDA=1 pip install --no-build-isolation "git+https://github.com/facebookresearch/pytorch3d.git@stable"
pip install --no-build-isolation "git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch"

# mmcv 1.7.0 (legacy build) and the local neural_surface_recon/lib/cuda extension (imports torch
# in its setup) must also skip build isolation:
pip install --no-build-isolation mmcv==1.7.0
pip install --no-build-isolation -r requirements.txt

# 3. Build the custom Mitsuba 3 (diffusion-reg env). This fork's custom BSDF / AOV-integrator
#    code is RGB-only, so restrict the build to the RGB variants the method uses. (Mitsuba's
#    default variant set also enables spectral variants, which do not compile against this fork.)
cmake -S diffusion_reg/deps/mitsuba3 -B diffusion_reg/deps/mitsuba3/build -GNinja \
    -DMI_DEFAULT_VARIANTS="scalar_rgb,cuda_ad_rgb"
# Build the Dr.Jit Python extension first. The Python type-stub generation step imports it, but
# the default build graph can schedule stub generation before the extension is linked (a
# parallel-build race); building it explicitly first makes the following full build deterministic.
ninja -C diffusion_reg/deps/mitsuba3/build drjit-python
ninja -C diffusion_reg/deps/mitsuba3/build
# Make `import mitsuba` resolve (do this in every shell, or add to the env's activate.d):
source diffusion_reg/deps/mitsuba3/build/setpath.sh

# 4. Put the repo root and irtk on PYTHONPATH so `import diffusion_reg` / `import irtk` work
export PYTHONPATH="$PWD:$PWD/diffusion_reg:$PWD/diffusion_reg/deps/irtk:$PYTHONPATH"

Both conda envs use the vendored irtk (diffusion_reg/deps/irtk) via this PYTHONPATH — it is not pip-installed. Export it in whichever shell you launch the driver scripts from (the neural-pbir relight/eval scripts import irtk.io too); a login/profile export covers both envs.
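A quick way to confirm that the path setup took effect is to ask Python whether it can locate the modules without importing them. The sketch below is a hypothetical preflight helper, not part of the repository. The module list is an assumption drawn from the setup steps; `drjit` is included on the assumption that the Mitsuba build's setpath.sh exposes it alongside `mitsuba`, and the `mitsuba` / `drjit` entries only apply to the diffusion-reg env.

```python
import importlib.util


def find_missing_modules(names):
    """Return the top-level module names that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Modules the setup steps are meant to make importable (assumed list;
# check it inside the activated conda env, after sourcing setpath.sh).
REQUIRED = ["mitsuba", "drjit", "irtk", "diffusion_reg"]

if __name__ == "__main__":
    missing = find_missing_modules(REQUIRED)
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All required modules resolve.")
```

`find_spec` only locates a module, so this check stays fast and does not initialize CUDA or OptiX.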

Blackwell (RTX 50-series / sm_120) GPUs. The pinned Mitsuba/Dr.Jit predate Blackwell; on sm_120 with a recent OptiX driver, the rendering kernels fail to compile with ptx2llvm ... Unexpected cast from {i32,i1} to float / Target architecture invalid (mitsuba3 #1891). Apply the included Dr.Jit-core patch to the nested submodule before the cmake/ninja steps above (older GPUs do not need it):

git -C diffusion_reg/deps/mitsuba3/ext/drjit/ext/drjit-core \
    apply "$(pwd)/scripts/blackwell_drjit_shfl.patch"
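Whether the patch is relevant depends on the GPU's compute capability. The sketch below mirrors, in Python, the `tr -d '.'` conversion used for TCNN_CUDA_ARCHITECTURES in the setup steps and adds a hypothetical `needs_blackwell_patch` helper. Treating only an exact `120` as Blackwell is an assumption of this sketch, based on the sm_120 architecture named above.

```python
def cuda_arch_from_compute_cap(compute_cap):
    """Convert an nvidia-smi compute capability such as "8.6" to "86"."""
    major, minor = compute_cap.strip().split(".")
    return f"{int(major)}{int(minor)}"


def needs_blackwell_patch(compute_cap):
    """Return True for sm_120, the architecture named in the note above.

    Matching only an exact "120" is an assumption of this sketch.
    """
    return cuda_arch_from_compute_cap(compute_cap) == "120"
```

The input is the first line printed by `nvidia-smi --query-gpu=compute_cap --format=csv,noheader`, for example `cuda_arch_from_compute_cap("8.6")` for an RTX 30-series card.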

Relighting / Blender. The Stanford-ORB and Synthetic4Relight relighting steps render the reconstructed asset in Blender 3.6 (Linux x86-64). Install it automatically:

bash scripts/setup_blender.sh

This downloads Blender into third_party/, which the relight scripts auto-detect (or set export BLENDER=/path/to/blender to use your own). DTC relighting uses the in-repo Mitsuba renderer and needs no external dependency.

GPU note: rendering uses Mitsuba 3 with the cuda_ad_rgb variant and OptiX, so a CUDA GPU is required. An NVIDIA GPU with hardware ray-tracing (RT) cores is recommended.
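A small guard can make a missing variant fail early with a readable message. This is a sketch, not code from the repository: `require_variant` and `activate_mitsuba` are hypothetical helper names, and the only Mitsuba calls used are `mitsuba.variants()` and `mitsuba.set_variant()`.

```python
REQUIRED_VARIANT = "cuda_ad_rgb"


def require_variant(available, required=REQUIRED_VARIANT):
    """Raise a descriptive error if the required Mitsuba variant was not built."""
    if required not in available:
        raise RuntimeError(
            f"Mitsuba variant {required!r} is not available; "
            f"built variants: {sorted(available)}"
        )
    return required


def activate_mitsuba():
    """Import Mitsuba lazily and select the required variant."""
    import mitsuba as mi  # resolvable only after sourcing setpath.sh

    mi.set_variant(require_variant(mi.variants()))
    return mi
```

Keeping the check in a pure function means it can be exercised on machines without a GPU, while `activate_mitsuba()` is only called where the custom build is on the path.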

Running the method end to end

Stage 2 — neural surface reconstruction (produces mesh.obj + the SDF, with normal supervision from the diffusion G-buffers):

python neural_surface_recon/run_template.py \
    --template neural_surface_recon/configs/template_<dataset>_regnormal.py \
    --savemem \
    --result_name neural_surface_recon_regnormal \
    <path_to_dataroot>

Stage 3 — PBIR (the released method). The entrypoint is always:

python diffusion_reg/run.py diffusion_reg/configs/diffreg <ckptroot>

<ckptroot> is the per-scene result directory (e.g. results/<dataset>/<scene>) that already contains the Stage-2 reconstruction (<ckptroot>/neural_surface_recon_regnormal/mesh.obj), with the diffusion G-buffers (*_single) available in the scene's data directory. Outputs are written to <ckptroot>/diffusion_reg_full_diffreg/ (base_color.exr, roughness.exr, metallic.exr, envmap.exr, mesh.obj).
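The directory contract described above can be checked before and after a run. The helper names below are hypothetical; the paths and file names are the ones listed in this section.

```python
from pathlib import Path

# Stage-2 input that Stage 3 expects to find under <ckptroot>.
STAGE2_MESH = Path("neural_surface_recon_regnormal") / "mesh.obj"
# Stage-3 output directory and the files listed for it.
STAGE3_DIR = "diffusion_reg_full_diffreg"
STAGE3_OUTPUTS = (
    "base_color.exr",
    "roughness.exr",
    "metallic.exr",
    "envmap.exr",
    "mesh.obj",
)


def stage3_ready(ckptroot):
    """Return True if the Stage-2 mesh exists under ckptroot."""
    return (Path(ckptroot) / STAGE2_MESH).is_file()


def missing_stage3_outputs(ckptroot):
    """Return the expected Stage-3 output files that are not present yet."""
    out_dir = Path(ckptroot) / STAGE3_DIR
    return [name for name in STAGE3_OUTPUTS if not (out_dir / name).is_file()]
```

For example, `stage3_ready("results/<dataset>/<scene>")` should be true before launching run.py, and `missing_stage3_outputs(...)` should return an empty list once the run has finished. The check does not cover the `*_single` G-buffers, which live in the scene's data directory.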

Per-dataset driver scripts wrap training + relighting + metric evaluation:

| Dataset | One-case command | Relight backend |
| --- | --- | --- |
| DTC | `bash scripts/dtc/dtc_our_method.sh <scene> diffreg` | Mitsuba (in-repo) |
| Stanford-ORB | `bash scripts/stanford_orb/stanford_orb_our_method.sh <scene> diffreg` | Blender |
| MII (Synthetic4Relight) | `bash scripts/mii/mii_our_method.sh <scene> diffreg` | Blender |

To process a whole dataset, loop over the scene directories you extracted under data/ (each scene is a single, self-contained GPU run):

for d in data/dtc/adt_dslr/*/; do
  scene=$(basename "$d")
  bash scripts/dtc/dtc_regnormal_only.sh "$scene"           # Stage 2
  bash scripts/dtc/dtc_our_method.sh     "$scene" diffreg   # Stage 3 + relight + metrics
done

Substitute the stanford_orb / mii driver and data path for the other datasets.
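The same loop can be written in Python when it helps to stop at the first failing scene. This is a sketch with hypothetical helper names; it assumes the driver scripts follow the `scripts/<dataset>/<dataset>_regnormal_only.sh` and `scripts/<dataset>/<dataset>_our_method.sh` pattern seen in this README, and that it is launched from the repository root.

```python
import subprocess


def scene_commands(dataset, scene, method="diffreg"):
    """Return the Stage-2 and Stage-3 command lines for one scene."""
    prefix = f"scripts/{dataset}/{dataset}"
    return [
        ["bash", f"{prefix}_regnormal_only.sh", scene],      # Stage 2
        ["bash", f"{prefix}_our_method.sh", scene, method],  # Stage 3 + relight + metrics
    ]


def run_scenes(dataset, scenes, method="diffreg"):
    """Run every scene in order, stopping at the first failing command."""
    for scene in scenes:
        for argv in scene_commands(dataset, scene, method):
            subprocess.run(argv, check=True)
```

With `check=True`, a failed Stage 2 raises `subprocess.CalledProcessError`, so Stage 3 is never started on a scene that has no mesh.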

Each scene writes its own metrics under its result directory results/<dataset>/<scene>/diffusion_reg_full_diffreg/ — DTC and Synthetic4Relight relight metrics land in the per-scene relight JSON (relit/ and blender_relit/, respectively). Stanford-ORB metrics are produced by its evaluation flow (below).

Reproducing one case per dataset

For each dataset, the commands below run Stage 2 (neural surface reconstruction) and then Stage 3 (PBIR + relighting + metrics) on a single scene; the results should match the paper's per-scene numbers to within Monte-Carlo rendering noise. Run them from the repository root, with the scene's G-buffers already prepared (see README_DATA.md). The example scenes shipped in data/ come ready to process; the DTC-Synthetic release zip additionally ships the ground-truth relit images so its metrics are reproducible without the raw GT assets.

# DTC-Synthetic (in-repo Mitsuba relight)
bash scripts/dtc/dtc_regnormal_only.sh TeaPot_B094FQW6Q4_EmeraldGoldTop_scene002    # Stage 2
bash scripts/dtc/dtc_our_method.sh            TeaPot_B094FQW6Q4_EmeraldGoldTop_scene002 diffreg

# Stanford-ORB (Blender relight)
bash scripts/stanford_orb/stanford_orb_regnormal_only.sh cup_scene006    # Stage 2
bash scripts/stanford_orb/stanford_orb_our_method.sh            cup_scene006 diffreg

# Synthetic4Relight / mii (Blender relight)
bash scripts/mii/mii_regnormal_only.sh hotdog                               # Stage 2
bash scripts/mii/mii_our_method.sh            hotdog diffreg

Stanford-ORB benchmark metrics. stanford_orb_our_method.sh only renders the relit images and per-view geometry buffers; the official Stanford-ORB numbers are computed by the benchmark's own code. Clone the Stanford-ORB repository, create its orb conda environment as described there, and point ORB_ROOT at it. Evaluation needs a GPU (the benchmark's LPIPS runs on CUDA).

Our relighting outputs are written as 4-channel (RGBA) EXRs, but the stock Stanford-ORB EXR loader asserts exactly 3 channels and aborts on them. Apply the included patch once to a fresh clone (it drops the alpha channel before the assert):

git -C "$ORB_ROOT" apply "$(pwd)/scripts/stanford_orb/rgba_exr.patch"
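For readers who want to see what dropping the alpha channel amounts to, here is a NumPy sketch. It does not reproduce the contents of rgba_exr.patch; `drop_alpha` is a hypothetical helper operating on an image already loaded as an array.

```python
import numpy as np


def drop_alpha(image):
    """Return the RGB channels of an (H, W, 3) or (H, W, 4) image array."""
    image = np.asarray(image)
    if image.ndim != 3 or image.shape[-1] not in (3, 4):
        raise ValueError(f"expected (H, W, 3) or (H, W, 4), got {image.shape}")
    return image[..., :3]


rgba = np.ones((2, 2, 4), dtype=np.float32)  # stand-in for a loaded RGBA EXR
rgb = drop_alpha(rgba)
assert rgb.shape[-1] == 3  # the kind of check a 3-channel loader performs
```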

To evaluate a single case, pass its scene name (the relighting metric additionally needs the object's other lighting scenes — see README_DATA.md):

ORB_ROOT=/path/to/Stanford-ORB \
  bash scripts/stanford_orb/eval_method.sh diffusion_reg_full_diffreg cup_scene006

The first argument is the result-directory name diffusion_reg_full_<method> (not the bare method name). Per-scene metrics are written to results/stanford_orb/eval_inputs_diffusion_reg_full_diffreg_results.json. Omit the scene argument to evaluate the full 42-scene benchmark (after training all scenes).

Reference numbers for these cases (the paper's diffreg result):

| Dataset / scene | Metric | Value |
| --- | --- | --- |
| DTC TeaPot_B094FQW6Q4_EmeraldGoldTop_scene002 | relit PSNR / SSIM / LPIPS | 40.54 / 0.993 / 0.011 |
| Stanford-ORB cup_scene006 | relit PSNR-LDR / PSNR-HDR / SSIM / LPIPS | 36.01 / 27.39 / 0.987 / 0.030 |
| MII hotdog | relit PSNR / aligned-albedo PSNR | 29.06 / 26.83 |

The Stanford-ORB numbers are the benchmark's relighting (novel-light) scores. Because PBIR is a Monte-Carlo optimization, a fresh run reproduces these within rendering noise (typically within a few tenths of a dB), not bit-for-bit.
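To put "a few tenths of a dB" in perspective, the textbook PSNR definition is sketched below. The function is illustrative only: the dataset scripts and the Stanford-ORB benchmark compute their own variants (for example PSNR-HDR and any scale alignment), which may differ from this plain form.

```python
import numpy as np


def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_val]."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((pred - target) ** 2)
    if mse == 0.0:
        return float("inf")
    return float(10.0 * np.log10(max_val**2 / mse))


target = np.zeros((4, 4, 3))
pred = np.full((4, 4, 3), 0.1)  # uniform error of 0.1, so MSE is about 0.01
print(psnr(pred, target))       # approximately 20 dB
```

Because PSNR is logarithmic in the mean squared error, a 0.3 dB shift corresponds to roughly a 7% change in MSE, which is a plausible magnitude for run-to-run Monte-Carlo variation.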

Previewing the relit results

Once the three cases above have produced their relit images, assemble a side-by-side preview (the HDR Stanford-ORB frame is tone-mapped to sRGB automatically):

python scripts/make_examples_figure.py    # -> assets/example_relight.png

One relit frame per example case, rendered under novel illumination.
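As a rough illustration of tone-mapping an HDR frame for display, the sketch below clips linear radiance to [0, 1] and applies the standard sRGB transfer function. The operator actually used by make_examples_figure.py is not documented here, so the clipping step is an assumption of this sketch.

```python
import numpy as np


def linear_to_srgb(linear):
    """Clip linear radiance to [0, 1] and apply the sRGB transfer function."""
    x = np.clip(np.asarray(linear, dtype=np.float64), 0.0, 1.0)
    # Linear segment near black, power-law segment elsewhere.
    return np.where(x <= 0.0031308, 12.92 * x, 1.055 * np.power(x, 1.0 / 2.4) - 0.055)
```

Values above 1.0, which are common in HDR renders, saturate to white under this simple mapping; a different exposure or a filmic curve would preserve more highlight detail.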

Acknowledgments

The neural surface reconstruction stage builds on Neural-PBIR (Sun et al., ICCV 2023). The differentiable renderer is a custom fork of Mitsuba 3; material and lighting encodings use Factor Fields; the bilateral material regularizer uses a CUDA permutohedral-lattice filter (Monteiro et al.).

Citation

@inproceedings{diffusion_material_regularization_eccv,
  author    = {Jingwang Ling and Lifan Wu and Feng Xu and Shuang Zhao},
  title     = {Diffusion-Based Material Regularization for Physics-Based Inverse Rendering},
  booktitle = {European Conference on Computer Vision (ECCV)},
  year      = {2026}
}

If you use the neural surface reconstruction stage, please also cite:

@inproceedings{neuralpbir2023,
  author    = {Cheng Sun and Guangyan Cai and Zhengqin Li and Kai Yan and Cheng Zhang and
               Carl Marshall and Jia-Bin Huang and Shuang Zhao and Zhao Dong},
  title     = {Neural-PBIR Reconstruction of Shape, Material, and Illumination},
  booktitle = {ICCV},
  year      = {2023}
}

License

Released under CC BY-NC 4.0 (NonCommercial only) — see LICENSE, inherited from Meta's Digital Twin Catalog. Vendored third-party projects under diffusion_reg/deps/ keep their own licenses. See NOTICE for full attribution.
