Offline Reinforcement Learning of High-Quality Behaviors Under Robust Style Alignment

Paper | Website

Overview

This repository is the official implementation of the paper "Offline Reinforcement Learning of High-Quality Behaviors Under Robust Style Alignment", published at ICML 2026 (Spotlight).

Installation

Create and activate a conda environment:

conda create -n sciql python=3.10 -y && conda activate sciql

Install pip dependencies:

python -m pip install --no-cache-dir -r requirements.txt

Set up the environment variables:

conda env config vars set PYTHONPATH="$PYTHONPATH:$PWD"
conda env config vars set MUJOCO_GL=egl
conda deactivate && conda activate sciql
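
After reactivating the environment, a quick sanity check can confirm that both variables took effect. The following is a minimal sketch, not part of the repository: `check_environment` is a hypothetical helper, and it assumes the project root is the directory that was `$PWD` when the variables were set.

```python
import os
from pathlib import Path


def check_environment(project_root, environ=None):
    """Return a list of problems found (hypothetical helper, not part of the repo)."""
    environ = os.environ if environ is None else environ
    problems = []

    # The setup step above sets MUJOCO_GL=egl.
    if environ.get("MUJOCO_GL") != "egl":
        problems.append("MUJOCO_GL is not set to 'egl'")

    # The setup step above appends the project root to PYTHONPATH.
    root = Path(project_root).resolve()
    entries = [
        Path(entry).resolve()
        for entry in environ.get("PYTHONPATH", "").split(os.pathsep)
        if entry  # skip empty entries such as the one left by a leading separator
    ]
    if root not in entries:
        problems.append(f"{root} is not on PYTHONPATH")

    return problems


if __name__ == "__main__":
    # Assumes this is run from the root of the project.
    for problem in check_environment(Path.cwd()):
        print(problem)
```

An empty result means both variables look as expected; run it from the project root inside the activated `sciql` environment.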

Download the datasets from our website. Organize them in a datasets/ folder at the root of the project:

datasets/
├── diverse_mujoco/
│   ├── mujoco_halfcheetah-fix/
│   │   ├── fix-val.npz
│   │   └── fix.npz
│   ├── mujoco_halfcheetah-stitch/
│   │   ├── stitch-val.npz
│   │   └── stitch.npz
│   └── mujoco_halfcheetah-vary/
│       ├── vary-val.npz
│       └── vary.npz
└── traj2d/
    ├── random_circles-inplace-v0/
    └── random_circles-navigate-v0/
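
Before launching experiments, it can help to verify that the downloaded files match the layout above. The sketch below is an illustration only: `missing_dataset_paths` is a hypothetical helper that is not part of this repository, and it checks nothing beyond the paths listed in the tree. Since the contents of the traj2d folders are not listed, only the folders themselves are checked.

```python
from pathlib import Path

# Expected files, copied from the tree above.
EXPECTED_FILES = [
    f"diverse_mujoco/mujoco_halfcheetah-{name}/{name}{suffix}.npz"
    for name in ("fix", "stitch", "vary")
    for suffix in ("", "-val")
]

# The tree lists these folders without their contents, so only the
# directories themselves are checked.
EXPECTED_DIRS = [
    "traj2d/random_circles-inplace-v0",
    "traj2d/random_circles-navigate-v0",
]


def missing_dataset_paths(root="datasets"):
    """Return expected paths absent under `root` (hypothetical helper)."""
    root = Path(root)
    missing = [p for p in EXPECTED_FILES if not (root / p).is_file()]
    missing += [p for p in EXPECTED_DIRS if not (root / p).is_dir()]
    return missing


if __name__ == "__main__":
    missing = missing_dataset_paths()
    if missing:
        print("Missing dataset paths:")
        for path in missing:
            print(f"  datasets/{path}")
    else:
        print("Dataset layout looks complete.")
```

Run it from the root of the project so that the default `datasets` path resolves correctly.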

Usage

To reproduce the results, launch the experiments with the following commands:

# For Halfcheetah

python experiments/control/launch.py -cp yamls/diverse_mujoco/bc/jax -cn mujoco_halfcheetah

python experiments/control/launch.py -cp yamls/diverse_mujoco/cbc/jax -cn mujoco_halfcheetah

python experiments/control/launch.py -cp yamls/diverse_mujoco/bcpmi_joint/jax -cn mujoco_halfcheetah

python experiments/control/launch.py -cp yamls/diverse_mujoco/scbc/jax -cn mujoco_halfcheetah

python experiments/control/launch.py -cp yamls/diverse_mujoco/sciql_joint/jax -cn mujoco_halfcheetah

python experiments/control/launch.py -cp yamls/diverse_mujoco/sorl/jax -cn mujoco_halfcheetah

# For Circle2D

python experiments/control/launch.py -cp yamls/traj2d/bc/jax -cn random_circles-v0

python experiments/control/launch.py -cp yamls/traj2d/cbc/jax -cn random_circles-v0

python experiments/control/launch.py -cp yamls/traj2d/bcpmi_joint/jax -cn random_circles-v0

python experiments/control/launch.py -cp yamls/traj2d/scbc/jax -cn random_circles-v0

python experiments/control/launch.py -cp yamls/traj2d/sciql_joint/jax -cn random_circles-v0

python experiments/control/launch.py -cp yamls/traj2d/sorl/jax -cn random_circles-v0
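
The twelve commands above differ only in the benchmark folder, the method folder, and the config name, so they can be generated in a loop. The script below is a convenience sketch, not something shipped with the repository: `build_commands` and `run_all` are hypothetical names, and the method and config lists are simply copied from the commands above. It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys

# Method folders and config names, copied from the commands above.
METHODS = ["bc", "cbc", "bcpmi_joint", "scbc", "sciql_joint", "sorl"]
CONFIGS = {
    "diverse_mujoco": "mujoco_halfcheetah",
    "traj2d": "random_circles-v0",
}


def build_commands(python=sys.executable):
    """Build one launch command per (benchmark, method) pair (hypothetical helper)."""
    commands = []
    for suite, config_name in CONFIGS.items():
        for method in METHODS:
            commands.append([
                python,
                "experiments/control/launch.py",
                "-cp", f"yamls/{suite}/{method}/jax",
                "-cn", config_name,
            ])
    return commands


def run_all(dry_run=True):
    """Print every command, and execute them sequentially unless dry_run is set."""
    for command in build_commands():
        print(" ".join(command))
        if not dry_run:
            # check=True stops at the first experiment that fails.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

Calling `run_all(dry_run=False)` from the project root launches the experiments one after another; running them sequentially is a choice made for this sketch, not a requirement of the codebase.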

Acknowledgments

This codebase takes inspiration from the jax_corl library.

Citation

Accepted at ICML 2026. Proceedings citation will be updated once available.

@inproceedings{petitbois2026offline,
  title     = {Offline Reinforcement Learning of High-Quality Behaviors Under Robust Style Alignment},
  author    = {Petitbois, Mathieu and Portelas, R{\'e}my and Lamprier, Sylvain},
  booktitle = {International Conference on Machine Learning (ICML)},
  year      = {2026}
}
