Nicolas Sereyjol-Garros · Ellington Kirby · Victor Besnier · Nermin Samet
Valeo.ai, Paris, France
Accepted at ICRA 2026
LiDAR scene synthesis is an emerging solution to the scarcity of 3D data for robotic tasks such as autonomous driving. Recent approaches employ diffusion or flow matching models to generate realistic scenes, but 3D data remains limited compared to RGB datasets with millions of samples. We introduce R3DPA, the first LiDAR scene generation method to unlock image-pretrained priors for LiDAR point clouds, and leverage self-supervised 3D representations for state-of-the-art results. Specifically, we (i) align intermediate features of our generative model with self-supervised 3D features, which substantially improves generation quality; (ii) transfer knowledge from large-scale image-pretrained generative models to LiDAR generation, mitigating the limited size of LiDAR datasets; and (iii) enable point cloud control at inference for object inpainting and scene mixing with solely an unconditional model. On the KITTI-360 benchmark, R3DPA achieves state-of-the-art performance.
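Contribution (iii) requires only an unconditional model; one common way to obtain such inference-time control is RePaint-style masked sampling, where the known region of a reference scene is re-imposed at every integration step. The sketch below is illustrative only: the function, the linear-interpolation noise schedule, and all names are assumptions, not the repository's actual API.

```python
import torch

def inpaint_sample(model, x_known, mask, steps=50):
    """Mask-guided Euler sampling with an unconditional flow-matching model.

    x_known: (B, C, H, W) latent of the reference scene.
    mask:    (B, 1, H, W) binary mask; 1 = keep reference, 0 = generate freely.
    model:   velocity field v(x, t), assuming x_t = (1 - t) * noise + t * data.
    All names here are hypothetical, for illustration only.
    """
    x = torch.randn_like(x_known)
    dt = 1.0 / steps
    for i in range(steps):
        t = torch.full((x.shape[0],), i * dt, device=x.device)
        x = x + dt * model(x, t)  # Euler step of the unconditional ODE
        # Re-impose the known region, noised to the current time level,
        # so kept and generated content stay statistically consistent.
        t_next = (i + 1) * dt
        known_t = (1.0 - t_next) * torch.randn_like(x_known) + t_next * x_known
        x = torch.where(mask.bool(), known_t, x)
    return x
```

Scene mixing can be obtained in the same spirit, with a mask that keeps regions from two different reference scenes.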
LiDAR point cloud generation from range images commonly follows a two-stage approach: a VAE is trained independently and then frozen, and the generative model is trained on its latent space. In contrast, our method leverages priors from a backbone pretrained on large-scale image datasets. The alignment step trains the VAE from scratch while initializing and freezing the generative model with pretrained weights. This stage ensures that the latent space of the newly trained VAE remains compatible with the knowledge of the pretrained generative model. We then jointly optimize the VAE encoder and the generative model under the supervision of 3D representations. Range VAE denotes a model trained on range images.
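The 3D-representation supervision described above can be pictured as a REPA-style loss that pulls projected intermediate features of the generative model toward frozen self-supervised 3D features. A minimal sketch, with hypothetical shapes and a hypothetical projection head (the actual loss and feature layout may differ):

```python
import torch
import torch.nn.functional as F

def alignment_loss(gen_feats, ssl_feats, proj):
    """Negative cosine similarity between projected generator features
    and frozen self-supervised 3D features.

    gen_feats: (B, N, C_gen) intermediate tokens of the generative model.
    ssl_feats: (B, N, C_ssl) precomputed 3D features (e.g. ScaLR), frozen.
    proj:      small trainable MLP mapping C_gen -> C_ssl (hypothetical).
    """
    z = F.normalize(proj(gen_feats), dim=-1)
    target = F.normalize(ssl_feats.detach(), dim=-1)  # no gradient to targets
    return -(z * target).sum(dim=-1).mean()
```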
If you find our work useful, please consider citing:
```bibtex
@inproceedings{sereyjol2026r3dpa,
  title={Leveraging 3D Representation Alignment and RGB Pretrained Priors for LiDAR Scene Generation},
  author={Nicolas Sereyjol-Garros and Ellington Kirby and Victor Besnier and Nermin Samet},
  year={2026},
  booktitle={ICRA},
}
```

To set up our environment, please run:
```bash
git clone https://github.com/valeoai/R3DPA.git
cd R3DPA
conda env create -f environment.yml
conda activate r3dpa
```

Put the KITTI-360 dataset under the `dataset` folder.
Download ScaLR.

Install the WaffleIron package:

```bash
cd ..
git clone https://github.com/valeoai/WaffleIron
cd WaffleIron
pip install -e ./
cd ../R3DPA
```

Precompute features:
```bash
python feature_extraction/preprocess_scalr_fearures.py \
    --dataset-path dataset \
    --model-path pretrained_weights/scalr/WI_768-DINOv2_ViT_L_14-NS_KI_PD
```

Follow the steps described in REPA-E, or download our pretrained weights.
- End-to-end training from scratch:

  ```bash
  bash scripts/train_e2e.sh
  ```

- VAE alignment: set the path of the SiT checkpoint pretrained on RGB images in `scripts/train_vae_align.sh`, then run:

  ```bash
  bash scripts/train_vae_align.sh
  ```

- End-to-end tuning from pretrained weights:

  ```bash
  bash scripts/tuning_e2e.sh
  ```

To generate samples and save them in a `.npz` file for evaluation, run the following script after making sure the parameters match your model path:

```bash
bash scripts/sample.sh
```
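For a quick sanity check of the generated file, the arrays can be inspected directly with NumPy. The path and key handling below are assumptions; list `data.files` to see what `scripts/sample.sh` actually stores.

```python
import numpy as np

# Hypothetical output path; adjust to wherever sample.sh writes its .npz.
data = np.load("samples/generated.npz")
print(data.files)            # names of the stored arrays
first = data[data.files[0]]  # e.g. a batch of range images or point clouds
print(first.shape, first.dtype)
```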
Download statistics from the release or recompute them with the following command:

```bash
python -m eval.extract_logits_dataset \
    --save_path log/activations \
    --dataset_path dataset/KITTI-360
```

Install the following packages and run the evaluation script:
```bash
apt-get install libsparsehash-dev
pip install git+https://github.com/mit-han-lab/[email protected]
python evaluate.py --config-path configs/eval/ablations/r3dpa.yaml
```

| Method | FRID ×10⁰ | FLD ×10⁻¹ | FSVD ×10⁰ | FPVD ×10⁰ | JSD ×10⁻² | MMD ×10⁻⁵ |
|---|---|---|---|---|---|---|
| UltraLiDAR | – | – | 73.59 | 65.83 | 74.72 | 123.30 |
| LiDM | 47.33 | 10.19 | 16.01 | 17.36 | 19.17 | 11.32 |
| LiDM w/ APE | 42.09 | 9.76 | 13.68 | 13.86 | 11.69 | 9.95 |
| R2DM | 15.54 | <u>7.89</u> | <u>12.67</u> | <u>13.21</u> | <u>5.78</u> | <u>8.50</u> |
| R2Flow | <u>8.87</u> | 8.36 | 20.80 | 20.27 | 5.97 | **7.84** |
| R3DPA (ours) | **8.46** | **6.34** | **9.83** | **11.00** | **5.67** | 8.72 |
- **Bold** = best result; <u>underlined</u> = second-best result.
- FRID and FLD measure generation quality at the range-image level.
- FSVD and FPVD measure quality in the point-cloud space.
- JSD and MMD evaluate similarity in the bird's-eye view.
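The F-prefixed metrics above are Fréchet-style distances between Gaussian fits of feature activations from real and generated scenes; each metric uses a different feature extractor. A minimal sketch of the shared computation, assuming the means and covariances have already been extracted:

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2).

    mu*:    (D,) feature means; sigma*: (D, D) feature covariances,
    e.g. computed from extractor activations on real vs. generated data.
    """
    diff = mu1 - mu2
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # discard tiny imaginary parts from sqrtm
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))
```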
This codebase is largely built upon REPA-E, ScaLR, and WaffleIron. We sincerely thank the authors for making their work publicly available.

