
DiCo: Revitalizing ConvNets for Scalable and Efficient Diffusion Modeling

¹CASIA  ²UCAS  ³ByteDance
NeurIPS 2025 Spotlight
Diffusion ConvNet is Stronger than you Think!

⭐ If DiCo is helpful to your projects, please help star this repo. Thanks! 🤗

[Figure: performance comparison]

🔥 News

  • 2025.9.19: We release code, models and training logs of DiCo.
  • 2025.9.18: DiCo is accepted by NeurIPS 2025 as a spotlight paper! 🎉
  • 2025.5.18: This repo is created.

[Figure: generated samples]

📷 Results

Our DiCo models consistently require fewer GFLOPs than their Transformer counterparts while achieving superior generative performance.

| Model-iters | Resolution | CFG | FID | IS | Params | FLOPs | ckpt | log |
|---|---|---|---|---|---|---|---|---|
| DiCo-S-400k | 256x256 | 1.0 | 49.97 | 31.38 | 33.1M | 4.25G | ckpt | log |
| DiCo-B-400k | 256x256 | 1.0 | 27.20 | 56.52 | 130.0M | 16.88G | ckpt | log |
| DiCo-L-400k | 256x256 | 1.0 | 13.66 | 91.37 | 463.9M | 60.24G | ckpt | log |
| DiCo-XL-400k | 256x256 | 1.0 | 11.67 | 100.42 | 701.2M | 87.30G | ckpt | log |
| DiCo-XL-3750k | 256x256 | 1.4 | 2.05 | 282.17 | 701.2M | 87.30G | ckpt | log |

🎰 Training

I - Prepare training data

Similar to fast-DiT, we use a VAE to extract ImageNet features before training:

```shell
torchrun --nnodes=1 --nproc_per_node=1 --master_port=1234 extract_features.py \
    --model DiT-XL/2 \
    --data-path /path/to/imagenet/train \
    --features-path /path/to/store/features
```
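The extracted latents are then read back at training time instead of raw images. A minimal loader sketch, assuming a fast-DiT-style layout in which `extract_features.py` writes one `.npy` latent per image plus a matching class-label file (the `features/`/`labels/` directory names and latent shape here are illustrative assumptions, not the script's exact output):

```python
import os
import numpy as np

class FeatureDataset:
    """Minimal dataset over precomputed VAE latents.

    Assumed layout (illustrative; check the actual output of
    extract_features.py):
        root/features/0.npy, 1.npy, ...   latent tensors
        root/labels/0.npy,   1.npy, ...   integer class labels
    """

    def __init__(self, root):
        self.features_dir = os.path.join(root, "features")
        self.labels_dir = os.path.join(root, "labels")
        # Pair features with labels by shared file name.
        self.names = sorted(os.listdir(self.features_dir))

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        x = np.load(os.path.join(self.features_dir, name))
        y = np.load(os.path.join(self.labels_dir, name))
        return x, y
```

Training over cached latents skips the VAE encoder on every step, which is the main speedup fast-DiT introduced.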

II - Training for DiCo

To launch DiCo-XL (256x256) training with 8 GPUs on 1 node:

```shell
export WANDB_API_KEY='YOUR_WANDB_API_KEY'
accelerate launch \
    --multi_gpu \
    --num_processes=8 \
    --main_process_port=1234 \
    --mixed_precision=no \
    train_accelerate.py \
    --feature-path=/path/to/store/features \
    --image-size=256 \
    --model-domain=dico \
    --model=DiCo-XL \
    --results-dir=/path/to/store/exp/results \
    --exp-name=DiCo-XL-256
```

To launch DiCo-XL (256x256) training with 32 GPUs on 4 nodes:

```shell
export WANDB_API_KEY='YOUR_WANDB_API_KEY'
accelerate launch \
    --multi_gpu \
    --num_processes=32 \
    --num_machines=4 \
    --main_process_ip=... \
    --main_process_port=1234 \
    --machine_rank=... \
    --mixed_precision=no \
    train_accelerate.py \
    --feature-path=/path/to/store/features \
    --image-size=256 \
    --model-domain=dico \
    --model=DiCo-XL \
    --results-dir=/path/to/store/exp/results \
    --exp-name=DiCo-XL-256
```

⚡ Evaluation (FID, Inception Score, etc.)

To sample 50K images from our pre-trained DiCo-XL (400K iters, w/o cfg, FID=11.67) model over 8 GPUs, run:

```shell
torchrun --nnodes=1 --nproc_per_node=8 --master_port=1234 \
    sample_ddp.py \
    --ckpt=/path/to/DiCo-XL-400K-256x256.pt \
    --model=DiCo-XL \
    --model-domain=dico \
    --cfg-scale=1.0 \
    --global-seed=1234
```

To sample 50K images from our pre-trained DiCo-XL (3750K iters, w/ cfg=1.4, FID=2.05) model over 8 GPUs, run:

```shell
torchrun --nnodes=1 --nproc_per_node=8 --master_port=1234 \
    sample_ddp.py \
    --ckpt=/path/to/DiCo-XL-3750K-256x256.pt \
    --model=DiCo-XL \
    --model-domain=dico \
    --cfg-scale=1.4 \
    --global-seed=1234
```

These scripts generate a folder of samples and a .npz file that can be used directly with ADM's TensorFlow evaluation suite to compute FID, Inception Score, and other metrics.
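For reference, the .npz packing step works roughly like this (a sketch following the DiT/ADM convention of storing all samples as a single uint8 NHWC array under NumPy's default `arr_0` key; the exact key and shapes are assumptions based on those codebases, not verified against sample_ddp.py):

```python
import numpy as np

def pack_samples(samples, npz_path):
    """Pack generated images into the .npz format consumed by ADM's
    TensorFlow evaluation suite.

    samples: uint8 array of shape (N, H, W, 3), e.g. (50000, 256, 256, 3).
    The evaluator reads the sample array from the archive; saving it as
    the first positional array stores it under the key 'arr_0'.
    """
    samples = np.asarray(samples, dtype=np.uint8)
    assert samples.ndim == 4 and samples.shape[-1] == 3, "expected NHWC RGB"
    np.savez(npz_path, samples)  # stored as 'arr_0'
    return npz_path
```

Since FID is computed over 50K samples, the packed archive for 256x256 images is roughly 10 GB uncompressed.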

🪪 License

The provided code and pre-trained weights are licensed under the Apache 2.0 license.

🤗 Acknowledgement

This code is based on DiT, fast-DiT and U-DiT. We thank the authors for their awesome work.

📧 Contact

If you have any questions, please feel free to reach out to me at shallowdream555@gmail.com.

📖 Citation

If you find our work useful for your research, please consider citing our paper:

```bibtex
@inproceedings{ai2025dico,
    title={DiCo: Revitalizing ConvNets for Scalable and Efficient Diffusion Modeling},
    author={Yuang Ai and Qihang Fan and Xuefeng Hu and Zhenheng Yang and Ran He and Huaibo Huang},
    booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
    year={2025},
    url={https://openreview.net/forum?id=UnslcaZSnb}
}
```
