Official repository for Enhancing LLM Robustness to Perturbed Instructions: An Empirical Study.
Our AdvMix dataset is available here.
To set up the repository:
git clone https://github.com/ary4n99/llm-robustness.git
cd llm-robustness
pip install -r requirements.txt
cp example.yaml config.yaml
cp .env.example .env
To run attack pipelines:
python run_pipelines.py --config ./path/to/config --log-level INFO --seed 0
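The --seed flag fixes the random seed so that pipeline runs are reproducible. As a minimal sketch of what seeding buys you (set_seed here is a hypothetical helper for illustration, not the repo's actual implementation, which may also seed NumPy or PyTorch):

```python
import random

def set_seed(seed: int) -> None:
    # Hypothetical helper: fix the stdlib RNG so repeated runs
    # draw identical random values.
    random.seed(seed)

set_seed(0)
first_run = [random.random() for _ in range(3)]

set_seed(0)
second_run = [random.random() for _ in range(3)]

# Reseeding with the same value reproduces the exact same draws.
assert first_run == second_run
```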
To run semantic integrity analysis:
python semantic_integrity.py
@inproceedings{agrawal2025enhancing,
title={Enhancing {LLM} Robustness to Perturbed Instructions: An Empirical Study},
author={Aryan Agrawal and Lisa Alazraki and Shahin Honarvar and Thomas Mensink and Marek Rei},
booktitle={ICLR 2025 Workshop on Building Trust in Language Models and Applications},
year={2025},
url={https://openreview.net/forum?id=abllmCsDp8}
}