Enhancing LLM Robustness to Perturbed Instructions

Official repository for Enhancing LLM Robustness to Perturbed Instructions: An Empirical Study.

Our AdvMix dataset is available here.

Instructions

Setting up the environment

git clone https://github.com/ary4n99/llm-robustness.git
cd llm-robustness
pip install -r requirements.txt

cp example.yaml config.yaml
cp .env.example .env
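The two copied files hold the run configuration and credentials. Their actual contents are not shown in this README; the sketch below is purely illustrative (every key is a hypothetical assumption), so check `example.yaml` and `.env.example` for the real schema:

```yaml
# Illustrative only -- the real keys live in example.yaml / .env.example.
model: gpt-4o-mini     # hypothetical: model under evaluation
dataset: advmix        # hypothetical: evaluation dataset
seed: 0                # hypothetical: default random seed
```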

Running the code

To run attack pipelines:

python run_pipelines.py --config ./path/to/config --log-level INFO --seed 0

To run semantic integrity analysis:

python semantic_integrity.py
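As a rough illustration of what a semantic-integrity check involves, the sketch below compares an original instruction against a perturbed variant with a simple string-similarity ratio. This is a minimal stand-in, not the repository's actual method; `semantic_integrity.py` may use a different (e.g. embedding-based) measure, and the instruction strings are invented for the example.

```python
# Minimal sketch: measure how much a perturbation changed an instruction.
# (Hypothetical; the repository's semantic_integrity.py may differ.)
from difflib import SequenceMatcher

def surface_similarity(original: str, perturbed: str) -> float:
    """Return a similarity ratio in [0, 1] between two instruction strings."""
    return SequenceMatcher(None, original, perturbed).ratio()

original = "Translate the following sentence into French."
perturbed = "Translaet the folowing sentence into French."  # typo perturbation
score = surface_similarity(original, perturbed)
print(f"similarity: {score:.2f}")
```

A high ratio suggests the perturbed instruction is still close to the original at the surface level; a stricter check would also verify that the task's meaning is preserved.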

Citation

@inproceedings{agrawal2025enhancing,
  title={Enhancing {LLM} Robustness to Perturbed Instructions: An Empirical Study},
  author={Aryan Agrawal and Lisa Alazraki and Shahin Honarvar and Thomas Mensink and Marek Rei},
  booktitle={ICLR 2025 Workshop on Building Trust in Language Models and Applications},
  year={2025},
  url={https://openreview.net/forum?id=abllmCsDp8}
}
