Prompt injection classifier that detects potential prompt injection attacks in LLM inputs. Includes a simple API and web demo.

trstbydsgn/injector

Prompt Injection Classifier

A hybrid ML + rule-based system for detecting prompt injection attacks in LLM inputs.

Features

  • 🛡️ Hybrid detection (ML + rule-based)
  • 🎯 10+ attack pattern categories
  • 🚀 REST API with Python client
  • 💻 Interactive web demo
  • 📊 Detailed analysis and reporting

Quick Start

API Server

```bash
pip install -r requirements.txt
python api/server.py
```

Web Demo

```bash
cd web
npm install
npm start
```

API Usage

```python
import requests

response = requests.post(
    'http://localhost:5000/v1/classify',
    json={'input': 'Your text here'}
)
print(response.json())
```
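For repeated calls, the request above can be wrapped in a small helper. This is a minimal sketch, not the repository's Python client: `API_URL` and `classify` are illustrative names, and the only assumption is the endpoint shown in the example.

```python
import requests

# Default server address from the Quick Start; adjust if the API runs elsewhere.
API_URL = 'http://localhost:5000/v1/classify'

def classify(text, timeout=5):
    """POST one input to the classifier and return the parsed JSON result.

    Hypothetical convenience wrapper around the request shown above; the
    response schema is whatever the server returns.
    """
    response = requests.post(API_URL, json={'input': text}, timeout=timeout)
    response.raise_for_status()  # surface HTTP errors instead of parsing an error body
    return response.json()
```

Adding a `timeout` and calling `raise_for_status()` keeps a hung or failing classifier from silently stalling or corrupting the calling pipeline.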

Detection Categories

  • Role Manipulation
  • System Override
  • Instruction Injection
  • Context Switching
  • Jailbreak Keywords
  • Privilege Escalation
  • Output Manipulation
  • Prompt Leaking
  • Delimiter Manipulation
  • Encoded Instructions

License

MIT
