ML-Powered Profanity Detection for the Modern Web
This monorepo maintains the following packages:
| Package | Description |
|---|---|
| glin-profanity | Core profanity filter for JavaScript/TypeScript |
| glin-profanity | Core profanity filter for Python |
| glin-profanity-mcp | MCP server for AI assistants (Claude, Cursor, etc.) |
| openclaw-profanity | Plugin for OpenClaw/Moltbot AI agents |
Most profanity filters are trivially bypassed. Users type f*ck, sh1t, or fսck (with an Armenian 'ս' standing in for 'u') and walk right through. Glin Profanity doesn't just check against a word list: it normalizes look-alike Unicode characters and decodes leetspeak before matching.
┌─────────────────────────────────────────────────────────────────────────────┐
│                              GLIN PROFANITY v3                              │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  Input Text ──► Unicode       ──► Leetspeak   ──► Dictionary ──► ML         │
│                 Normalization     Detection       Matching       Check      │
│                 (homoglyphs)      (f4ck→fuck)     (24 langs)     (opt)      │
│                                                                             │
│  "fսck"     ──► "fuck"        ──► "fuck"      ──► MATCH      ──► ✓          │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
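
The first three stages of the diagram above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: the `HOMOGLYPHS` and `LEET` maps, the one-word `DICTIONARY`, and the function names are all hypothetical stand-ins, and the optional ML stage is left out.

```javascript
// Illustrative sketch only: these tiny tables are hypothetical stand-ins,
// not the data shipped with glin-profanity.

// Look-alike characters mapped to the ASCII letters they imitate.
const HOMOGLYPHS = new Map([
  ['\u057D', 'u'], // Armenian 'ս'
  ['\u0456', 'i'], // Cyrillic 'і'
  ['\u0430', 'a'], // Cyrillic 'а'
  ['\u0192', 'f'], // Latin 'ƒ'
]);

// Common leetspeak substitutions.
const LEET = new Map([
  ['4', 'a'], ['@', 'a'], ['3', 'e'], ['1', 'i'],
  ['0', 'o'], ['5', 's'], ['$', 's'],
]);

// Placeholder dictionary with a single mild entry.
const DICTIONARY = new Set(['damn']);

// Stage 1: apply NFKC normalization, then fold known homoglyphs to ASCII.
function normalizeUnicode(text) {
  return Array.from(text.normalize('NFKC'), (ch) => HOMOGLYPHS.get(ch) ?? ch).join('');
}

// Stage 2: replace leetspeak digits and symbols with letters.
// A naive character swap like this also rewrites ordinary numbers.
function decodeLeetspeak(text) {
  return Array.from(text, (ch) => LEET.get(ch) ?? ch).join('');
}

// Split on anything that is not a letter and drop empty tokens.
function tokenize(text) {
  return text.toLowerCase().split(/[^\p{L}]+/u).filter(Boolean);
}

// Stage 3: exact dictionary lookup on the cleaned tokens.
function check(text) {
  const cleaned = decodeLeetspeak(normalizeUnicode(text));
  const profaneWords = tokenize(cleaned).filter((word) => DICTIONARY.has(word));
  return { containsProfanity: profaneWords.length > 0, profaneWords };
}

console.log(check('d4mn it')); // { containsProfanity: true, profaneWords: [ 'damn' ] }
```

Running normalization before the dictionary lookup is what lets a single dictionary entry cover many obfuscated spellings.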
Tested on Node.js 20, M1 MacBook Pro, single-threaded:
| Operation | Glin Profanity | bad-words | leo-profanity | obscenity |
|---|---|---|---|---|
| Simple check | 21M ops/sec | 890K ops/sec | 1.2M ops/sec | 650K ops/sec |
| With leetspeak | 8.5M ops/sec | N/A | N/A | N/A |
| Multi-language (3) | 18M ops/sec | N/A | 400K ops/sec | N/A |
| Unicode normalization | 15M ops/sec | N/A | N/A | N/A |
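
For readers who want to reproduce ops/sec figures on their own hardware, a timing loop along these lines is one way to do it. This is a rough sketch under stated assumptions, not the project's benchmark suite: `opsPerSecond` and `simpleCheck` are hypothetical names, and the workload is a stand-in for a real filter call.

```javascript
// Hypothetical micro-benchmark helper, not the project's benchmark harness.
function opsPerSecond(fn, durationMs = 200) {
  // Warm up so the JIT has a chance to optimize fn before timing starts.
  for (let i = 0; i < 1000; i++) fn();

  let ops = 0;
  let elapsed = 0;
  const start = performance.now();
  while (elapsed < durationMs) {
    // Run in batches so the timer call does not dominate the loop.
    for (let i = 0; i < 1000; i++) fn();
    ops += 1000;
    elapsed = performance.now() - start;
  }
  return (ops / elapsed) * 1000;
}

// Stand-in workload: a plain Set lookup per word.
const words = new Set(['damn']);
const simpleCheck = () => 'this is fine'.split(' ').some((w) => words.has(w));

console.log(`${Math.round(opsPerSecond(simpleCheck)).toLocaleString()} ops/sec`);
```

Results from a loop like this vary with hardware, Node.js version, and input text, so compare libraries on the same machine and the same inputs.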
| Feature | Glin Profanity | bad-words | leo-profanity | obscenity |
|---|---|---|---|---|
| Leetspeak detection (f4ck, sh1t) | Yes | No | No | Partial |
| Unicode homoglyph detection | Yes | No | No | No |
| ML toxicity detection | Yes (TensorFlow.js) | No | No | No |
| Multi-language support | 24 languages | English only | 14 languages | English only |
| Result caching (LRU) | Yes | No | No | No |
| Severity levels | Yes | No | No | No |
| React hook | Yes | No | No | No |
| Python package | Yes | No | No | No |
| TypeScript types | Full | Partial | Partial | Full |
| Bundle size (minified) | 12KB + dictionaries | 8KB | 15KB | 6KB |
| Active maintenance | Yes | Limited | Limited | Limited |
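
The table lists LRU result caching without showing what it means. As a rough sketch (the `LruCache` class and `cached` wrapper are hypothetical names, not the library's API), a least-recently-used cache can sit in front of any check function so repeated messages skip the full pipeline:

```javascript
// Illustrative sketch: a small LRU cache built on Map's insertion ordering.
class LruCache {
  constructor(maxSize = 1000) {
    this.maxSize = maxSize;
    this.map = new Map(); // iteration order is insertion order
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    // Re-insert so this key becomes the most recently used.
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxSize) {
      // The first key in iteration order is the least recently used.
      const oldest = this.map.keys().next().value;
      this.map.delete(oldest);
    }
  }
}

// Wrap a check function so identical inputs are answered from the cache.
function cached(checkFn, cache = new LruCache()) {
  return (text) => {
    const hit = cache.get(text);
    if (hit !== undefined) return hit;
    const result = checkFn(text);
    cache.set(text, result);
    return result;
  };
}
```

Caching pays off most in chat-style workloads, where short messages such as greetings repeat often.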
JavaScript/TypeScript
npm install glin-profanity
Python
pip install glin-profanity
JavaScript
import { checkProfanity, Filter } from 'glin-profanity';
// Simple check
const result = checkProfanity("This is f4ck1ng bad", {
  detectLeetspeak: true,
  languages: ['english']
});
result.containsProfanity // true
result.profaneWords // ['fucking']
// With replacement
const filter = new Filter({
  replaceWith: '***',
  detectLeetspeak: true
});
filter.checkProfanity("sh1t happens").processedText // "*** happens"
Python
from glin_profanity import Filter
filter = Filter({"languages": ["english"], "replace_with": "***"})
filter.is_profane("damn this") # True
filter.check_profanity("damn this") # Full result object
React
import { useProfanityChecker } from 'glin-profanity';
function ChatInput() {
  const { result, checkText } = useProfanityChecker({
    detectLeetspeak: true
  });
  // A fragment wraps the two sibling elements so the JSX is valid
  return (
    <>
      <input onChange={(e) => checkText(e.target.value)} />
      {result?.containsProfanity && <span>Clean up your language</span>}
    </>
  );
}
flowchart LR
subgraph Input
A[Raw Text]
end
subgraph Processing
B[Unicode Normalizer]
C[Leetspeak Decoder]
D[Word Tokenizer]
end
subgraph Detection
E[Dictionary Matcher]
F[Fuzzy Matcher]
G[ML Toxicity Model]
end
subgraph Output
H[Result Object]
end
A --> B --> C --> D
D --> E --> H
D --> F --> H
D -.->|Optional| G -.-> H
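
The flowchart includes a Fuzzy Matcher stage but does not say which algorithm it uses. One common choice, shown here purely as an assumption, is Levenshtein edit distance with a small threshold. `editDistance` and `fuzzyMatch` are hypothetical names for illustration.

```javascript
// Illustrative sketch: fuzzy matching via Levenshtein distance.
// This is an assumed technique, not confirmed to be what the library uses.

// Classic dynamic-programming edit distance, keeping only two rows.
function editDistance(a, b) {
  // prev[j] is the distance between the processed prefix of a and b.slice(0, j).
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution or match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Return the first dictionary word within maxDistance edits, or null.
function fuzzyMatch(token, dictionary, maxDistance = 1) {
  for (const word of dictionary) {
    // Lengths differing by more than maxDistance can never match.
    if (Math.abs(word.length - token.length) > maxDistance) continue;
    if (editDistance(token, word) <= maxDistance) return word;
  }
  return null;
}

console.log(fuzzyMatch('dammn', ['damn'])); // 'damn'
```

A threshold of 1 already catches doubled or dropped letters, but it also matches innocent neighbors such as "darn", which is why fuzzy matching is usually paired with a whitelist.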
const filter = new Filter({
  detectLeetspeak: true,
  leetspeakLevel: 'aggressive' // basic | moderate | aggressive
});
filter.isProfane('f4ck'); // true
filter.isProfane('5h1t'); // true
filter.isProfane('@$$'); // true
filter.isProfane('ph.u" "ck'); // true (aggressive mode)

const filter = new Filter({ normalizeUnicode: true });
filter.isProfane('fսck'); // true (Armenian 'ս' → 'u')
filter.isProfane('shіt'); // true (Cyrillic 'і' → 'i')
filter.isProfane('ƒuck'); // true (Latin 'ƒ' → 'f')

import { loadToxicityModel, checkToxicity } from 'glin-profanity/ml';
await loadToxicityModel({ threshold: 0.9 });
const result = await checkToxicity("You're the worst player ever");
// { toxic: true, categories: { toxicity: 0.92, insult: 0.87, ... } }

24 languages with curated dictionaries:

| | | | |
|---|---|---|---|
| Arabic | Chinese | Czech | Danish |
| Dutch | English | Esperanto | Finnish |
| French | German | Hindi | Hungarian |
| Italian | Japanese | Korean | Norwegian |
| Persian | Polish | Portuguese | Russian |
| Spanish | Swedish | Thai | Turkish |
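
To show how a `languages` option might map onto per-language dictionaries, here is a minimal sketch. The `DICTIONARIES` map, its placeholder entries, and `buildWordSet` are hypothetical; the real package ships its own curated lists and loading logic.

```javascript
// Illustrative sketch: merge the dictionaries for the selected languages.
// Entries are mild placeholders, not the library's curated word lists.
const DICTIONARIES = new Map([
  ['english', ['damn']],
  ['spanish', ['mierda']],
  ['german', ['mist']],
]);

function buildWordSet(languages) {
  const words = new Set();
  for (const language of languages) {
    const list = DICTIONARIES.get(language.toLowerCase());
    if (!list) throw new Error(`Unsupported language: ${language}`);
    for (const word of list) words.add(word);
  }
  return words;
}

const selected = buildWordSet(['english', 'spanish']);
console.log(selected.has('mierda')); // true
```

Loading only the selected languages keeps the lookup set small and avoids false positives from words that are harmless in the user's language but profane in another.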
| Document | Description |
|---|---|
| Getting Started | Installation and basic usage |
| API Reference | Complete API documentation |
| Framework Examples | React, Vue, Angular, Express, Next.js |
| Advanced Features | Leetspeak, Unicode, ML, caching |
| ML Guide | TensorFlow.js integration |
| Changelog | Version history |
Run the interactive playground locally to test profanity detection:
# Clone the repo
git clone https://github.com/GLINCKER/glin-profanity.git
cd glin-profanity/packages/js
# Install dependencies
npm install
# Start the local testing server
npm run dev:playground
Open http://localhost:4000 to access the testing interface with:
- Real-time profanity detection
- Toggle leetspeak, Unicode normalization, ML detection
- Multi-language selection
- Visual results with severity indicators
| Application | How Glin Profanity Helps |
|---|---|
| Chat platforms | Real-time message filtering with React hook |
| Gaming | Detect obfuscated profanity in player names/chat |
| Social media | Scale moderation with ML-powered detection |
| Education | Maintain safe learning environments |
| Enterprise | Filter internal communications |
| AI/ML pipelines | Clean training data before model ingestion |
Glin Profanity includes an MCP (Model Context Protocol) server that enables AI assistants like Claude Desktop, Cursor, Windsurf, and other MCP-compatible tools to use profanity detection as a native tool.
Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json):
{
  "mcpServers": {
    "glin-profanity": {
      "command": "npx",
      "args": ["-y", "glin-profanity-mcp"]
    }
  }
}
Cursor (.cursor/mcp.json):
{
  "mcpServers": {
    "glin-profanity": {
      "command": "npx",
      "args": ["-y", "glin-profanity-mcp"]
    }
  }
}
| Tool | Description |
|---|---|
| check_profanity | Check text for profanity with detailed results |
| censor_text | Censor profanity with configurable replacement |
| analyze_context | Context-aware analysis with domain whitelists |
| batch_check | Check multiple texts in one operation |
| validate_content | Content validation with safety scoring (0-100) |
| detect_obfuscation | Detect leetspeak and Unicode tricks |
| get_supported_languages | List all 24 supported languages |
| explain_match | Explain why text was flagged with reasoning |
| suggest_alternatives | Suggest clean alternatives for profane content |
| analyze_corpus | Analyze up to 500 texts for moderation stats |
| compare_strictness | Compare results across strictness levels |
| create_regex_pattern | Generate regex patterns for custom detection |
| track_user_message | Track user messages for repeat offender detection |
| get_user_profile | Get moderation profile for a specific user |
| get_high_risk_users | List users with high violation rates |
| reset_user_profile | Reset a user's moderation history |
| stream_check | Real-time streaming profanity check |
| stream_batch | Stream multiple texts with live results |
| get_stream_stats | Get streaming session statistics |
Plus 4 workflow prompts and 5 reference resources for guided AI interactions.
"Check this user comment for profanity using glin-profanity"
"Validate this blog post content with high strictness"
"Batch check these 50 messages for any inappropriate content"
"Analyze this medical text with the medical domain context"
See the full MCP documentation for setup instructions and examples.
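
Under the hood, an MCP client turns prompts like the ones above into JSON-RPC 2.0 `tools/call` requests. The sketch below shows the general shape of such a request. The `buildToolCall` helper is hypothetical, and the `text` argument name is an assumption; check the MCP documentation for the actual input schema of each tool.

```javascript
// Illustrative sketch: the JSON-RPC envelope an MCP client sends to call a tool.
// The argument name `text` is assumed, not taken from the server's schema.
function buildToolCall(id, toolName, args) {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: toolName, arguments: args },
  };
}

const request = buildToolCall(1, 'check_profanity', { text: 'some user comment' });
console.log(JSON.stringify(request, null, 2));
```

In normal use you never write this by hand: Claude Desktop, Cursor, and other MCP clients build the request after the model decides to use a tool.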
See our ROADMAP.md for planned features including:
- Streaming support for real-time chat
- OpenAI function calling integration
- Image OCR for profanity in images
- Edge deployment (Cloudflare Workers, Vercel Edge)
- More framework integrations
MIT License - free for personal and commercial use.
Enterprise licensing with SLA and support available from GLINCKER.
See CONTRIBUTING.md for guidelines. We welcome:
- Bug reports and fixes
- New language dictionaries
- Performance improvements
- Documentation updates