Extension Challenges

You've built the base project. Now make it yours by extending it with new features.

These challenges are ordered by difficulty. Start with the easier ones to build confidence, then tackle the harder ones when you want to dive deeper.

Easy Challenges

Challenge 1: Add Shift-by-Words Support

What to build: Instead of shifting every letter by the same amount, shift the first letter by 1, second letter by 2, third by 3, etc. This is called an autokey variant.

Why it's useful: This breaks frequency analysis. The same plaintext letter encrypts to different ciphertext letters depending on position. Much stronger than basic Caesar.

What you'll learn:

How position-dependent encryption works
Why polyalphabetic ciphers are harder to crack
Modifying the core cipher loop

Hints:

Look at cipher.py:43-46 - You'll need to pass both char and its position to _shift_char()
Use enumerate() to get character positions
The key becomes (self.key + position) % ALPHABET_SIZE

Test it works:

caesar-cipher encrypt "AAA" --key 1
# Should give "ABC" not "BBB"

Challenge 2: Support Custom Alphabets

What to build: Let users specify their own alphabet, like "QWERTYUIOPASDFGHJKLZXCVBNM" (keyboard order) instead of "ABC...Z".

Why it's useful: Custom alphabets are used in actual historical ciphers (like cipher disk devices). This teaches you about cipher flexibility.

What you'll learn:

How to validate custom input
Why alphabet order doesn't affect Caesar's weakness
Interface design for optional parameters

Hints:

The CaesarCipher.__init__() already accepts an alphabet parameter in cipher.py:16
Add a --alphabet option in main.py
Validate that alphabet has 26 unique characters

Test it works:

caesar-cipher encrypt "HELLO" --key 1 --alphabet "ZYXWVUTSRQPONMLKJIHGFEDCBA"
# Should shift using reverse alphabet

Challenge 3: Add Statistics Display

What to build: A stats command that shows letter frequency distribution for any text, with a visual bar chart in the terminal.

Why it's useful: Visualizing frequency makes it obvious why frequency analysis works. You'll see the E and T spikes immediately.

What you'll learn:

Data visualization in the terminal
Using Rich's progress bars or custom formatting
Presenting statistical information clearly

Hints:

Use collections.Counter like analyzer.py:30
Rich can make bar charts with █ characters repeated N times
Sort letters by frequency to make patterns obvious

Test it works:

caesar-cipher stats "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG"
# Should show E and O as most frequent

Intermediate Challenges

Challenge 4: Implement ROT13 Encoding Mode

What to build: A special rot13 command that's optimized for the common shift=13 case. Make it bidirectional (ROT13 of ROT13 is the original).

Real world application: ROT13 is actually used on Reddit, forums, and in email for spoilers. It's not security, just obfuscation.

What you'll learn:

Why shift=13 is special (it's its own inverse in a 26-letter alphabet)
Command aliasing in CLI tools
Specialized implementations vs general ones

Implementation approach:

Add command to main.py
- Files to create: None (add to existing file)
- Files to modify: main.py (add @app.command() for rot13)
Optimize for the special case
- Since key is always 13, you can hardcode it
- No need for separate encrypt/decrypt (same operation)
Test edge cases:
- What if text has numbers or punctuation?
- Does it handle Unicode properly?

Hints:

ROT13 is just CaesarCipher(key=13)
The command can be simpler than encrypt/decrypt since no key argument
Make it work on stdin for piping: echo "SECRET" | caesar-cipher rot13

Extra credit: Support ROT47 which shifts all printable ASCII characters, not just letters.

Challenge 5: Add Dictionary-Based Ranking

What to build: Instead of just frequency analysis, check if the decrypted text contains valid English words from a word list. Rank candidates by number of recognized words.

Why it's useful: Frequency analysis can be fooled by short text or technical jargon. Dictionary checking is more robust for short messages.

What you'll learn:

Working with word lists (use /usr/share/dict/words on Unix or download one)
Combining multiple scoring methods
Performance optimization (loading and searching word lists)

Implementation approach:

Load word list into a set for O(1) lookup
- Create dictionary.py with a load_words() function
- Store as a set for fast word in dictionary checks
Score by word matches
- Split decrypted text into words
- Count how many are in the dictionary
- Normalize by total words to get a percentage
Combine with frequency scoring
- Weight dictionary score and frequency score
- Tune weights: maybe 70% dictionary, 30% frequency

Hints:

Download word list: curl https://github.com/dwyl/english-words/raw/master/words_alpha.txt > words.txt
Case normalize: word.lower() in dictionary
Handle punctuation: text.replace('!', '').replace('?', '').split()

Challenge 6: Crack Multi-Language Ciphers

What to build: Support cracking ciphers in languages other than English by loading different frequency tables. Add Spanish, French, German support.

Real world application: Frequency analysis is universal. Every language has characteristic letter patterns. Spanish uses Ñ, German uses Ä/Ö/Ü, French has accents.

What you'll learn:

How frequency distributions differ across languages
Unicode handling in Python
Designing multi-language support

Implementation:

Add frequency tables to constants.py
- Research Spanish letter frequencies online
- Store in dicts like SPANISH_LETTER_FREQUENCIES
Update analyzer to accept language parameter

   analyzer = FrequencyAnalyzer(language='spanish')

CLI support

   caesar-cipher crack "SADDW FW UYMJYJ" --language spanish

Gotchas:

Extended alphabets (Spanish has 27 letters with Ñ)
Case sensitivity for accented characters
Do you normalize 'É' to 'E' or treat them separately?

Advanced Challenges

Challenge 7: Build a Vigenère Cipher Implementation

What to build: A polyalphabetic cipher that uses a keyword to determine different shifts for different positions. Much harder to break than Caesar.

Why this is hard: Vigenère was called "le chiffre indéchiffrable" (the indecipherable cipher) for centuries. Breaking it requires finding the key length first, then frequency analysis on each position.

What you'll learn:

Polyalphabetic substitution
Kasiski examination for finding key length
Index of coincidence statistical test
Multi-step cryptanalysis

Architecture changes needed:

Add to cipher layer:
┌─────────────────────┐
│  VigenereCipher     │
│  - repeating key    │
│  - position logic   │
└─────────────────────┘

Add to analysis layer:
┌─────────────────────┐
│  VigenereAnalyzer   │
│  - Kasiski method   │
│  - IC calculation   │
└─────────────────────┘

Implementation steps:

Research phase
- Read about Vigenère cipher algorithm
- Understand Kasiski examination
- Look at index of coincidence formula
Design phase
- Should VigenereCipher inherit from CaesarCipher? Probably not, different enough.
- How to handle key wrapping when key is shorter than text?
- Store key as string or list of shifts?
Implementation phase
- Start with encryption: repeat key to match text length
- Add decryption: same but subtract shifts
- Add Kasiski exam: find repeated sequences, calculate GCD of distances
Testing phase
- Unit test encryption with various key lengths
- Test with known ciphertext (Vigenère challenges online)
- Benchmark: how long to crack 100-char message?

Gotchas:

Key "CAT" means shifts of [2, 0, 19] (C=2, A=0, T=19)
Non-letters shouldn't advance key position
Cracking needs at least 100-200 characters of ciphertext

Resources:

Wikipedia: Vigenère cipher - Good overview of algorithm
"Breaking the Vigenère Cipher" paper - Detailed cryptanalysis methods

Challenge 8: Implement Frequency Analysis Visualization

What to build: A web-based tool that shows live frequency charts as you type ciphertext. Use Flask for backend, Chart.js for visualization.

Estimated time: 1-2 days including learning frontend stuff.

Prerequisites: You should have completed the statistics display challenge first. This builds on that concept.

What you'll learn:

Building web UIs for crypto tools
Real-time data visualization
Connecting Python analysis to JavaScript charts

Planning this feature:

Before you code, think through:

How does frontend send text to backend? (POST request with JSON)
How fast can you compute frequencies for 10kb of text? (Should be instant)
Do you need websockets or is HTTP enough? (HTTP fine for this)

High level architecture:

┌──────────────┐         ┌──────────────┐
│   Browser    │◄────────┤   Flask      │
│  Chart.js    │  JSON   │  analyzer.py │
└──────────────┘         └──────────────┘
      ▲                          │
      │                          │
      └──────────┬───────────────┘
            Frequencies

Implementation phases:

Phase 1: Flask Backend (2-3 hours)

Create web.py with Flask app
Add /analyze endpoint that takes text, returns frequencies
Reuse FrequencyAnalyzer from existing code

Phase 2: Frontend (3-4 hours)

HTML page with textarea for ciphertext
JavaScript to send text on keyup
Chart.js bar chart to display frequencies

Phase 3: Add Caesar Decryption (2-3 hours)

Button to crack the ciphertext
Display all 26 candidates with scores
Highlight best match

Phase 4: Polish (2-3 hours)

Add loading indicators
Show character count
Dark mode toggle

Testing strategy:

Manual testing: Type various ciphertexts, verify charts update
Check with very long text (10kb+) to ensure no lag
Test on different browsers

Known challenges:

Chart redrawing performance
- Problem: Recreating chart on every keystroke is slow
- Hint: Update chart data instead of destroying and recreating
Handling empty input
- Problem: Empty text causes division by zero
- Hint: Return empty array when text is empty

Success criteria: Your implementation should:

Update frequency chart in real-time as you type
Show all 26 crack candidates with scores
Handle 10,000 character inputs smoothly
Work on mobile browsers
Display within 100ms of text input

Mix and Match

Combine features for bigger projects:

Project Idea 1: Multi-Cipher Tool

Combine Challenge 7 (Vigenère) + Challenge 6 (multi-language)
Add Playfair cipher
Result: Swiss Army knife for classical cryptography

Project Idea 2: Cryptanalysis Suite

Combine Challenge 5 (dictionary) + Challenge 4 (ROT13) + Challenge 8 (visualization)
Add automated decryption (try all methods)
Result: Tool that takes any ciphertext and figures out what cipher was used

Real World Integration Challenges

Integrate with Online Cipher Challenges

The goal: Make the tool able to fetch ciphertext from CryptoPals or other online CTF challenges and automatically crack them.

What you'll need:

HTTP client (requests library)
Parsing HTML or JSON responses
Handling rate limits

Implementation plan:

Add --url option to crack command
Fetch ciphertext from URL
Run crack and submit answer back

Watch out for:

CAPTCHA on challenge sites
Different encoding (base64, hex)
Throttling after too many requests

Deploy as a Telegram Bot

The goal: Let users encrypt/decrypt messages via Telegram chat.

What you'll learn:

Building chatbots
Stateless conversation handling
API integration

Steps:

Register bot with BotFather on Telegram
Use python-telegram-bot library
Parse commands: /encrypt <text> <key>
Return formatted results

Production checklist:

Handle errors gracefully (show user-friendly messages)
Rate limit per user (prevent spam)
Log usage for debugging
Deploy to server (Heroku, AWS Lambda)

Performance Challenges

Challenge: Handle 1MB Files Instantly

The goal: Make crack command work on huge files without noticeable delay.

Current bottleneck: Frequency analysis runs on full text 26 times. For 1MB, that's 26MB of processing.

Optimization approaches:

Approach 1: Sample Instead of Full Analysis

How: Only analyze first 10,000 characters
Gain: 100x speedup on large files
Tradeoff: Less accurate if beginning isn't representative

Approach 2: Parallel Processing

How: Use multiprocessing to test all 26 shifts simultaneously
Gain: Near-linear speedup with CPU cores
Tradeoff: Overhead for small files makes it slower

Approach 3: Optimize Chi-Squared Calculation

How: Cache expected frequencies, use numpy for math
Gain: 2-3x speedup
Tradeoff: Adds numpy dependency

Benchmark it:

# Generate large file
python -c "import random, string; print(''.join(random.choices(string.ascii_uppercase, k=1000000)))" > large.txt

# Time it
time caesar-cipher crack --input-file large.txt

Target metrics:

1MB file: Under 1 second
10MB file: Under 10 seconds

Challenge: Reduce Memory Usage

The goal: Crack huge files without loading them entirely into memory.

Profile first:

python -m memory_profiler main.py crack --input-file huge.txt

Common optimization areas:

Streaming file read instead of read_text()
Generator expressions instead of lists
Don't store all 26 full decryptions, just the letter frequencies

Security Challenges

Challenge: Add HMAC for Message Authentication

What to implement: Generate an HMAC (keyed hash) alongside the ciphertext so recipients can verify messages weren't tampered with.

Threat model: This protects against:

Message modification (attacker changing ciphertext)
Forgery (attacker creating fake messages)

Implementation:

import hmac
import hashlib

def encrypt_and_mac(plaintext: str, key: int, mac_key: bytes) -> tuple[str, str]:
    cipher = CaesarCipher(key=key)
    ciphertext = cipher.encrypt(plaintext)
    mac = hmac.new(mac_key, ciphertext.encode(), hashlib.sha256).hexdigest()
    return ciphertext, mac

Testing the security:

Try to modify ciphertext without updating MAC
Attempt to forge MAC with wrong key
Verify legitimate messages pass validation

Challenge: Implement Timing Attack Protection

The goal: Make decryption take constant time regardless of whether key is correct. Prevents attackers from using timing to guess keys.

Threat model: Currently, incorrect keys might fail faster than correct ones (early exit on validation). An attacker measuring response times could exploit this.

Remediation:

Always decrypt completely even if you know it's wrong
Add random delay to normalize timing
Use hmac.compare_digest() for MAC comparison (constant time)

Contribution Ideas

Finished a challenge? Share it back:

Fork the repo
Implement your extension in a branch like feature/vigenere-cipher
Document it - Add to learn folder showing how your extension works
Submit a PR with:
- Your implementation
- Tests covering edge cases
- Documentation in learn/
- Example usage in README

Good extensions might get merged into the main project.

Challenge Yourself Further

Build Something New

Use the concepts you learned here to build:

Playfair cipher - Digraph substitution using 5x5 key square
Bifid cipher - Combines substitution and transposition
Four-square cipher - Uses four 5x5 matrices
Hill cipher - Matrix multiplication based polyalphabetic cipher

Study Real Implementations

Compare your implementation to production tools:

CyberChef - How do they structure multi-cipher support?
John the Ripper - How do they optimize brute force?
Cryptool - How do they present educational content?

Read their code, understand their tradeoffs, steal their good ideas.

Write About It

Document your extension:

Blog post: "How I Built a Vigenère Cracker"
Tutorial: "Breaking Classical Ciphers with Python"
Comparison: "Caesar vs Vigenère vs Enigma: Complexity Analysis"

Teaching others is the best way to verify you understand it.

Getting Help

Stuck on a challenge?

Debug systematically
- What did you expect? "Frequency analysis should rank 'HELLO' first"
- What actually happened? "It ranked gibberish higher"
- Smallest test case? "Input: 'KHOOR', Expected shift: 3, Got shift: 7"
Read the existing code
- The analyzer uses chi-squared - how does that work?
- Look at test_analyzer.py - what cases are tested?
Search for similar problems
- Google: "python frequency analysis not working short text"
- StackOverflow: [cryptography] [python] tags
Ask for help
- Post in GitHub discussions
- Include: what you tried, what happened, what you expected
- Show code: "Here's my _shift_char() modification, it doesn't wrap correctly"

Challenge Completion

Track your progress:

Easy Challenge 1: Autokey variant
Easy Challenge 2: Custom alphabets
Easy Challenge 3: Stats display
Intermediate Challenge 4: ROT13 mode
Intermediate Challenge 5: Dictionary ranking
Intermediate Challenge 6: Multi-language
Advanced Challenge 7: Vigenère cipher
Advanced Challenge 8: Web visualization

Completed all of them? You've mastered classical cryptography. Time to learn modern cryptography (AES, RSA, elliptic curves) or contribute back to this project with your extensions.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Extension Challenges

Easy Challenges

Challenge 1: Add Shift-by-Words Support

Challenge 2: Support Custom Alphabets

Challenge 3: Add Statistics Display

Intermediate Challenges

Challenge 4: Implement ROT13 Encoding Mode

Challenge 5: Add Dictionary-Based Ranking

Challenge 6: Crack Multi-Language Ciphers

Advanced Challenges

Challenge 7: Build a Vigenère Cipher Implementation

Challenge 8: Implement Frequency Analysis Visualization

Mix and Match

Real World Integration Challenges

Integrate with Online Cipher Challenges

Deploy as a Telegram Bot

Performance Challenges

Challenge: Handle 1MB Files Instantly

Challenge: Reduce Memory Usage

Security Challenges

Challenge: Add HMAC for Message Authentication

Challenge: Implement Timing Attack Protection

Contribution Ideas

Challenge Yourself Further

Build Something New

Study Real Implementations

Write About It

Getting Help

Challenge Completion

Uh oh!

FilesExpand file tree

04-CHALLENGES.md

Latest commit

History

04-CHALLENGES.md

File metadata and controls

Extension Challenges

Easy Challenges

Challenge 1: Add Shift-by-Words Support

Challenge 2: Support Custom Alphabets

Challenge 3: Add Statistics Display

Intermediate Challenges

Challenge 4: Implement ROT13 Encoding Mode

Challenge 5: Add Dictionary-Based Ranking

Challenge 6: Crack Multi-Language Ciphers

Advanced Challenges

Challenge 7: Build a Vigenère Cipher Implementation

Challenge 8: Implement Frequency Analysis Visualization

Mix and Match

Real World Integration Challenges

Integrate with Online Cipher Challenges

Deploy as a Telegram Bot

Performance Challenges

Challenge: Handle 1MB Files Instantly

Challenge: Reduce Memory Usage

Security Challenges

Challenge: Add HMAC for Message Authentication

Challenge: Implement Timing Attack Protection

Contribution Ideas

Challenge Yourself Further

Build Something New

Study Real Implementations

Write About It

Getting Help

Challenge Completion