Extension challenges organized by difficulty. Each builds on the existing codebase: you'll work with the same pass-based pipeline, the same type system, and the same patterns established in the implementation.
## Challenge 1: Custom YARA Rule Loading (Easy)

What to build: Let users upload their own YARA rules alongside the built-in rules. Accept `.yar` files through a new API endpoint and compile them into the scanner.
Why it's useful: Every SOC team has custom YARA rules for threats specific to their environment. Hardcoded rules can't cover organization-specific indicators.
What you'll learn: YARA rule compilation, file-based configuration, extending the API without changing the engine's core interface.
Hints:
- Look at how `YaraScanner::new()` in `yara.rs` compiles the built-in rules string
- `yara-x` supports compiling multiple rule sources into a single scanner
- Store uploaded rules in a directory and recompile the scanner when rules change
- Add a `GET /api/rules` endpoint to list active rules
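The rules-directory hint can be sketched with the standard library alone. The helper name below is hypothetical; the real code would hand the collected source (or each file individually) to the yara-x compiler and swap the rebuilt scanner in:

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Gather every `.yar` file in `dir` and concatenate them into one
/// rule-source string. Illustrative helper, not the engine's actual API.
fn collect_rule_sources(dir: &Path) -> io::Result<String> {
    let mut paths: Vec<PathBuf> = fs::read_dir(dir)?
        .filter_map(Result::ok)
        .map(|entry| entry.path())
        .filter(|p| p.extension().map_or(false, |ext| ext == "yar"))
        .collect();
    paths.sort(); // deterministic compile order across restarts
    let mut combined = String::new();
    for path in paths {
        combined.push_str(&fs::read_to_string(&path)?);
        combined.push('\n');
    }
    Ok(combined)
}
```

Recompiling on every upload keeps the logic simple; caching the compiled scanner and rebuilding only when the directory changes is the obvious next step.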
How to verify it works:
```bash
# Create a custom rule
cat > /tmp/custom.yar << 'EOF'
rule detect_hello_world {
    strings:
        $hello = "Hello, world"
    condition:
        $hello
}
EOF

# Upload the rule (you'll need to build this endpoint)
curl -X POST -F "rule=@/tmp/custom.yar" http://localhost:3000/api/rules

# Upload a binary containing "Hello, world" and check YARA results
```

## Challenge 2: PDF Report Export (Easy)

What to build: Add a "Download Report" button to the frontend that generates a PDF summary of the analysis. Include the threat score, key findings from each pass, and MITRE technique mappings.
Why it's useful: Analysts need to share findings with non-technical stakeholders. A PDF report is the standard deliverable in incident response.
What you'll learn: PDF generation in the browser or server-side, data summarization, report formatting.
Hints:
- Client-side: use `jspdf` or `@react-pdf/renderer` to generate PDFs in the browser
- Server-side: add a `GET /api/analysis/{slug}/report` endpoint that returns `application/pdf`
- Prioritize the most important findings; a 50-page PDF of every string isn't useful
- Include the ASCII threat score breakdown from the overview tab
How to verify it works:
- Upload a test binary, navigate to the analysis page
- Click "Download Report" and open the resulting PDF
- Verify it includes: file metadata, threat score, top scoring categories, MITRE techniques
## Challenge 3: File Type Detection (Easy)

What to build: When a non-executable file is uploaded (PDF, ZIP, Office document), return a helpful error message that identifies the file type and suggests what to do instead of a generic "unsupported format" error.
Why it's useful: Users will upload all sorts of files. Telling them "this is a PDF, not an executable; try extracting embedded objects first" is more helpful than "parse error."
What you'll learn: Magic byte detection for common file types, user-facing error design.
Hints:
- Common magic bytes: PDF (`%PDF`), ZIP/DOCX/XLSX (`PK\x03\x04`), GZIP (`\x1f\x8b`), Java class (`\xca\xfe\xba\xbe`)
- Add detection in `formats::parse_format()` before the `goblin` parse attempt
- Return a structured error with the detected file type and guidance
- Office documents (DOCX, XLSX) are ZIP archives; mention that macros can be extracted as separate analysis targets
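The magic-byte table translates almost directly into code. A minimal std-only sketch; the enum and function names are illustrative, not the real `formats` module:

```rust
/// Non-executable formats worth naming in the error message.
#[derive(Debug, PartialEq)]
enum KnownFormat {
    Pdf,
    Zip, // also covers DOCX/XLSX/JAR: all ZIP containers
    Gzip,
    JavaClass,
}

/// Check the first bytes of an upload against known magic values.
fn sniff_non_executable(data: &[u8]) -> Option<KnownFormat> {
    if data.starts_with(b"%PDF") {
        return Some(KnownFormat::Pdf);
    }
    if data.starts_with(b"PK\x03\x04") {
        return Some(KnownFormat::Zip);
    }
    if data.starts_with(&[0x1f, 0x8b]) {
        return Some(KnownFormat::Gzip);
    }
    // Caution: 0xCAFEBABE is shared with fat Mach-O headers; real code
    // must disambiguate (fat binaries carry a small big-endian arch
    // count at offset 4, class files carry version bytes there).
    if data.starts_with(&[0xca, 0xfe, 0xba, 0xbe]) {
        return Some(KnownFormat::JavaClass);
    }
    None
}
```

Running this before the `goblin` parse attempt means genuine executables fall through untouched.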
How to verify it works:
```bash
# Upload a PDF
curl -X PUT -F "file=@test.pdf" http://localhost:3000/api/upload

# Should return: { "error": "Detected PDF file. Binary analysis requires an executable (ELF, PE, or Mach-O)." }
```

## Challenge 4: ARM/AArch64 Disassembly (Intermediate)

What to build: Extend the disassembly pass to support ARM and AArch64 architectures. Currently, non-x86 binaries get an empty disassembly result.
Why it's useful: ARM binaries are everywhere: Android apps (native libraries), IoT firmware, macOS Apple Silicon binaries. Skipping disassembly for ARM misses a huge class of targets.
What you'll learn: ARM instruction set basics, multi-architecture disassembly, extending the pass system.
Implementation approach:
- Add the `capstone` crate (or `bad64` for AArch64) to `axumortem-engine`
- Modify `DisasmPass::run()` to dispatch to an ARM disassembler when the architecture is ARM/AArch64
- The basic block and CFG construction logic should be reusable; ARM has the same concept of branch instructions creating block boundaries
- ARM has unique considerations: Thumb mode (16-bit instructions mixed with 32-bit), conditional execution on every instruction (ARM32), and the link register instead of stack-based return addresses
Edge cases to test:
- Thumb/ARM mode switching within a single function
- AArch64 binaries from Apple Silicon Macs
- Android NDK compiled shared libraries (`.so` files)
Hints:
- The `iced-x86` crate is x86-only by design. You need a separate decoder for ARM.
- ARM branch instructions: `B`, `BL`, `BX`, `BLX`, `CBZ`, `CBNZ`, `TBB`, `TBH`
- AArch64 branch instructions: `B`, `BL`, `BR`, `BLR`, `RET`, `CBZ`, `CBNZ`, `TBZ`, `TBNZ`
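Reusing the block-boundary logic comes down to a per-ISA terminator check against the lists above. A sketch with illustrative names; real decoders also emit conditional and suffixed variants (`BEQ`, `B.NE`, `BLX.W`) that must be normalized to the base mnemonic first:

```rust
/// Architectures the extended DisasmPass would dispatch on
/// (illustrative, not the engine's actual enum).
#[derive(Clone, Copy)]
enum Arch {
    Arm32,
    Aarch64,
}

/// True if a (normalized, condition-stripped) mnemonic ends a basic block.
fn ends_basic_block(arch: Arch, mnemonic: &str) -> bool {
    let m = mnemonic.to_ascii_uppercase();
    let terminators: &[&str] = match arch {
        Arch::Arm32 => &["B", "BL", "BX", "BLX", "CBZ", "CBNZ", "TBB", "TBH"],
        Arch::Aarch64 => &["B", "BL", "BR", "BLR", "RET", "CBZ", "CBNZ", "TBZ", "TBNZ"],
    };
    terminators.contains(&m.as_str())
}
```

Note the asymmetry: AArch64 has an explicit `RET`, while ARM32 returns via `BX LR` or by writing to `PC`, which is why the link register is called out as a unique consideration above.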
## Challenge 5: Binary Comparison View (Intermediate)

What to build: Allow uploading two binaries and comparing their analysis results side-by-side. Highlight differences in sections, imports, strings, and threat scores.
Why it's useful: Diffing is fundamental to malware analysis. Comparing a known-good binary to a suspected-trojanized version reveals exactly what changed. This is how the CCleaner and 3CX supply chain attacks were analyzed: by diffing the legitimate binary against the compromised one.
What you'll learn: Diff algorithms, UI design for comparison views, efficient database queries for multi-binary analysis.
Implementation approach:
- Add a `POST /api/compare` endpoint that accepts two slugs and returns a diff
- Section-level diffing: compare section names, sizes, permissions, and SHA-256 hashes
- Import diffing: imports added, removed, or changed between versions
- String diffing: new suspicious strings in the modified binary
- Entropy diffing: sections that changed entropy classification
- Frontend: split-pane view with color-coded differences
Edge cases to test:
- Comparing binaries of different formats (ELF vs PE)
- Comparing binaries with different numbers of sections
- Comparing a stripped binary to its non-stripped counterpart
Hints:
- Section SHA-256 hashes make section-level comparison trivial; if the hash matches, the content is identical
- Start with imports and strings; these are the highest-signal diffs for supply chain analysis
- Consider a three-column layout: left binary, diff indicator, right binary
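Import diffing reduces to two set differences. A sketch with a hypothetical `diff_imports` helper; sorted output keeps the comparison view stable between requests:

```rust
use std::collections::BTreeSet;

/// Imports added in the suspect binary and imports removed from it,
/// relative to the baseline. Illustrative shape for a /api/compare payload.
fn diff_imports<'a>(
    baseline: &[&'a str],
    suspect: &[&'a str],
) -> (Vec<&'a str>, Vec<&'a str>) {
    let base: BTreeSet<&str> = baseline.iter().copied().collect();
    let susp: BTreeSet<&str> = suspect.iter().copied().collect();
    // BTreeSet::difference yields items in sorted order
    let added = susp.difference(&base).copied().collect();
    let removed = base.difference(&susp).copied().collect();
    (added, removed)
}
```

For a trojanized build, the `added` column is usually where the story is: networking or process-injection APIs appearing in a binary whose baseline never touched the network.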
## Challenge 6: Behavioral Pattern Detection (Intermediate)

What to build: Create a new analysis pass that identifies high-level behavioral patterns by combining findings from imports, strings, and YARA matches. Instead of "these APIs are suspicious," report "this binary exhibits ransomware behavior" or "this binary implements a reverse shell."
Why it's useful: Individual indicators (an import, a string, a YARA match) are noisy. Behavioral patterns are high-confidence detections that map directly to threat categories.
What you'll learn: Building a new pass, threat intelligence correlation, pattern matching across pass results.
Implementation approach:
- Create `passes/behavior.rs` implementing `AnalysisPass`
- Declare dependencies on `imports`, `strings`, `entropy`, and `threat`
- Define behavioral patterns as combinations of indicators:
  - Ransomware: encryption APIs + file enumeration + ransom note strings + high entropy sections
  - Reverse shell: socket APIs + exec/spawn + shell command strings
  - Dropper: high entropy + few imports + `URLDownloadToFile` or `wget`/`curl` strings
  - Keylogger: keyboard hook APIs + file write + persistence mechanisms
  - Rootkit: `ptrace` + kernel module strings + hidden file paths
- Add `behavior_result` to `AnalysisContext`
- Register the pass in `AnalysisEngine::new()`
- Add a new tab in the frontend
Hints:
- Start with 3-4 patterns and expand; don't try to cover everything
- Each pattern should require at least 3 independent indicators to fire
- False positives are worse than false negatives for behavioral detection
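The "at least 3 independent indicators" rule might be encoded like this. The `Pattern` struct and indicator tags are illustrative, not the real pass API; earlier passes would emit the tags:

```rust
use std::collections::HashSet;

/// A behavioral pattern: fires only when enough independent indicators hit.
struct Pattern {
    name: &'static str,
    indicators: &'static [&'static str],
    min_hits: usize,
}

/// Count how many of the pattern's indicators appear in the findings.
fn pattern_fires(pattern: &Pattern, findings: &HashSet<&str>) -> bool {
    let hits = pattern
        .indicators
        .iter()
        .filter(|i| findings.contains(**i))
        .count();
    hits >= pattern.min_hits
}
```

Keeping `min_hits` per pattern (rather than a global constant) lets a high-noise pattern like "dropper" demand more evidence than a distinctive one like "rootkit".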
## Challenge 7: Dynamic Import Resolution (Advanced)

What to build: Detect when a binary resolves imports dynamically at runtime (using `GetProcAddress` on Windows or `dlsym` on Linux) by cross-referencing string analysis with import analysis.
Why this is hard: Malware hides its true imports by not listing them in the import table. Instead, it stores API names as strings (sometimes obfuscated) and resolves them at runtime. Detecting this requires correlating string findings with import behavior patterns.
Architecture changes needed:
- Extend `ImportResult` with a `dynamic_imports: Vec<DynamicImportCandidate>` field
- The import pass needs access to string results (add "strings" as a dependency)
- Cross-reference strings that look like Windows API names against the suspicious API database
- Flag the gap: "binary imports `GetProcAddress` but not `VirtualAllocEx`, yet `VirtualAllocEx` appears as a string"
Implementation phases:
- Research (1-2 hours): Study how `GetProcAddress`/`dlsym` are used in malware. Read about import table reconstruction.
- Design (1 hour): Define the `DynamicImportCandidate` struct. Decide how to score dynamic imports vs static imports.
- Implementation (3-4 hours): Modify the import pass to cross-reference strings. Build the detection logic. Update threat scoring.
- Testing (1-2 hours): Write a test binary that uses `GetProcAddress`. Verify detection works.
Gotchas:
- API names in strings might be XOR-encoded or split across multiple strings
- `GetProcAddress` is used legitimately by plugin systems; don't flag it alone
- The dependency change (imports now depends on strings) changes the topological order
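The gap detection can be sketched std-only. Here `known_apis` stands in for the engine's suspicious-API database, and the resolver guard encodes the "don't flag `GetProcAddress` alone" gotcha:

```rust
use std::collections::HashSet;

/// Strings that name known-suspicious APIs but are absent from the import
/// table. Only meaningful when a runtime resolver is actually imported.
fn dynamic_import_candidates<'a>(
    imports: &HashSet<&str>,
    strings: &[&'a str],
    known_apis: &HashSet<&str>,
) -> Vec<&'a str> {
    let has_resolver =
        imports.contains("GetProcAddress") || imports.contains("dlsym");
    if !has_resolver {
        // No resolver means an API name in a string is just a string.
        return Vec::new();
    }
    strings
        .iter()
        .copied()
        .filter(|s| known_apis.contains(s) && !imports.contains(s))
        .collect()
}
```

The inverse gap is equally interesting for scoring: a binary that imports `GetProcAddress` and carries many API-name strings but almost no static imports is the classic shape of a manually-mapped payload.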
## Challenge 8: Retrohunt System (Advanced)

What to build: When new YARA rules are added, automatically re-scan all previously analyzed binaries against the new rules and update their threat scores.
Why this is hard: This touches storage, background processing, and the caching system. You need to invalidate cached results selectively (only the YARA and threat portions) without re-running expensive passes like disassembly.
Architecture changes needed:
- Store raw binary data (or a reference to it) in addition to analysis results
- Build a background job system that processes re-scans
- Implement partial pass re-execution (re-run only ThreatPass with new YARA rules)
- Update the database schema to track which rule version produced each result
- Add a `POST /api/retrohunt` endpoint that triggers re-scanning
Implementation phases:
- Research (2-3 hours): Study how VirusTotal and other platforms handle retrohunting. Understand the scale challenges.
- Design (2 hours): Design the job queue, rule versioning scheme, and partial re-execution strategy.
- Implementation (5-6 hours): Build the retrohunt system. This is a significant addition; take it in stages.
- Testing (2-3 hours): Upload multiple binaries, add a new rule, trigger retrohunt, verify scores update.
Gotchas:
- Binary storage increases disk usage significantly; consider optional storage vs re-upload
- Retrohunt on thousands of binaries needs a progress indicator and cancellation support
- Updating threat scores changes risk levels, which might trigger alerts in downstream systems
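Partial pass re-execution reduces to "invalidate a pass plus its transitive dependents." A sketch with an illustrative dependency map (pass names are stand-ins for the engine's registry):

```rust
use std::collections::{HashMap, HashSet};

/// Minimal re-run set when one pass is invalidated: the pass itself plus
/// everything that transitively depends on it. `deps` maps a pass name to
/// the names of its dependencies.
fn passes_to_rerun<'a>(
    deps: &HashMap<&'a str, Vec<&'a str>>,
    invalidated: &'a str,
) -> HashSet<&'a str> {
    let mut dirty: HashSet<&str> = HashSet::from([invalidated]);
    // Fixed-point iteration; fine for a registry of a dozen passes.
    loop {
        let before = dirty.len();
        for (pass, ds) in deps {
            if ds.iter().any(|d| dirty.contains(d)) {
                dirty.insert(*pass);
            }
        }
        if dirty.len() == before {
            return dirty;
        }
    }
}
```

With a map like `threat -> [yara, disasm]`, invalidating `yara` re-runs only `yara` and `threat`, leaving expensive passes such as disassembly cached, which is exactly the selective-invalidation goal above.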
## Challenge 9: Packer Unpacking (Advanced)

What to build: For binaries detected as UPX-packed, automatically unpack them and re-analyze the unpacked binary. Display both the packed and unpacked analysis results.
Why this is hard: Unpacking modifies the binary. You need to handle the unpacking process safely, store both versions, and present a meaningful comparison. UPX is the easiest target because the upx tool can unpack most UPX binaries, but other packers require custom unpackers.
Architecture changes needed:
- After the entropy pass detects UPX packing, invoke the `upx` command-line tool with `--decompress`
- Run the full analysis pipeline on the unpacked binary
- Store both analysis results linked by a parent-child relationship
- Frontend: show "packed" and "unpacked" tabs with comparison
Implementation phases:
- Research (1-2 hours): Study UPX internals and the `upx --decompress` command. Understand failure modes.
- Design (1 hour): Design the parent-child schema, decide where in the pipeline unpacking happens.
- Implementation (4-5 hours): Add the unpacking step, second analysis pass, database schema changes, and frontend updates.
- Testing (2-3 hours): Pack binaries with UPX, upload them, verify unpacking and re-analysis.
Gotchas:
- Modified UPX (where section names are changed from UPX0/UPX1) breaks `upx --decompress`
- The unpacking process must run in a sandboxed environment (it's executing a tool on untrusted input)
- Not all UPX-packed binaries can be unpacked; handle failures gracefully
- Consider adding `upx` to the Docker image
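A sketch of the unpack step, assuming `upx` is on PATH (e.g. added to the Docker image). The section-name pre-check reflects the first gotcha: stock UPX names its sections UPX0/UPX1, so renamed sections predict that `upx -d` will refuse:

```rust
use std::process::Command;

/// Cheap pre-check from the section table before shelling out.
fn looks_like_stock_upx(section_names: &[&str]) -> bool {
    section_names.contains(&"UPX0") && section_names.contains(&"UPX1")
}

/// Invoke `upx -d` (short for --decompress), writing to a separate output
/// file so the original sample is never modified. Any failure must be
/// non-fatal: the packed analysis result stays valid on its own.
fn try_unpack(input: &str, output: &str) -> Result<(), String> {
    let status = Command::new("upx")
        .args(["-d", "-o", output, input])
        .status()
        .map_err(|e| format!("could not run upx: {e}"))?;
    if status.success() {
        Ok(())
    } else {
        Err(format!("upx refused to unpack (exit: {status})"))
    }
}
```

In production this invocation belongs inside the sandbox boundary mentioned above, since `upx` itself is parsing attacker-controlled input.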
## Challenge 10: Distributed Analysis Cluster (Expert)

What to build: Scale AXUMORTEM horizontally by distributing analysis work across multiple backend instances with a shared job queue.
Estimated time: 15-25 hours
Prerequisites: Understanding of message queues, distributed systems concepts, container orchestration.
High-level architecture:
```
                 +------------------+
                 |   API Gateway    |
                 |    (Nginx/LB)    |
                 +--------+---------+
                          |
          +---------------+---------------+
          |               |               |
 +--------v-------+  +----v-----+  +-----v----+
 |  API Server 1  |  |  API 2   |  |  API 3   |
 |   (web only)   |  |          |  |          |
 +--------+-------+  +----+-----+  +-----+----+
          |               |              |
          +---------------+--------------+
                          |
                +---------v---------+
                |     Job Queue     |
                | (Redis/RabbitMQ)  |
                +---------+---------+
                          |
          +---------------+---------------+
          |               |               |
 +--------v-------+  +----v-----+  +-----v----+
 |    Worker 1    |  | Worker 2 |  | Worker 3 |
 | (engine only)  |  |          |  |          |
 +--------+-------+  +----+-----+  +-----+----+
          |               |              |
          +---------------+--------------+
                          |
                +---------v---------+
                |    PostgreSQL     |
                |     (shared)      |
                +-------------------+
```
Implementation phases:
- Job queue setup (3-4 hours): Add Redis or RabbitMQ. Modify the upload route to enqueue analysis jobs instead of running them inline. Add a status endpoint for pending jobs.
- Worker process (4-5 hours): Create a separate binary (`axumortem-worker`) that pulls jobs from the queue, runs the engine, and writes results to PostgreSQL. The worker reuses the `axumortem-engine` crate directly.
- Result polling (2-3 hours): The API server returns "pending" status for in-progress analyses. Add WebSocket or SSE support for real-time progress updates. The frontend polls or subscribes for completion.
- Scaling and reliability (4-6 hours): Handle worker failures (job timeouts, dead letter queues). Add health monitoring for workers. Implement job priority (small binaries first). Add Kubernetes manifests or Docker Swarm configuration.
Testing strategy:
- Unit tests: job serialization/deserialization, queue operations
- Integration tests: full upload-analyze-retrieve cycle through the queue
- Load tests: submit 100 binaries concurrently, verify all complete
- Chaos tests: kill a worker mid-analysis, verify the job is retried
Known challenges:
- Binary data transfer through the queue (pass file paths vs embed data)
- Database connection pool sizing for multiple workers
- Graceful shutdown: workers should finish the current analysis before stopping
Success criteria:
- Analysis works identically whether run inline or through the queue
- Adding/removing workers doesn't require API server restarts
- Failed analyses are retried with exponential backoff
- Frontend shows real-time progress for queued analyses
- System handles 10x the throughput of a single instance
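The producer/queue/worker shape can be rehearsed in-process before any broker is introduced. This sketch uses an mpsc channel and threads as stand-ins; a real broker gives you competing consumers natively, while mpsc receivers are single-consumer and must be shared behind a mutex:

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Enqueue `job_count` jobs and let `worker_count` workers drain them.
/// Returns how many jobs each worker processed. Swapping the channel for
/// Redis/RabbitMQ preserves this shape.
fn run_cluster(job_count: usize, worker_count: usize) -> Vec<usize> {
    let (tx, rx) = mpsc::channel::<String>();
    let rx = Arc::new(Mutex::new(rx));

    let workers: Vec<_> = (0..worker_count)
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || {
                let mut processed = 0;
                loop {
                    // Take the lock to pull one job (held while waiting;
                    // a real broker avoids this serialization point).
                    let job = rx.lock().unwrap().recv();
                    match job {
                        Ok(_slug) => processed += 1, // engine.analyze() here
                        Err(_) => break,             // queue closed: shut down
                    }
                }
                processed
            })
        })
        .collect();

    for i in 0..job_count {
        tx.send(format!("bin-{i}")).unwrap();
    }
    drop(tx); // no more jobs; lets workers exit cleanly

    workers.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Dropping the sender doubles as the graceful-shutdown signal called out above: each worker finishes its current job, sees the closed queue, and exits.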
## Challenge 11: ML-Based Classification (Expert)

What to build: Train a classifier on the feature vectors extracted by AXUMORTEM (entropy values, import counts, string statistics, section properties) to predict malware family membership.
Estimated time: 20-30 hours
Prerequisites: Basic ML concepts (feature engineering, train/test split, classification metrics). Python familiarity for the training pipeline.
High-level architecture:
```
+---------------------------------------------------+
|            Training Pipeline (Python)             |
|                                                   |
|  MalwareBazaar --> Feature Extraction --> Model   |
|  Dataset           (via engine API)       (ONNX)  |
+-------------------------+-------------------------+
                          |
                  export .onnx file
                          |
+-------------------------v-------------------------+
|                 Inference (Rust)                  |
|                                                   |
|  AnalysisContext --> Feature Vector --> ONNX      |
|                      (f32 array)        Runtime   |
|                                            |      |
|                                            v      |
|                                   Classification  |
|                                   + Confidence    |
+---------------------------------------------------+
```
Implementation phases:
- Dataset collection (3-4 hours): Download labeled malware samples from MalwareBazaar or VirusTotal. Analyze each with AXUMORTEM. Export feature vectors (entropy, import count, suspicious API count, string category counts, section count, anomaly count, etc.).
- Model training (4-6 hours): Build a Python training pipeline. Start with a random forest or gradient boosted tree. Feature engineering is the hard part; decide which analysis outputs become numeric features. Evaluate with precision/recall/F1.
- Model export (2-3 hours): Export the trained model to ONNX format. Add the `ort` (ONNX Runtime) crate to the engine.
- Inference integration (5-8 hours): Create a new `ClassificationPass` that extracts the feature vector from `AnalysisContext`, runs ONNX inference, and produces a classification result (malware family + confidence score). Add a frontend tab showing the classification.
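The inference-side feature extraction might look like this sketch. Field names, ordering, and scaling are illustrative; whatever you choose must match the training pipeline exactly, which is the normalization challenge noted below:

```rust
/// A few numeric features a ClassificationPass might pull out of
/// AnalysisContext (illustrative subset).
struct Features {
    mean_entropy: f32, // 0.0..=8.0 bits per byte
    import_count: u32,
    suspicious_api_count: u32,
    section_count: u32,
}

/// Flatten into the f32 vector the ONNX model consumes. Order matters:
/// index i here must be feature i in the training data.
fn to_feature_vector(f: &Features) -> Vec<f32> {
    vec![
        f.mean_entropy / 8.0,                   // scale entropy into 0..1
        (f.import_count as f32).ln_1p(),        // log-compress heavy-tailed counts
        (f.suspicious_api_count as f32).ln_1p(),
        f.section_count as f32,
    ]
}
```

A cheap safeguard against drift between the two sides is to version the feature schema and embed that version string in both the exported `.onnx` metadata and the Rust extractor, refusing to run inference on a mismatch.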
Testing strategy:
- Feature extraction: verify the same binary always produces the same feature vector
- Model accuracy: hold-out test set with known labels
- Integration: end-to-end upload and classification
Known challenges:
- Feature normalization: training and inference must use the same scaling
- Class imbalance: benign samples far outnumber malware in most datasets
- Model drift: malware evolves, so the model needs periodic retraining
- ONNX Runtime adds ~50MB to the Docker image
Success criteria:
- Model achieves > 85% F1 score on held-out test set
- Classification adds < 100ms to analysis time
- Frontend displays family prediction with confidence percentage
- False positive rate on legitimate system utilities is < 5%
Combine challenges for larger projects:
- Challenges 1 + 8: Custom YARA rules + retrohunting = a mini threat intelligence platform
- Challenges 5 + 9: Binary comparison + UPX unpacking = compare packed vs unpacked views automatically
- Challenges 6 + 11: Behavioral patterns + ML classification = hybrid detection (rules + model)
- Challenges 4 + 7: ARM disassembly + dynamic import detection = full IoT malware analysis
Connect AXUMORTEM to a MISP (Malware Information Sharing Platform) instance. When analysis detects a CRITICAL threat, automatically create a MISP event with indicators of compromise (IOCs): file hashes, C2 URLs, suspicious API chains.
After local analysis, submit the SHA-256 hash to VirusTotal's API and display AV detection ratios alongside AXUMORTEM's own score. Show how AXUMORTEM's scoring compares to the industry consensus.
Create a CLI mode for AXUMORTEM that analyzes binaries produced by a build pipeline and fails the build if the threat score exceeds a threshold. Useful for catching supply chain compromises in compiled artifacts.
Profile the current analysis pipeline. Identify which passes are slowest for large binaries. Optimize the critical path:
- Can entropy calculation use SIMD instructions?
- Can string extraction run in parallel across sections?
- Can the disassembler use multiple threads for independent functions?
Large binaries currently load entirely into memory (`Arc<[u8]>`). Implement streaming analysis where possible:
- Entropy calculation can work with memory-mapped files
- String extraction can process chunks
- Section hashing already works on slices
Profile with heaptrack or dhat to find allocation hotspots.
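The chunked-entropy idea works because a byte histogram is additive: fold each chunk of a memory-mapped file into the counts, then compute Shannon entropy once at the end, without ever materializing the whole binary. A sketch:

```rust
/// Accumulate byte frequencies from one chunk into a shared histogram.
fn fold_chunk(counts: &mut [u64; 256], chunk: &[u8]) {
    for &b in chunk {
        counts[b as usize] += 1;
    }
}

/// Shannon entropy in bits per byte (0.0 for uniform data, 8.0 for
/// perfectly random data) computed from the finished histogram.
fn shannon_entropy(counts: &[u64; 256]) -> f64 {
    let total: u64 = counts.iter().sum();
    if total == 0 {
        return 0.0;
    }
    let total = total as f64;
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / total;
            -p * p.log2()
        })
        .sum()
}
```

Because only the 256-slot histogram persists between chunks, peak memory is independent of binary size, and per-section entropy just means keeping one histogram per section.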
Implement Authenticode signature verification for PE binaries and codesign verification for Mach-O binaries. Display whether the signature is valid, expired, or from a known-compromised certificate (like the stolen Realtek certificate used by Stuxnet).
Create a pass that specifically targets sandbox evasion techniques:
- Timing checks (`GetTickCount` deltas, `rdtsc` loops)
- Environment checks (VM artifacts, debugger presence, process list)
- Resource checks (low RAM, single CPU = likely sandbox)
- User interaction checks (no mouse movement, no recent documents)
This is a standalone pass with its own scoring category.
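For intuition (and for building test fixtures), the resource check an evasive sample performs can be modeled directly; the pass itself would flag binaries whose code reads these values, not run the check. Thresholds here are illustrative, since real families vary:

```rust
/// The environment fingerprint a sandbox-evasive sample computes before
/// detonating. Two or more signals mirrors the multi-indicator rule used
/// elsewhere: any single value occurs on plenty of real machines.
fn resembles_sandbox(cpu_count: usize, ram_mb: u64, uptime_secs: u64) -> bool {
    let mut hits = 0;
    if cpu_count <= 1 {
        hits += 1; // single vCPU is rare on modern real hardware
    }
    if ram_mb < 4096 {
        hits += 1; // analysis VMs are often given little RAM
    }
    if uptime_secs < 600 {
        hits += 1; // snapshot-restored VMs look freshly booted
    }
    hits >= 2
}
```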
Add ssdeep fuzzy hashing to identify binaries that are similar but not identical. Two variants of the same malware family will have different SHA-256 hashes but similar ssdeep hashes. Display a "similar binaries" section that finds fuzzy matches in the database.
If you build something you're proud of:
- Fork the repo, create a feature branch
- Follow the existing code patterns (sealed traits, typed enums, no magic numbers)
- Add tests for your new pass or feature
- Update the learn documentation if you add significant functionality
- Open a PR with before/after screenshots or test output
Track your progress:
- [ ] Easy 1: Custom YARA rule loading
- [ ] Easy 2: PDF report export
- [ ] Easy 3: File type detection
- [ ] Intermediate 4: ARM/AArch64 disassembly
- [ ] Intermediate 5: Binary comparison view
- [ ] Intermediate 6: Behavioral pattern detection
- [ ] Advanced 7: Dynamic import resolution
- [ ] Advanced 8: Retrohunt system
- [ ] Advanced 9: Packer unpacking
- [ ] Expert 10: Distributed analysis cluster
- [ ] Expert 11: ML-based classification