hybrid-jit-new-decoder-jit-stats #229

Open
mad0x60 wants to merge 10 commits into tum-ei-eda:master from mad0x60:map-based-decoder-m1
mad0x60 commented Jan 2, 2026

- Hybrid JIT
- ETISS LUT-based decoder
- JITStats plugin
- macOS compatibility

mad0x60 and others added 10 commits December 4, 2025 18:16

This update adds native macOS support for building and running ETISS (tested on Apple Silicon). It resolves several macOS-specific issues related to RPATH handling, TCC-based JIT compilation, and third-party library compatibility.

Linux-style `$ORIGIN` RPATHs have been replaced with platform-appropriate paths:

- macOS: `@loader_path` for correct dynamic loader resolution
- Linux: `$ORIGIN` (kept for backward compatibility)

Updated files:
- CMakeLists.txt – main, platform-aware RPATH configuration
- src/bare_etiss_processor/CMakeLists.txt – binary RPATH setup for library discovery
- ArchImpl/RV32IMACFD/CMakeLists.txt – plugin RPATH for `libresources.dylib`
- ArchImpl/RV64IMACFD/CMakeLists.txt – plugin RPATH for `libresources.dylib`

A build issue with the TCC JIT on macOS was fixed in `include_c/etiss/jit/types.h`.
Although TCC defines `__GNUC__`, it does not support 128-bit integers, which caused JIT failures.

The preprocessor check guarding `__int128_t` was updated so that the type is properly excluded when compiling with TCC.
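The exact directives are not shown in this message. As an illustration only, a guard of this kind could look like the sketch below. It assumes detection via `__TINYC__`, which TCC predefines; the names `EXAMPLE_HAS_INT128`, `example_int128`, and `exampleHasInt128` are hypothetical and are not taken from `types.h`.

```cpp
#include <cassert>

// Hypothetical feature macro (not from ETISS): enable 128-bit integers only
// when the compiler claims GCC compatibility, is not TCC, and advertises
// __int128 support. TCC predefines __TINYC__ in addition to __GNUC__.
#if defined(__GNUC__) && !defined(__TINYC__) && defined(__SIZEOF_INT128__)
#define EXAMPLE_HAS_INT128 1
#else
#define EXAMPLE_HAS_INT128 0
#endif

#if EXAMPLE_HAS_INT128
// Only seen by compilers that actually provide the type.
typedef __int128_t example_int128;
#endif

// Reports at run time which branch the preprocessor selected.
inline bool exampleHasInt128() { return EXAMPLE_HAS_INT128 != 0; }
```

Checking a compiler-specific macro in addition to `__GNUC__` keeps GCC and Clang behavior unchanged while hiding the type from TCC.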

TCC's `VERSION` file conflicts with the standard library header `<version>`,
which causes build issues on macOS because its file system is case-insensitive.
A workaround was added in etiss/third_party/CMakeLists.txt to address this.

- src/SimpleMemSystem.cpp: Added an explicit type cast for the random generator seed to satisfy Clang’s stricter narrowing checks
- ArchImpl/RV32IMACFD/RV32IMACFDFuncs.h: Removed a conflicting `wait()` declaration
- ArchImpl/RV64IMACFD/RV64IMACFDFuncs.h: Removed a conflicting `wait()` declaration

Tested on macOS 15.3.2+ (Apple Silicon) with:

- Toolchain: riscv-gnu-toolchain via homebrew-riscv
- Compiler: Apple Clang 17.0.0
- Build type: Release with C++17
- JIT engines: TCC, GCC, LLVM (all functional)

The hello_world and dhrystone benchmarks now run successfully without any manual RPATH modifications.

Introduce a hybrid JIT compilation strategy that combines a fast baseline
compiler (TCC) with an optimizing compiler (GCC/LLVM) to achieve both low
startup latency and high steady-state performance.

The implementation consists of two tiers running concurrently:

**Tier 0 - Baseline (Main Thread):**
- Uses TCC (or other fast JIT) for immediate compilation of new basic blocks
- Never blocks on optimizing compilation
- Provides low-latency execution for newly encountered code paths

**Tier 1 - Optimized (Background Workers):**
- A pool of 3 worker threads processes an optimization queue
- Uses GCC/LLVM to recompile blocks with full optimizations
- Updates are applied transparently on subsequent block lookups

Configuration options:
- `jit.fast_type`: Specifies the fast JIT compiler (e.g., "TCCJIT")
- `jit.type`: Now serves as the optimizing JIT (e.g., "LLVMJIT", "GCCJIT")

Block state:
- Added `fastExecBlock`/`fastJitLib` for the baseline-compiled version
- Added `optimizedExecBlock`/`optimizedJitLib` for the optimized version
- Added `hasOptimized` flag to track optimization status

OptimizationManager:
- Manages the thread pool and task queue for background compilation

Execution flow:
1. On cache miss: compile with the fast JIT and queue the block for optimization
2. On cache hit: check for an optimized version and swap it in if available
3. Error handling: fall back to the main JIT if the fast JIT is unavailable
  or fails

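The miss/hit/fallback flow can be sketched as follows. This is a simplified, single-threaded model and not the ETISS code: `HybridCache`, `CachedBlock`, `ExecBlock`, `lookup`, and `optimizeOne` are hypothetical names, the "compiled" blocks are lambdas that return a tier tag, and `optimizeOne()` stands in for one step of a background worker. Only the field names `fastExecBlock`, `optimizedExecBlock`, and `hasOptimized` come from the description above.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <deque>
#include <functional>
#include <unordered_map>

// Tags returned by the fake blocks so a caller can see which tier ran.
constexpr int kFastTier = 0;
constexpr int kOptimizedTier = 1;

using ExecBlock = std::function<int()>;  // hypothetical stand-in for compiled code

struct CachedBlock {
    ExecBlock fastExecBlock;                 // baseline version from the fast JIT
    ExecBlock optimizedExecBlock;            // filled in later by the optimizer
    std::atomic<bool> hasOptimized{false};   // set once the optimized version is ready
};

class HybridCache {
public:
    explicit HybridCache(bool fastJitAvailable = true)
        : fastJitAvailable_(fastJitAvailable) {}

    // Returns the version of the block at `pc` that should run now.
    ExecBlock lookup(std::uint64_t pc) {
        auto it = blocks_.find(pc);
        if (it == blocks_.end()) {
            CachedBlock& block = blocks_[pc];
            if (!fastJitAvailable_) {
                // Fallback: compile synchronously with the main (optimizing) JIT.
                block.optimizedExecBlock = [] { return kOptimizedTier; };
                block.hasOptimized.store(true, std::memory_order_release);
                return block.optimizedExecBlock;
            }
            // Cache miss: compile with the fast JIT, then queue for optimization.
            block.fastExecBlock = [] { return kFastTier; };
            pending_.push_back(pc);
            return block.fastExecBlock;
        }
        // Cache hit: prefer the optimized version once it has been published.
        CachedBlock& block = it->second;
        if (block.hasOptimized.load(std::memory_order_acquire))
            return block.optimizedExecBlock;
        return block.fastExecBlock;
    }

    // Stand-in for one background worker step; returns false if nothing was queued.
    bool optimizeOne() {
        if (pending_.empty()) return false;
        const std::uint64_t pc = pending_.front();
        pending_.pop_front();
        CachedBlock& block = blocks_.at(pc);
        block.optimizedExecBlock = [] { return kOptimizedTier; };
        block.hasOptimized.store(true, std::memory_order_release);
        return true;
    }

private:
    bool fastJitAvailable_;
    std::unordered_map<std::uint64_t, CachedBlock> blocks_;
    std::deque<std::uint64_t> pending_;
};
```

In this model the main thread never waits for the optimizer: it keeps running the baseline version until a lookup observes `hasOptimized`. A real implementation with worker threads would also need to protect the queue and the block map, which the sketch leaves out.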
LLVM JIT is not thread-safe for concurrent translate() calls, which causes
issues when multiple worker threads try to compile blocks in parallel.

This fix creates a separate JIT instance for each worker thread, enabling
safe parallel compilation in the OptimizationManager.

Changes:
- Replace static NUM_THREADS with configurable numThreads_ parameter
- Add threadJits_ vector to store per-thread JIT instances
- Create per-thread JIT instances in OptimizationManager constructor
- Use thread-specific JIT instance in optimizationWorker()
- Add jit.optimization_threads config option (default: 1)

The per-thread JIT instances are created by calling etiss::getJIT() with
the same JIT name, ensuring each thread has its own isolated context for
compilation while maintaining the same optimization settings.
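A minimal sketch of the per-thread-instance pattern, under stated assumptions: `ExampleJit`, `makeExampleJit`, and `ExampleOptimizationPool` are hypothetical stand-ins (the factory plays the role that `etiss::getJIT()` has in the description), and the block IDs are plain integers. Each worker only ever touches the instance at its own index, so the JIT object itself needs no locking.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <memory>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a JIT backend that must not be shared between threads.
struct ExampleJit {
    explicit ExampleJit(std::string n) : name(std::move(n)) {}
    std::string name;
    int translated = 0;  // updated without locking: exactly one owner thread
    void translate(int /*blockId*/) { ++translated; }
};

// Hypothetical factory: every call returns a fresh, independent instance.
inline std::unique_ptr<ExampleJit> makeExampleJit(const std::string& name) {
    return std::make_unique<ExampleJit>(name);
}

class ExampleOptimizationPool {
public:
    ExampleOptimizationPool(const std::string& jitName, std::size_t numThreads) {
        for (std::size_t i = 0; i < numThreads; ++i)
            threadJits_.push_back(makeExampleJit(jitName));  // one JIT per worker
        for (std::size_t i = 0; i < numThreads; ++i)
            workers_.emplace_back([this, i] { workerLoop(i); });
    }

    ~ExampleOptimizationPool() { shutdown(); }

    void submit(int blockId) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(blockId);
        }
        cv_.notify_one();
    }

    // Lets workers drain the queue, joins them, and returns the total number
    // of translate() calls across all per-thread instances.
    int shutdown() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto& worker : workers_)
            if (worker.joinable()) worker.join();
        int total = 0;
        for (const auto& jit : threadJits_) total += jit->translated;
        return total;
    }

private:
    void workerLoop(std::size_t index) {
        ExampleJit& jit = *threadJits_[index];  // thread-specific instance
        for (;;) {
            int blockId = 0;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
                if (queue_.empty()) return;  // stopping and nothing left to do
                blockId = queue_.front();
                queue_.pop();
            }
            jit.translate(blockId);  // outside the lock; no JIT state is shared
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<int> queue_;
    bool stopping_ = false;
    std::vector<std::unique_ptr<ExampleJit>> threadJits_;
    std::vector<std::thread> workers_;
};
```

Only the task queue is shared and locked here; compilation runs outside the lock, which is what allows the workers to proceed in parallel.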

Upstream reference: tum-ei-eda/etiss@fc7f1d3

Add a comprehensive statistics plugin for monitoring JIT compilation
and execution performance:

Statistics collected:
- Compilation counts (fast JIT vs optimizing JIT)
- Background optimization progress (blocks optimized, switched)
- Cache performance (sequential hits, branch hits, misses)
- Execution tracking (total, fast JIT, optimized versions)
- Timing breakdown (compilation time, execution time)
- Performance metrics (CPU time, MIPS estimated/corrected)

Implementation details:
- JITStatsCollector plugin (CoroutinePlugin) for display
- Centralized stats storage in JITStats.cpp
- Conditional compilation via ETISS_TRANSLATOR_STAT flag
- Runtime enable/disable via jit.stats config option
- Thread-safe accumulation for background compilation times
- Integration with CPUCore for MIPS metrics

All statistics code is wrapped in ETISS_TRANSLATOR_STAT to ensure
zero overhead when compiled without the flag.
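One way to combine a compile-time flag, a runtime switch, and thread-safe accumulation is sketched below. This is an assumption-laden illustration, not the contents of JITStats.cpp: the namespace `example_stats` and all names in it are hypothetical, the flag is assumed to be tested with `#if`, and it is defined locally only so the sketch collects data when built on its own.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>

// Normally supplied by the build system; defined here only for the sketch.
#ifndef ETISS_TRANSLATOR_STAT
#define ETISS_TRANSLATOR_STAT 1
#endif

namespace example_stats {

#if ETISS_TRANSLATOR_STAT
struct Counters {
    std::atomic<std::uint64_t> compileTime_ns{0};   // summed across all threads
    std::atomic<std::uint64_t> compiledBlocks{0};
    std::atomic<bool> enabled{true};                // stands in for a runtime option
};

inline Counters& counters() {
    static Counters instance;  // single shared storage location
    return instance;
}
#endif

// Runs `compile`; when statistics are compiled in and enabled, also adds the
// elapsed time and one block to the shared counters.
template <typename Compile>
void timedCompile(Compile&& compile) {
#if ETISS_TRANSLATOR_STAT
    if (counters().enabled.load(std::memory_order_relaxed)) {
        const auto start = std::chrono::steady_clock::now();
        compile();
        const auto elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(
            std::chrono::steady_clock::now() - start);
        counters().compileTime_ns.fetch_add(
            static_cast<std::uint64_t>(elapsed.count()), std::memory_order_relaxed);
        counters().compiledBlocks.fetch_add(1, std::memory_order_relaxed);
        return;
    }
#endif
    compile();  // without the flag, this call is all that remains
}

inline std::uint64_t compiledBlockCount() {
#if ETISS_TRANSLATOR_STAT
    return counters().compiledBlocks.load(std::memory_order_relaxed);
#else
    return 0;
#endif
}

inline void setEnabled(bool on) {
#if ETISS_TRANSLATOR_STAT
    counters().enabled.store(on, std::memory_order_relaxed);
#else
    (void)on;
#endif
}

}  // namespace example_stats
```

Because the counters are atomics, background workers can call `timedCompile` concurrently without a mutex; relaxed ordering is enough here since the values are only summed and reported.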

1. Fixed an issue where `translation_time` was reported as 0.
Added `translationTime_us_` in Translation.h to correctly capture the
translation duration.
2. Added `systemTime_ns` to JITTranslationStats to measure
the duration of `system->syncTime` calls.
3. Added `blockLookupTime_ns` to track the time spent searching the block map
in `Translation::getBlock`. This covers both cache hits and misses.
4. Updated JITStatsCollector to include `system_time`, `block_lookup_time`, and
their percentages in the JSON and console output.
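The lookup timing and percentage reporting could be approached as in this sketch. It is an illustration under assumptions, not the code in Translation.h: `ExampleTimingStats`, `ScopedTimer`, `timedLookup`, and `percentOf` are hypothetical names, and the block map is modeled as a plain `std::unordered_map`. Only the field names `systemTime_ns` and `blockLookupTime_ns` mirror the description.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <unordered_map>

struct ExampleTimingStats {
    std::uint64_t systemTime_ns = 0;       // time spent in system sync calls
    std::uint64_t blockLookupTime_ns = 0;  // time spent searching the block map
    std::uint64_t lookups = 0;             // hypothetical counter for the sketch
};

// RAII timer: adds the lifetime of the object, in nanoseconds, to `target`.
class ScopedTimer {
public:
    explicit ScopedTimer(std::uint64_t& target)
        : target_(target), start_(std::chrono::steady_clock::now()) {}
    ~ScopedTimer() {
        const auto elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(
            std::chrono::steady_clock::now() - start_);
        target_ += static_cast<std::uint64_t>(elapsed.count());
    }
    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    std::uint64_t& target_;
    std::chrono::steady_clock::time_point start_;
};

// Times the whole lookup, so hits and misses are measured the same way.
inline bool timedLookup(const std::unordered_map<std::uint64_t, int>& blockMap,
                        std::uint64_t pc, ExampleTimingStats& stats) {
    ScopedTimer timer(stats.blockLookupTime_ns);
    ++stats.lookups;
    return blockMap.find(pc) != blockMap.end();
}

// Share of `part` in `total` as a percentage; 0 when there is no total yet.
inline double percentOf(std::uint64_t part, std::uint64_t total) {
    if (total == 0) return 0.0;
    return 100.0 * static_cast<double>(part) / static_cast<double>(total);
}
```

Placing the timer at the top of the lookup function is what makes the measurement cover both outcomes, and guarding against a zero total avoids a division by zero when a report is printed before any time has been recorded.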