neopreload is a Rust fork of preload, the original C daemon by Behdad Esfahbod, substantially modernised beyond a faithful port.
Each cycle, neopreload scans /proc for running processes and the files they have mapped. It maintains:
- an exe model — each tracked executable accumulates a cumulative runtime and a set of memory-mapped file regions;
- a Markov model — a 4-state continuous-time Markov chain per exe pair, tracking co-occurrence and transition probability between running states;
- a map arena — reference-counted file regions (path + offset + length) shared across exes.
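
The four states of a pairwise chain can be read as the four combinations of "A running" and "B running". The following is a minimal sketch of that bookkeeping, not neopreload's actual code: the names `pair_state` and `PairChain`, the state numbering, and the field layout are all assumptions made for illustration.

```rust
/// Hypothetical encoding of the joint running state of an exe pair (A, B).
/// 0 = neither running, 1 = only A, 2 = only B, 3 = both.
/// The numbering is an assumption made for this sketch.
fn pair_state(a_running: bool, b_running: bool) -> usize {
    (a_running as usize) | ((b_running as usize) << 1)
}

/// Illustrative per-pair bookkeeping: time spent in each state and
/// a count of observed transitions between states.
#[derive(Debug, Default)]
struct PairChain {
    state: usize,
    time_in_state: [f64; 4],
    transitions: [[f64; 4]; 4],
}

impl PairChain {
    /// Credit `dt` seconds to the current state, then record a
    /// transition if the observed joint state has changed.
    fn observe(&mut self, a_running: bool, b_running: bool, dt: f64) {
        self.time_in_state[self.state] += dt;
        let next = pair_state(a_running, b_running);
        if next != self.state {
            self.transitions[self.state][next] += 1.0;
            self.state = next;
        }
    }
}
```

Calling `observe` once per scan cycle with the cycle length as `dt` accumulates both the dwell times and the transition counts from which transition probabilities can be estimated.
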
At prediction time, Markov edges from currently-running exes cast probability bids onto not-yet-running exes. Those exes then bid their mapped regions into a readahead queue, sorted by score. The queue is consumed up to a memory budget derived from MemAvailable and total RAM, scaled by a swap pressure factor.
Readahead is submitted via posix_fadvise(WILLNEED). On rotational storage, regions are sorted by physical offset using FIEMAP before submission to reduce seek time. On SSDs and NVMe, sorting is skipped. Storage type is detected automatically from /sys/block at startup.
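
Linux exposes a per-device `queue/rotational` attribute under /sys/block, containing `1` for rotational disks and `0` otherwise. The sketch below shows one way storage type could be read from it. The function and type names are hypothetical, and how neopreload maps a file to its backing device is not covered here; the device name is simply passed in.

```rust
use std::fs;
use std::path::Path;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StorageKind {
    Rotational,
    NonRotational,
}

/// Interpret the contents of a `queue/rotational` sysfs attribute.
/// Returns None for anything other than "0" or "1".
fn parse_rotational(contents: &str) -> Option<StorageKind> {
    match contents.trim() {
        "1" => Some(StorageKind::Rotational),
        "0" => Some(StorageKind::NonRotational),
        _ => None,
    }
}

/// Read `<sys_block>/<device>/queue/rotational`, e.g. with
/// `sys_block = /sys/block` and `device = "sda"` (example values).
/// Returns None if the file is missing or unreadable.
fn detect_storage(sys_block: &Path, device: &str) -> Option<StorageKind> {
    let path = sys_block.join(device).join("queue").join("rotational");
    fs::read_to_string(path)
        .ok()
        .and_then(|s| parse_rotational(&s))
}
```

Returning `Option` leaves the caller to choose a fallback when detection fails; which default neopreload uses in that case is not stated above.
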
File deletions and replacements are tracked via inotify and purged from the model immediately.
Algorithm improvements
| Feature | Detail |
|---|---|
| MemAvailable budget | Uses kernel's MemAvailable instead of MemFree + Cached to avoid double-counting reclaimable memory |
| Swap-factor scaling | Budget multiplied by sqrt(SwapFree / SwapTotal); eviction kicks in when swap pressure is high |
| Writeback guard | Readahead skipped entirely when dirty writeback exceeds 2% of RAM |
| POSIX_FADV_DONTNEED eviction | Unpredicted maps are actively evicted under memory pressure |
| FIEMAP physical block sort | On rotational storage, requests are sorted by physical disk block to minimise seek distance; skipped on SSD/NVMe |
| mincore residency check | Checks whether all pages are already resident before issuing fadvise(WILLNEED); skips the syscall if so |
| Hit-rate logging | Each cycle logs regions already resident vs submitted — a direct measure of prediction quality |
| IOPRIO_CLASS_IDLE | Sets I/O scheduling class to IDLE at startup so prefetch I/O never competes with foreground work |
| Exponential weight decay | Markov transition weights decay by 0.999 on each state change (~4 hour half-life at default 20 s cycle) |
| Running-exe exclusion | Running-exe maps are excluded from the prefetch plan; their pages are already loaded |
| Base probability floor | Exes with no Markov edges are scored by historical runtime share so isolated programs remain prefetch candidates |
| Sparse-correlation fix | correlation() = 0.0 (insufficient data) is replaced with f64::MIN_POSITIVE to avoid zeroing newly-observed exe scores |
| Active-set Markov window | Edges are only created to exes seen within the last 6 hours, bounding graph growth to O(N×K) |
| Differential /proc scan | get_maps() is called only for new or exec'd PIDs; unchanged long-lived processes are skipped |
| Startup stale-exe purge | On first run and hourly thereafter, tracked exes whose paths no longer exist are removed with full Markov/map cascade |
| Stale-map purge on load | After loading the state file, maps rejected by the current mapprefix policy are immediately discarded |
| inotify deletion tracking | Watches parent directories of tracked files; removes exes from the model when deleted or moved |
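
The half-life quoted for the 0.999 decay factor can be checked with a short calculation. This sketch assumes one decay step per 20 s cycle, which is the worst case; since decay is applied on state changes, the real half-life may be longer. The function names are illustrative.

```rust
/// Number of decay steps after which a weight multiplied by
/// `decay` at every step has fallen to half its initial value.
fn half_life_steps(decay: f64) -> f64 {
    0.5f64.ln() / decay.ln()
}

/// Half-life in hours, assuming one decay step every `cycle_secs` seconds.
fn half_life_hours(decay: f64, cycle_secs: f64) -> f64 {
    half_life_steps(decay) * cycle_secs / 3600.0
}
```

With `decay = 0.999` and `cycle_secs = 20.0` this gives roughly 693 steps, or about 3.85 hours, consistent with the "~4 hour" figure in the table.
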
cargo build --release
Arch Linux
A PKGBUILD is available.
Configuration (optional)
Memory budget
Each cycle, the prefetch budget is:
budget = max(0, total × memtotal% + available × memfree%) × swap_factor
where swap_factor = sqrt(swap_free / swap_total), or 1.0 if no swap is present. Readahead is skipped entirely when writeback exceeds 2% of total RAM.
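
The formula above translates directly into code. This is a sketch under stated assumptions rather than the contents of prophet.rs: all sizes are taken to be in kB as reported by /proc/meminfo, the percentage settings are passed as plain numbers (and may be negative), and the function names are hypothetical.

```rust
/// sqrt(swap_free / swap_total), or 1.0 when no swap is configured.
fn swap_factor(swap_free_kb: u64, swap_total_kb: u64) -> f64 {
    if swap_total_kb == 0 {
        1.0
    } else {
        (swap_free_kb as f64 / swap_total_kb as f64).sqrt()
    }
}

/// budget = max(0, total × memtotal% + available × memfree%) × swap_factor
fn budget_kb(
    total_kb: u64,
    available_kb: u64,
    memtotal_pct: f64,
    memfree_pct: f64,
    swap_free_kb: u64,
    swap_total_kb: u64,
) -> f64 {
    let base = total_kb as f64 * memtotal_pct / 100.0
        + available_kb as f64 * memfree_pct / 100.0;
    base.max(0.0) * swap_factor(swap_free_kb, swap_total_kb)
}

/// True when dirty writeback exceeds 2% of total RAM,
/// in which case readahead is skipped for the cycle.
fn writeback_guard_trips(writeback_kb: u64, total_kb: u64) -> bool {
    writeback_kb as f64 > total_kb as f64 * 0.02
}
```

As a worked example with illustrative settings of memtotal = -10 and memfree = 50: on a machine with 8,000,000 kB total, 4,000,000 kB available, and a quarter of its swap free, the base is -800,000 + 2,000,000 = 1,200,000 kB, the swap factor is sqrt(0.25) = 0.5, and the budget is 600,000 kB.
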

| File | Purpose |
|---|---|
| main.rs | Entry point, async main loop, signal handling |
| cmdline.rs | CLI argument parsing (clap) |
| conf.rs | Configuration loading (config + serde) |
| log_setup.rs | Logging initialisation (file logger + env_logger for stderr) |
| state.rs | Persistent + runtime data structures (arena graph, FxHashMap) |
| state_io.rs | State file load/save (tab-delimited) |
| proc_scan.rs | /proc scanning: processes, maps, memory stats |
| spy.rs | Data acquisition: scan() + update_model(), differential scanning |
| prophet.rs | Markov prediction engine, budget formula, readahead dispatch |
| fadvise.rs | posix_fadvise, FIEMAP, mincore, ioprio_set, eviction |
| watcher.rs | inotify file-deletion tracker |