SenseiDeElite/neopreload

neopreload

License: GPL v2

A Rust fork of preload, the original C daemon by Behdad Esfahbod, substantially modernised beyond a faithful port.


How it works

Each cycle, neopreload scans /proc for running processes and the files they have mapped. It maintains:

  • an exe model: each tracked executable records its cumulative runtime and the set of file regions it memory-maps;
  • a Markov model: a 4-state continuous-time Markov chain per exe pair, tracking co-occurrence and transition probability between running states;
  • a map arena: reference-counted file regions (path + offset + length) shared across exes.
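The three structures above could be laid out roughly as follows. This is a minimal sketch, not the project's actual types: every name here (`MapRegion`, `MapArena`, `Exe`, `MarkovEdge`), the use of std `HashMap`, and the reading of the four states as "neither, only A, only B, both running" are assumptions made for illustration.

```rust
use std::collections::{HashMap, HashSet};

/// Index of a region inside the arena (hypothetical).
type MapId = usize;

/// A file region: path + offset + length.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct MapRegion {
    path: String,
    offset: u64,
    length: u64,
}

/// Arena of reference-counted regions shared across exes.
#[derive(Default)]
struct MapArena {
    slots: Vec<Option<(MapRegion, usize)>>, // (region, refcount)
    index: HashMap<MapRegion, MapId>,
}

impl MapArena {
    /// Return the id of `region`, inserting it or bumping its refcount.
    fn acquire(&mut self, region: MapRegion) -> MapId {
        if let Some(&id) = self.index.get(&region) {
            if let Some((_, rc)) = self.slots[id].as_mut() {
                *rc += 1;
            }
            return id;
        }
        let id = self.slots.len();
        self.slots.push(Some((region.clone(), 1)));
        self.index.insert(region, id);
        id
    }

    /// Drop one reference; free the slot when the count reaches zero.
    fn release(&mut self, id: MapId) {
        let free = match self.slots[id].as_mut() {
            Some((_, rc)) => {
                *rc -= 1;
                *rc == 0
            }
            None => false,
        };
        if free {
            if let Some((region, _)) = self.slots[id].take() {
                self.index.remove(&region);
            }
        }
    }

    /// Current reference count, or 0 if the slot has been freed.
    fn refcount(&self, id: MapId) -> usize {
        self.slots[id].as_ref().map_or(0, |(_, rc)| *rc)
    }
}

/// Per-executable record: cumulative runtime plus the regions it maps.
#[allow(dead_code)]
struct Exe {
    path: String,
    runtime_secs: u64,
    maps: HashSet<MapId>,
}

/// One edge per exe pair (assumed state encoding: neither running,
/// only A, only B, both).
#[allow(dead_code)]
struct MarkovEdge {
    a: usize,
    b: usize,
    state: u8,               // 0..=3
    time_in_state: [f64; 4], // seconds spent in each state
    weight: [[f64; 4]; 4],   // observed transition weights
}
```

Sharing regions through an arena means a library mapped by many exes is stored once, and its slot is freed only when the last exe referring to it is dropped.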

At prediction time, Markov edges from currently-running exes cast probability bids onto not-yet-running exes. Those exes then bid their mapped regions into a readahead queue, sorted by score. The queue is consumed up to a memory budget derived from MemAvailable and total RAM, scaled by a swap pressure factor.
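The bidding and queue consumption described above might look like the sketch below. It is illustrative only: the `Bid` type, the `plan_readahead` function, and in particular the rule for combining several bids on one region (treating bidders as independent and scoring a region by the probability that at least one of them starts) are assumptions, not the project's actual scoring.

```rust
use std::collections::HashMap;

/// A region that some not-yet-running exe would map (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
struct Bid {
    region: &'static str, // stand-in for a (path, offset, length) key
    length: u64,          // bytes
    prob: f64,            // start probability of the bidding exe
}

/// Merge bids per region, rank by score, and take regions until the
/// byte budget is exhausted. Returns the chosen region keys in order.
fn plan_readahead(bids: &[Bid], budget: u64) -> Vec<&'static str> {
    // Assumed combination rule: a region is needed if any bidder starts,
    // so track P(not needed) = product of (1 - p) over its bidders.
    let mut not_needed: HashMap<&'static str, (f64, u64)> = HashMap::new();
    for b in bids {
        let e = not_needed.entry(b.region).or_insert((1.0, b.length));
        e.0 *= 1.0 - b.prob.clamp(0.0, 1.0);
    }

    let mut queue: Vec<(&'static str, f64, u64)> = not_needed
        .into_iter()
        .map(|(key, (q, len))| (key, 1.0 - q, len))
        .collect();
    // Highest score first; ties broken by key so the order is deterministic.
    queue.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(b.0)));

    let mut left = budget;
    let mut plan = Vec::new();
    for (key, _score, len) in queue {
        if len > left {
            break; // budget exhausted
        }
        left -= len;
        plan.push(key);
    }
    plan
}
```

With two exes at probability 0.5 both bidding on the same library, that library scores 0.75 under this rule and outranks a region backed by a single 0.6 bid.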

Readahead is submitted via posix_fadvise(WILLNEED). On rotational storage, regions are sorted by physical offset using FIEMAP before submission to reduce seek time. On SSDs and NVMe, sorting is skipped. Storage type is detected automatically from /sys/block at startup.
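One way to detect the storage type and apply the conditional sort is sketched below. The text above only says that detection uses /sys/block; reading the `queue/rotational` attribute (a standard Linux sysfs file containing `0` or `1`) is an assumption about how that is done, and `StorageKind`, `parse_rotational`, `detect_storage` and `order_requests` are hypothetical names. The physical offset is taken as given here; obtaining it via FIEMAP is not shown.

```rust
use std::fs;
use std::path::Path;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StorageKind {
    Rotational,
    NonRotational,
}

/// Interpret the contents of a `queue/rotational` sysfs attribute.
fn parse_rotational(contents: &str) -> Option<StorageKind> {
    match contents.trim() {
        "1" => Some(StorageKind::Rotational),
        "0" => Some(StorageKind::NonRotational),
        _ => None,
    }
}

/// Read <sys_block>/<dev>/queue/rotational for a device name such as "sda".
/// Returns `None` if the attribute is missing or unreadable.
fn detect_storage(sys_block: &Path, dev: &str) -> Option<StorageKind> {
    let attr = sys_block.join(dev).join("queue").join("rotational");
    parse_rotational(&fs::read_to_string(attr).ok()?)
}

/// Sort requests by physical offset only when seeks are expensive.
/// The first tuple field stands in for a FIEMAP-derived physical address.
fn order_requests(kind: StorageKind, reqs: &mut [(u64, &str)]) {
    if kind == StorageKind::Rotational {
        reqs.sort_by_key(|r| r.0);
    }
}
```

In real use `sys_block` would be `Path::new("/sys/block")`; taking it as a parameter keeps the function testable against a temporary directory.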

File deletions and replacements are tracked via inotify, and the affected entries are purged from the model immediately.
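The purge step can be pictured with a toy model like the one below. The inotify wiring itself is omitted, and the `Model` type, its fields, and `purge_path` are hypothetical stand-ins rather than the project's real structures; the sketch only shows the cascade from a deleted path to its edges and maps.

```rust
use std::collections::{HashMap, HashSet};

/// Minimal stand-in for the model (all names hypothetical).
#[derive(Default)]
struct Model {
    exes: HashSet<String>,                  // tracked exe paths
    edges: HashSet<(String, String)>,       // Markov edges between exe pairs
    maps: HashMap<String, HashSet<String>>, // mapped file path -> exes using it
}

impl Model {
    /// Called when the watcher reports `path` as deleted or replaced.
    fn purge_path(&mut self, path: &str) {
        // If the path is a tracked exe, drop it and every edge touching it.
        if self.exes.remove(path) {
            self.edges.retain(|(a, b)| a != path && b != path);
            for users in self.maps.values_mut() {
                users.remove(path);
            }
        }
        // If the path is itself a mapped file, drop its entry as well.
        self.maps.remove(path);
        // Maps that no exe refers to any more are removed.
        self.maps.retain(|_, users| !users.is_empty());
    }
}
```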


Changes from the original preload

Algorithm improvements

| Feature | Detail |
|---|---|
| MemAvailable budget | Uses kernel's MemAvailable instead of MemFree + Cached to avoid double-counting reclaimable memory |
| Swap-factor scaling | Budget multiplied by sqrt(SwapFree / SwapTotal); eviction kicks in when swap pressure is high |
| Writeback guard | Readahead skipped entirely when dirty writeback exceeds 2% of RAM |
| POSIX_FADV_DONTNEED eviction | Unpredicted maps are actively evicted under memory pressure |
| FIEMAP physical block sort | On rotational storage, requests are sorted by physical disk block to minimise seek distance; skipped on SSD/NVMe |
| mincore residency check | Checks whether all pages are already resident before issuing fadvise(WILLNEED); skips the syscall if so |
| Hit-rate logging | Each cycle logs regions already resident vs submitted, a direct measure of prediction quality |
| IOPRIO_CLASS_IDLE | Sets I/O scheduling class to IDLE at startup so prefetch I/O never competes with foreground work |
| Exponential weight decay | Markov transition weights decay by 0.999 on each state change (~4 hour half-life at default 20 s cycle) |
| Running-exe exclusion | Running-exe maps are excluded from the prefetch plan; their pages are already loaded |
| Base probability floor | Exes with no Markov edges are scored by historical runtime share so isolated programs remain prefetch candidates |
| Sparse-correlation fix | correlation() = 0.0 (insufficient data) is replaced with f64::MIN_POSITIVE to avoid zeroing newly-observed exe scores |
| Active-set Markov window | Edges are only created to exes seen within the last 6 hours, bounding graph growth to O(N×K) |
| Differential /proc scan | get_maps() is called only for new or exec'd PIDs; unchanged long-lived processes are skipped |
| Startup stale-exe purge | On first run and hourly thereafter, tracked exes whose paths no longer exist are removed with full Markov/map cascade |
| Stale-map purge on load | After loading the state file, maps rejected by the current mapprefix policy are immediately discarded |
| inotify deletion tracking | Watches parent directories of tracked files; removes exes from the model when deleted or moved |
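The half-life quoted for the exponential weight decay can be checked with a few lines of arithmetic. The helper names below are hypothetical, and the conversion to hours assumes one decay step per 20 s cycle, which is the reading under which the stated figure works out.

```rust
/// Number of decay steps after which a weight falls to half its value:
/// solve decay^n = 0.5 for n.
fn half_life_steps(decay: f64) -> f64 {
    0.5f64.ln() / decay.ln()
}

/// Convert steps to hours, assuming one decay step per cycle.
fn half_life_hours(decay: f64, cycle_secs: f64) -> f64 {
    half_life_steps(decay) * cycle_secs / 3600.0
}
```

For a factor of 0.999 this gives roughly 693 steps, or about 3.85 hours at 20 s per step, consistent with the "~4 hour" figure.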

Building

cargo build --release

Installation

Arch Linux

A PKGBUILD is available.

Configuration (optional)

See neopreload.toml.example.


Memory budget

Each cycle, the prefetch budget is:

budget = max(0, total × memtotal% + available × memfree%) × swap_factor

where swap_factor = sqrt(swap_free / swap_total), or 1.0 if no swap is present. Readahead is skipped entirely when writeback exceeds 2% of total RAM.
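The formula and the writeback guard can be written out as a small function. This is a sketch under stated assumptions: the `MemInfo` struct, its field names, and the choice of kilobytes as the unit are illustrative, and the percentages used in the usage note are example values, not the project's defaults.

```rust
/// Memory figures in kilobytes, as reported by /proc/meminfo
/// (field names are hypothetical).
#[derive(Debug, Clone, Copy)]
struct MemInfo {
    total: u64,
    available: u64,
    swap_total: u64,
    swap_free: u64,
    writeback: u64,
}

/// sqrt(swap_free / swap_total), or 1.0 when no swap is present.
fn swap_factor(m: &MemInfo) -> f64 {
    if m.swap_total == 0 {
        1.0
    } else {
        (m.swap_free as f64 / m.swap_total as f64).sqrt()
    }
}

/// Prefetch budget in kilobytes for this cycle, or `None` when the
/// writeback guard says readahead should be skipped.
fn budget_kb(m: &MemInfo, memtotal_pct: f64, memfree_pct: f64) -> Option<u64> {
    let total = m.total as f64;
    let writeback = m.writeback as f64;
    // Writeback guard: more than 2% of total RAM is being written back.
    if writeback > 0.02 * total {
        return None;
    }
    let base = total * memtotal_pct / 100.0
        + m.available as f64 * memfree_pct / 100.0;
    // max(0, ...) matters because a negative percentage can push base below zero.
    Some((base.max(0.0) * swap_factor(m)) as u64)
}
```

For example, with 8,000,000 kB total, 4,000,000 kB available, memtotal% = -10 and memfree% = 50, the base is 1,200,000 kB; if a quarter of swap is free, the swap factor is 0.5 and the budget drops to 600,000 kB.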


Architecture

main.rs        Entry point, async main loop, signal handling
cmdline.rs     CLI argument parsing (clap)
conf.rs        Configuration loading (config + serde)
log_setup.rs   Logging initialisation (file logger + env_logger for stderr)
state.rs       Persistent + runtime data structures (arena graph, FxHashMap)
state_io.rs    State file load/save (tab-delimited)
proc_scan.rs   /proc scanning: processes, maps, memory stats
spy.rs         Data acquisition: scan() + update_model(), differential scanning
prophet.rs     Markov prediction engine, budget formula, readahead dispatch
fadvise.rs     posix_fadvise, FIEMAP, mincore, ioprio_set, eviction
watcher.rs     inotify file-deletion tracker
