
Intelligence Per Watt

Benchmarking intelligence efficiency for LLM inference systems.



Intelligence Per Watt measures accuracy alongside energy for any LLM inference system. It profiles single-turn and multi-turn agentic workloads, captures per-query energy telemetry, and computes two efficiency metrics: Intelligence Per Joule (IPJ) and Intelligence Per Watt (IPW).
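As a rough illustration of the two metrics (the precise definitions are in the paper; this sketch simply divides an accuracy score by measured energy and by average power, and the function names are illustrative, not the library's API):

```python
# Hypothetical sketch of the two efficiency metrics. Inputs are an
# accuracy score in [0, 1] and energy/power measurements from telemetry.
def intelligence_per_joule(accuracy: float, energy_joules: float) -> float:
    """Accuracy achieved per joule of energy consumed (IPJ)."""
    return accuracy / energy_joules

def intelligence_per_watt(accuracy: float, avg_power_watts: float) -> float:
    """Accuracy achieved per watt of average power draw (IPW)."""
    return accuracy / avg_power_watts

# Example: 80% accuracy on a workload that drew 2,000 J at 40 W average.
print(intelligence_per_joule(0.80, 2000.0))  # ≈ 0.0004
print(intelligence_per_watt(0.80, 40.0))     # ≈ 0.02
```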

Prerequisites

  • Python >= 3.13 -- managed with uv
  • Rust compiler -- for the energy monitor (install)
  • protoc -- Protocol Buffer compiler (install)
  • An inference runtime -- Ollama, vLLM, or an OpenAI-compatible API

See Prerequisites for platform-specific setup (NVIDIA NVML, AMD ROCm, Apple Silicon, Linux RAPL).

Installation

pip install intelligence-per-watt

Or from source:

git clone https://github.com/HazyResearch/intelligence-per-watt.git
cd intelligence-per-watt
uv venv && source .venv/bin/activate
uv run scripts/build_energy_monitor.py    # Build Rust energy monitor
uv pip install -e intelligence-per-watt

Alternatively, an automated setup script handles virtual-environment creation, package installation, and the energy-monitor build:

bash intelligence-per-watt/scripts/setup.sh

Optional extras: ollama, vllm, react, openhands, terminus, agents, tavily, flops, all.

Verify Installation

# Run the test suite
pytest intelligence-per-watt

# Check the CLI
ipw --help

# Test energy monitoring on your hardware
uv run scripts/test_energy_monitor.py

Quick Start

Profile an inference server:

ipw profile --client ollama --model llama3.2:1b --client-base-url http://localhost:11434

Run an agentic benchmark:

ipw run --agent react --model gpt-4o --dataset gaia --max-queries 10

Analyze and plot results:

ipw analyze ./runs/profile_*
ipw plot ./runs/profile_*

Each profiled query records: energy (Joules), power (Watts), GPU/CPU memory, temperature, TTFT, throughput, token counts, API cost, and FLOPs.
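A per-query record with those fields might look like the following sketch (the field names and types here are illustrative; the library's actual schema may differ):

```python
from dataclasses import dataclass

# Illustrative per-query telemetry record mirroring the fields listed
# above; this is NOT the package's real schema.
@dataclass
class QueryTelemetry:
    energy_joules: float
    avg_power_watts: float
    gpu_memory_mb: float
    cpu_memory_mb: float
    temperature_c: float
    ttft_s: float            # time to first token, seconds
    throughput_tok_s: float  # decode throughput, tokens/second
    input_tokens: int
    output_tokens: int
    api_cost_usd: float
    flops: float

record = QueryTelemetry(
    energy_joules=152.4, avg_power_watts=38.1, gpu_memory_mb=2048.0,
    cpu_memory_mb=512.0, temperature_c=61.0, ttft_s=0.21,
    throughput_tok_s=95.3, input_tokens=412, output_tokens=256,
    api_cost_usd=0.0, flops=3.2e12,
)
# Derived quantities fall out directly, e.g. joules per output token:
print(record.energy_joules / record.output_tokens)
```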

What's Included

Inference clients -- Ollama, vLLM (offline), OpenAI-compatible servers

Agent harnesses -- ReAct (Agno), OpenHands, Terminus

Benchmarks -- MMLU-Pro, SuperGPQA, GAIA, FRAMES, HLE, SimpleQA, SWE-bench, SWEfficiency, TerminalBench, and a built-in 1K mixed set

Energy telemetry -- Rust gRPC service (50ms sampling) with NVIDIA NVML, AMD ROCm, Apple Silicon powermetrics, and Linux RAPL collectors

Evaluation -- LLM-as-judge, MCQ exact match, and task-specific scorers
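Of the scoring styles listed above, MCQ exact match is the simplest; a minimal sketch (the function name is hypothetical, not the package's evaluator API) could be:

```python
# Hypothetical MCQ exact-match scorer: compare the predicted option
# letter to the gold answer, ignoring case and surrounding whitespace.
def mcq_exact_match(prediction: str, answer: str) -> bool:
    return prediction.strip().upper() == answer.strip().upper()

print(mcq_exact_match(" b ", "B"))  # True
print(mcq_exact_match("A", "B"))   # False
```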

Architecture

ipw/
├── cli/          CLI commands (profile, run, analyze, plot, list)
├── clients/      Inference adapters (Ollama, vLLM, OpenAI)
├── agents/       Agent harnesses with per-turn telemetry
├── datasets/     Dataset providers (10+ benchmarks)
├── evaluation/   Scoring handlers
├── analysis/     IPJ/IPW computation, regression fitting
├── execution/    ProfilerRunner, AgenticRunner, TelemetrySession
└── telemetry/    Energy monitor launcher + gRPC collector

energy-monitor/   Rust gRPC service with platform-specific collectors

All components self-register via the registry pattern (@ClientRegistry.register("id"), etc.) and are resolved by string key through the CLI.
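The self-registration pattern can be sketched as follows; `ClientRegistry` here is a stand-in written for illustration, not the class from the `ipw` package:

```python
# Minimal sketch of a decorator-based registry: components register
# themselves under a string key, and the CLI resolves them by that key.
class ClientRegistry:
    _registry: dict[str, type] = {}

    @classmethod
    def register(cls, key: str):
        def decorator(klass: type) -> type:
            cls._registry[key] = klass  # self-registration at import time
            return klass
        return decorator

    @classmethod
    def resolve(cls, key: str) -> type:
        return cls._registry[key]

@ClientRegistry.register("ollama")
class OllamaClient:
    pass

# Lookup by string key, as a CLI flag like `--client ollama` would do.
print(ClientRegistry.resolve("ollama").__name__)  # OllamaClient
```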

About

Intelligence Per Watt is a research initiative studying the efficiency of on-device AI systems. The project is developed at Hazy Research and the Scaling Intelligence Lab at Stanford SAIL.

Sponsors

  • Laude Institute
  • Stanford Marlowe
  • Google Cloud Platform
  • Lambda Labs

Citation

If you use Intelligence Per Watt in your research, please cite:

@misc{saadfalcon2025intelligencewattmeasuringintelligence,
      title={Intelligence per Watt: Measuring Intelligence Efficiency of Local AI},
      author={Jon Saad-Falcon and Avanika Narayan and Hakki Orhun Akengin and J. Wes Griffin and Herumb Shandilya and Adrian Gamarra Lafuente and Medhya Goel and Rebecca Joseph and Shlok Natarajan and Etash Kumar Guha and Shang Zhu and Ben Athiwaratkun and John Hennessy and Azalia Mirhoseini and Christopher Ré},
      year={2025},
      eprint={2511.07885},
      archivePrefix={arXiv},
      primaryClass={cs.DC},
      url={https://arxiv.org/abs/2511.07885},
}
