
test: energy audit demo trigger CI workflow #6

Closed
hongping-zh wants to merge 1 commit into main from test/energy-audit-demo

Conversation

@hongping-zh
Owner

This PR adds a test Python file with intentional energy waste patterns to verify the EcoCompute Energy Audit GitHub Action.

Expected results:

  • Critical: Default INT8 without threshold fix
  • Warning: BS=1 loop pattern

@github-actions

⚡ EcoCompute Energy Audit

🖥️ Hardware Environment

⚠️ No GPU detected. Energy measurement requires NVIDIA GPU. This run uses static analysis + estimation only.
Scanned 1 Python file(s). Found 1 critical, 1 warning(s).

🔴 Critical Issues

Default INT8 (bitsandbytes mixed-precision decomposition) · test_energy.py (line 9)

Setting load_in_8bit=True without llm_int8_threshold=0.0 increases energy use by 17–147% relative to FP16, because of INT8↔FP16 type conversion at every linear layer. Measured on RTX 4090D (+32.7%) and A800 (+122–147%).

Energy impact: +17–147% energy vs FP16

Fix: Add llm_int8_threshold=0.0 to your BitsAndBytesConfig:

from transformers import BitsAndBytesConfig

config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_threshold=0.0,  # Disables mixed-precision decomposition
)
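
The thread does not show how the action detects this pattern. As a rough illustration only, here is a minimal sketch of one way such a static rule could be written, assuming an AST-based scan of each Python file. The function name `find_default_int8` and the sample snippet are hypothetical and are not taken from the EcoCompute code base.

```python
import ast


def find_default_int8(source: str) -> list[int]:
    """Return line numbers of calls that pass load_in_8bit=True
    without also passing llm_int8_threshold=0.0 (hypothetical rule)."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        # Collect literal keyword arguments; skip **kwargs expansions.
        kwargs = {kw.arg: kw.value for kw in node.keywords if kw.arg is not None}
        flag = kwargs.get("load_in_8bit")
        if not (isinstance(flag, ast.Constant) and flag.value is True):
            continue
        thr = kwargs.get("llm_int8_threshold")
        thr_is_zero = isinstance(thr, ast.Constant) and thr.value == 0.0
        if not thr_is_zero:
            hits.append(node.lineno)
    return sorted(hits)


sample = '''
config = BitsAndBytesConfig(
    load_in_8bit=True,
)
fixed = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_threshold=0.0,
)
'''

print(find_default_int8(sample))  # [2]
```

A scan like this only sees literal keyword arguments, so a config built from variables or a dict would be missed; a real rule would likely need more cases.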

🟡 Warnings

Sequential single-request processing (BS=1) · test_energy.py (line 19)

Processing prompts one at a time in a loop wastes up to 95.7% of the energy per request compared with batched inference. Measured on A800: BS=1 → 1,768 J/request, BS=64 → 76 J/request. GPU utilization at BS=1 is only 45%.

Energy impact: Up to 95.7% energy waste vs batched
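
The 95.7% figure appears consistent with the two A800 per-request measurements quoted above. A small worked check, assuming the waste is defined as the fraction of BS=1 energy that batching at BS=64 would avoid (the helper name `energy_saving` is hypothetical):

```python
def energy_saving(single_j: float, batched_j: float) -> float:
    """Fraction of per-request energy avoided by batching."""
    return (single_j - batched_j) / single_j


bs1_joules = 1768.0   # J/request at BS=1, from the audit comment
bs64_joules = 76.0    # J/request at BS=64, from the audit comment

saving = energy_saving(bs1_joules, bs64_joules)
ratio = bs1_joules / bs64_joules

print(f"{saving:.1%}")   # 95.7%
print(f"{ratio:.1f}x")   # 23.3x
```

Put another way, each request at BS=1 costs roughly 23 times the energy of a request served at BS=64.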

Fix: Batch your inputs or use a serving framework:

# Option 1: Batch with tokenizer
# (decoder-only models need a pad token and padding_side='left' for generation)
inputs = tokenizer(prompts, padding=True, return_tensors='pt').to('cuda')
outputs = model.generate(**inputs)

# Option 2: Use vLLM for production
from vllm import LLM
llm = LLM(model=model_name)
outputs = llm.generate(prompts)  # vLLM batches the prompts internally

📈 Relative Change (vs Baseline)

No baseline found. This run will be saved as the new baseline.


📊 Based on 93+ measurements across RTX 4090D / A800 / RTX 5090 · Full data · Install Bot · OpenClaw Skill

@hongping-zh
Owner Author

Test completed successfully. All detection rules working as expected. Closing test PR.

@hongping-zh deleted the test/energy-audit-demo branch on February 20, 2026, 11:28