Add HIP Backend by Nintorac · Pull Request #166 · Ai00-X/ai00_server

Nintorac · 2026-02-04T03:49:55Z

Hey,

This is some initial work to support a HIP backend. I am planning to create a new repo, hip-rwkv, that will actually implement the model code. It's heavily based on your web-rwkv; thanks for the kick-off point!

The code for the HIP backend currently lives here: Nintorac/web-rwkv#1

I'm just posting this here as a preliminary idea. Is this something you're interested in merging, or would you prefer that I fork and maintain it myself?

- Workspace root: add hip-rwkv workspace dep (path ../../hip-rwkv), update web-rwkv to 0.10.19, add [patch.crates-io] web-rwkv path to ensure single source of truth
- ai00-core: add hip feature gating dep:hip-rwkv (optional)
- ai00-server: forward hip feature to ai00-core/hip

Add Backend enum (WebGpu/Hip) to reload.rs with Default impl defaulting to WebGpu. Add backend field with #[serde(default)] to Model struct and ReloadRequest, wire through TryFrom<Config>, and add commented-out example to Config.toml. Existing config files without a backend field continue to work via serde default.
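As a rough illustration of the defaulting behaviour described above, here is a dependency-free sketch. It is not the PR's code: the real change uses `#[serde(default)]`, whereas `resolve_backend` below is a hypothetical helper that models "field missing from the config" as `None` so the example runs with the standard library alone.

```rust
use std::str::FromStr;

/// Which inference backend to load a model with.
/// `WebGpu` is the default, so configs that never mention a backend keep working.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum Backend {
    #[default]
    WebGpu,
    Hip,
}

impl FromStr for Backend {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "WebGpu" => Ok(Backend::WebGpu),
            "Hip" => Ok(Backend::Hip),
            other => Err(format!("unknown backend: {other}")),
        }
    }
}

/// Hypothetical stand-in for `#[serde(default)]`:
/// a missing field resolves to `Backend::default()`.
pub fn resolve_backend(field: Option<&str>) -> Result<Backend, String> {
    match field {
        Some(value) => value.parse(),
        None => Ok(Backend::default()),
    }
}

fn main() {
    // An old config file with no `backend` key.
    println!("{:?}", resolve_backend(None));
    // A config file that opts in to HIP.
    println!("{:?}", resolve_backend(Some("Hip")));
}
```

Making the default an explicit variant means existing deployments need no config changes when the field is introduced.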

Implements web_rwkv::runtime::model::State for the HIP backend, bridging ai00's TensorCpu<f32> state format with hip-rwkv's native HipState format. Includes state format conversion between the v7 WebGPU layout [n_embd, head_size+2, n_layer, 1] and HipState's per-layer PinnedBuffer components with f32<->f16 conversion for shift states. The att/ffn/write/read methods return errors since HIP doesn't support per-layer GPU state manipulation.
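To make the layout conversion a little more concrete, the sketch below splits a flat state buffer of shape [n_embd, head_size+2, n_layer, 1] into one slice per layer. It rests on an assumption not stated in the PR: that the first axis varies fastest in memory, so each layer occupies a contiguous run of `n_embd * (head_size + 2)` values. `split_layers` is a hypothetical helper, and the f32<->f16 conversion and the HipState side are deliberately left out.

```rust
/// Split a flat v7 state buffer of shape [n_embd, head_size + 2, n_layer, 1]
/// into one slice per layer.
///
/// Assumption: the first axis varies fastest, so each layer is a contiguous
/// run of `n_embd * (head_size + 2)` values.
pub fn split_layers(
    state: &[f32],
    n_embd: usize,
    head_size: usize,
    n_layer: usize,
) -> Result<Vec<&[f32]>, String> {
    let per_layer = n_embd * (head_size + 2);
    if per_layer == 0 {
        return Err("n_embd must be non-zero".to_string());
    }
    if state.len() != per_layer * n_layer {
        return Err(format!(
            "state has {} values, expected {}",
            state.len(),
            per_layer * n_layer
        ));
    }
    // Every chunk is exactly one layer because the length was checked above.
    Ok(state.chunks_exact(per_layer).collect())
}

fn main() {
    let (n_embd, head_size, n_layer) = (4, 2, 3);
    let state: Vec<f32> = (0..n_embd * (head_size + 2) * n_layer)
        .map(|i| i as f32)
        .collect();
    let layers = split_layers(&state, n_embd, head_size, n_layer).unwrap();
    println!("{} layers of {} values each", layers.len(), layers[0].len());
}
```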

Wire up the HIP backend loading in ai00-core:

- Add load_runtime_hip() (cfg-gated) that validates V7 model, loads via Rwkv7Hip::load on spawn_blocking, creates HipRuntime and HipStateAdapter
- Add Backend dispatch in ThreadRequest::Reload handler: WebGpu takes the existing path, Hip calls load_runtime_hip()
- Introduce SoftmaxBackend enum in run.rs to decouple softmax computation from wgpu Context (WebGpu variant uses wgpu, Hip variant uses hip_rwkv::softmax_hip_batch)
- Change run() to accept SoftmaxBackend instead of Context
- Make CoreRuntime.context optional (None for HIP backend)
- Add HipModelStub for ModelSerialize (HIP doesn't support save)
- Fix pre-existing enumerate_adapters async API mismatch
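The SoftmaxBackend idea can be sketched as an enum whose variants hide where the softmax actually runs. This is only an illustration: both arms below call a plain CPU softmax (`softmax_cpu`, a hypothetical placeholder), where the PR describes the WebGpu variant using wgpu and the Hip variant using hip_rwkv::softmax_hip_batch. The method name `softmax_batch` is also an assumption.

```rust
/// Stand-in for the PR's `SoftmaxBackend`: callers pick a variant, and code
/// shaped like `run()` no longer needs a wgpu `Context` directly.
pub enum SoftmaxBackend {
    WebGpu,
    Hip,
}

impl SoftmaxBackend {
    /// Apply softmax to every row of the batch.
    pub fn softmax_batch(&self, batch: &[Vec<f32>]) -> Vec<Vec<f32>> {
        match self {
            // Placeholder: a real WebGpu arm would dispatch to wgpu.
            SoftmaxBackend::WebGpu => batch.iter().map(|x| softmax_cpu(x)).collect(),
            // Placeholder: a real Hip arm would call into the HIP library.
            SoftmaxBackend::Hip => batch.iter().map(|x| softmax_cpu(x)).collect(),
        }
    }
}

/// Numerically stable softmax on the CPU (subtract the max before exp).
fn softmax_cpu(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

fn main() {
    let batch = vec![vec![1.0, 2.0, 3.0], vec![0.0, 0.0]];
    for row in SoftmaxBackend::Hip.softmax_batch(&batch) {
        println!("{row:?}");
    }
}
```

Passing the enum rather than a `Context` keeps the sampling code independent of whether a wgpu device exists at all.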

- Extend list_adapters() with HIP device enumeration via hip_rwkv::hip::get_device_count/get_device_name behind #[cfg(feature = "hip")]
- Add pub hip_to_model_info() converting Rwkv7ModelInfo + LoraDims to ModelInfo with ModelCustomInfo::V7 populated from LoRA dimensions
- Make Environment::Loaded.model Option<Arc<dyn ModelSerialize>> so the HIP backend (which cannot serialize) passes None instead of a stub
- Update Save handler to gracefully return false when model is None
- Remove HipModelStub (no longer needed)
- Update load_runtime_hip return type to exclude model component
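The move from a stub to an optional model can be sketched as follows. The trait and struct here are simplified, hypothetical stand-ins for ai00's `ModelSerialize` and `Environment::Loaded`; the only point is that a save routine returns `false` when no serializable model is present.

```rust
use std::sync::Arc;

/// Hypothetical, simplified stand-in for ai00's `ModelSerialize` trait.
pub trait ModelSerialize {
    fn serialize(&self) -> Vec<u8>;
}

/// Simplified stand-in for the loaded environment.
pub struct Loaded {
    /// `None` for a backend that cannot serialize (HIP, per the PR).
    pub model: Option<Arc<dyn ModelSerialize>>,
}

/// Returns `true` if bytes were appended to `out`, `false` if there is
/// nothing to save.
pub fn save(env: &Loaded, out: &mut Vec<u8>) -> bool {
    match &env.model {
        Some(model) => {
            out.extend(model.serialize());
            true
        }
        None => false,
    }
}

/// Toy model used only to exercise the `Some` path.
pub struct Dummy;

impl ModelSerialize for Dummy {
    fn serialize(&self) -> Vec<u8> {
        vec![1, 2, 3]
    }
}

fn main() {
    let mut out = Vec::new();
    let hip_like = Loaded { model: None };
    println!("saved: {}", save(&hip_like, &mut out));
    let webgpu_like = Loaded { model: Some(Arc::new(Dummy)) };
    println!("saved: {}", save(&webgpu_like, &mut out));
}
```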

- Panic at startup if config requests backend = "Hip" but the binary was compiled without --features hip, instead of silently failing in a fire-and-forget background task.
- Log progress through the reload path (env lock, tokenizer, backend dispatch, model load) so hangs are diagnosable.
- Monitor the fire-and-forget initial load task and log errors/panics instead of silently dropping them.
- Use per-batch state methods (load_state_batch/get_state_batch) in HipStateAdapter so save/restore targets a single slot.
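The first and third points above might look roughly like this. It is a sketch under stated assumptions: `check_backend` and `monitor_load` are hypothetical names, the `hip_compiled` flag would be `cfg!(feature = "hip")` in a real build, and a plain `std::thread` stands in for whatever async task the server actually spawns.

```rust
use std::thread;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    WebGpu,
    Hip,
}

/// Fail fast when the config asks for a backend the binary lacks.
/// In a real build `hip_compiled` would be `cfg!(feature = "hip")`.
pub fn check_backend(requested: Backend, hip_compiled: bool) -> Result<(), String> {
    if requested == Backend::Hip && !hip_compiled {
        return Err("config requests the Hip backend, but the binary was built without --features hip".to_string());
    }
    Ok(())
}

/// Run a load job on a background thread and report an error or panic
/// instead of silently dropping it.
pub fn monitor_load<F>(job: F) -> String
where
    F: FnOnce() -> Result<(), String> + Send + 'static,
{
    match thread::spawn(job).join() {
        Ok(Ok(())) => "loaded".to_string(),
        Ok(Err(e)) => format!("load failed: {e}"),
        Err(_) => "load task panicked".to_string(),
    }
}

fn main() {
    // At startup the caller could turn this error into a panic.
    if let Err(e) = check_backend(Backend::Hip, false) {
        eprintln!("{e}");
    }
    println!("{}", monitor_load(|| Ok(())));
    println!("{}", monitor_load(|| Err("no model file".to_string())));
}
```

Checking the feature before spawning the load keeps the failure on the main thread, where it is visible, rather than inside a detached task.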

- Relax fastembed version constraint from =4.4.0 to 4 to fix ort compilation errors (ort v2.0.0-rc.9 API incompatibilities)
- Add BGELargeZHV15 and ModernBertEmbedLarge variants to EmbeddingModel enum to match fastembed 4.9.1
- Apply cargo fmt formatting fixes

- Add tests/smoke.rs with ignored smoke tests that spawn the server as a subprocess and verify completion responses
- Add assets/models symlink to /workspace/models for test models
- Update configs to use assets/models path and consistent model name
- Add reqwest dev-dependency for HTTP client in tests

Run with: `cargo test --features hip smoke_hip -- --ignored`

Nintorac · 2026-02-06T05:57:01Z

AI summary of the changes:

Summary

- HIP backend integration: Uses hip-rwkv for RWKV v7 inference via rocBLAS GEMM and custom HIP kernels
- Backend selection: New backend config option ("WebGpu" or "Hip") with WebGpu as default
- Feature-gated: Build with --features hip to include HIP support
- Documentation: Added HIP setup guide to README with prerequisites, build instructions, and sample config
- Smoke tests: Subprocess-based integration tests for both WebGPU and HIP backends

Changes

| Area | Files |
| --- | --- |
| Core | `ai00-core/src/lib.rs`, `hip_state.rs`, `run.rs` |
| Config | `ai00-server/src/config.rs`, `Config.hip.toml` |
| Dependencies | `Cargo.toml` (hip-rwkv from upstream) |
| Tests | `tests/smoke.rs` with `smoke_webgpu` and `smoke_hip` |
| Docs | `README.md` HIP section |

Test Plan

# Build with HIP
cargo build --release --features hip

# Run smoke tests (requires GPU + model)
cargo test --features hip smoke_hip -- --ignored
cargo test smoke_webgpu -- --ignored

Notes

HIP backend currently supports RWKV v7 models only

Nintorac added 6 commits February 3, 2026 10:32

Nintorac force-pushed the hip branch from d77d581 to 78ee9e8 · February 4, 2026 03:54

Nintorac added 4 commits February 4, 2026 04:57

Add HIP backend docs and clean up unused imports

487c5c4

Use upstream hip-rwkv git for hip/web-rwkv deps

b0b4694

Nintorac wants to merge 10 commits into Ai00-X:main from Nintorac:hip
