LTO-safe linker stubs #1090

Draft: greenhat wants to merge 9 commits into next from i1089-lto-codex

greenhat (Contributor) commented Apr 28, 2026

Close #1089

This PR:

  • Enables fat LTO (lto = true).
  • Revamps the stubs to use black_box so that they survive LTO.
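
To illustrate the second bullet, here is a minimal sketch of a stub that routes its return value through core::hint::black_box. The function name and the u32 return type are hypothetical, and black_box is only a best-effort optimization barrier, so this is not the PR's actual define_stub machinery.

```rust
use core::hint::black_box;

/// Hypothetical linker stub. The frontend is assumed to recognize the
/// symbol by name and replace the call, so the body only has to survive
/// optimization with its signature intact.
#[inline(never)]
pub extern "C" fn example_heap_base_stub() -> u32 {
    // Routing the value through `black_box` keeps the optimizer from
    // treating the result as a known constant and folding it into callers.
    black_box(0)
}
```

At run time black_box is the identity function, so calling example_heap_base_stub() simply returns 0; the point is what the optimizer is allowed to assume, not the value itself.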

MAST size changes (stripped debug info)

| Example | next | PR | Change |
|---|---|---|---|
| basic-wallet | 29016 | 15764 | -45.67% |
| basic-wallet-tx-script | 45884 | 17045 | -62.85% |
| p2id-note | 38430 | 19745 | -48.62% |
| p2ide-note | 43003 | 23039 | -46.42% |

Basic wallet cycle changes

| Measurement | next | PR | Change |
|---|---|---|---|
| p2id mint note | 20211 | 7327 | -63.75% |
| p2id tx script processing | 25232 | 6799 | -73.05% |
| p2id note | 20211 | 7327 | -63.75% |
| p2ide reclaim note | 21699 | 8340 | -61.57% |

TODO:

  • Remove the no-longer-used stub_arg functions.

Miden VM intrinsic signatures were still modeled outside the conversion result, even though the selected VM operation is the value that determines the expected type. That split made the validation path harder to follow and kept linker stubs responsible for metadata they should not own.

Store the expected function type directly in the MidenVmOp conversion result and thread it through the debug and felt intrinsic conversions. The module builder now validates against the operation metadata, while linker stubs stay focused on declaring the imported module functions.
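
The idea of keeping the expected signature inside the conversion result can be sketched as follows. Every type and function here (Ty, FunctionType, convert_felt_intrinsic, validate) is a hypothetical stand-in rather than the compiler's real API; only the name MidenVmOp comes from the commit message.

```rust
/// Hypothetical value types; the real compiler has a richer type system.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Ty {
    Felt,
    I32,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FunctionType {
    pub params: Vec<Ty>,
    pub results: Vec<Ty>,
}

/// Hypothetical conversion result: the selected VM operation carries the
/// signature it expects, so validation needs no separate side table.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MidenVmOp {
    pub name: &'static str,
    pub expected: FunctionType,
}

/// Convert a felt intrinsic name into a VM op (two ops only, for illustration).
pub fn convert_felt_intrinsic(function: &str) -> Option<MidenVmOp> {
    let binary = FunctionType {
        params: vec![Ty::Felt, Ty::Felt],
        results: vec![Ty::Felt],
    };
    match function {
        "add" => Some(MidenVmOp { name: "add", expected: binary }),
        "mul" => Some(MidenVmOp { name: "mul", expected: binary }),
        _ => None,
    }
}

/// Module-builder side check: compare the declared type with the op metadata.
pub fn validate(op: &MidenVmOp, declared: &FunctionType) -> Result<(), String> {
    if &op.expected == declared {
        Ok(())
    } else {
        Err(format!(
            "intrinsic `{}` expects {:?}, found {:?}",
            op.name, op.expected, declared
        ))
    }
}
```

With this shape, the stub only has to declare the imported function; whether its type is acceptable is decided by the operation that was selected for it.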

The LTO-safe stub support was copied into base-sys and both stdlib stub roots, so future changes to define_stub or the opaque argument sinks would need to stay synchronized in three places. The duplication also made review of the linker-stub behavior noisier than necessary.

Extract the shared return-value, argument-sink, and define_stub machinery into sdk/linker_stub.rs and include it from each stub root. The base and stdlib build scripts now track the shared file so the generated stub archives rebuild whenever the common support changes.
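
A build script can track a shared file by emitting Cargo's rerun-if-changed directive for it. The sketch below is an assumption about how that might look, not the PR's actual build scripts; the helper name rerun_directives and the stub-root path used in the usage note are hypothetical.

```rust
use std::path::Path;

/// Build the Cargo directives that make a stub archive rebuild when either
/// its own root file or the shared support file changes.
///
/// Each stub root would then pull in the shared code with something like
/// `include!("<relative path>/linker_stub.rs");` (the path is not shown here).
pub fn rerun_directives(stub_root: &Path, shared: &Path) -> Vec<String> {
    [stub_root, shared]
        .iter()
        .map(|p| format!("cargo:rerun-if-changed={}", p.display()))
        .collect()
}
```

A build script would print each returned line to stdout, for example for rerun_directives(Path::new("stubs/lib.rs"), Path::new("sdk/linker_stub.rs")).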

     #[inline(never)]
     pub extern "C" fn __intrinsics_mem_heap_base_stub() -> *mut u8 {
    -    unsafe { core::hint::unreachable_unchecked() }
    +    stub()

Collaborator

There are two things we could try rather than these stubs:

  1. Declare extern functions as having extern_weak linkage. According to the wasm-ld docs, the linker will emit stubs that contain an unreachable in such cases. AIUI, LTO would not apply in these cases, but testing would be required:

         #[unsafe(export_name = "...")]
         #[unsafe(linkage = "extern_weak")]
         extern "C" fn foo() -> result;

  2. Use stub libraries. These are consumed directly by the linker, rather than relying on rustc/LLVM to do things correctly. This would require writing the stubs file to disk and passing custom linker arguments.

I'm not sure if there are any lld options that control specific optimizations that apply here, but digging deeper there might be worth it as well, rather than relying on implementation details of rustc/LLVM to obscure the code enough to disable link-time optimizations (e.g. black_box). We'd definitely need really good regression tests to catch if something slips through that subtly breaks things if we do end up relying on that.

Contributor Author

  1. It ends up as env imports. The link command contains --allow-undefined, which means the unresolved weak externs become imports rather than local unreachable trap stubs. I also tried adding --unresolved-symbols=ignore-all explicitly through cargo-miden, but that did not override the import behavior.

  2. I tried this while working on "Explore the alternatives(removing) to explicitly passing low-level Miden SDK WIT files" #532 (comment). I tried it again and the result is the same: we end up with env imports in the Wasm module, and those stop the Wasm CM linking.

As long as --allow-undefined is present, as I explained in #532 (comment), these approaches will not bring the desired outcome.

greenhat (Contributor, Author) commented Apr 29, 2026

> I'm not sure if there are any lld options that control specific optimizations that apply here, but digging deeper there might be worth it as well, rather than relying on implementation details of rustc/LLVM to obscure the code enough to disable link-time optimizations (e.g. black_box).

I think I covered it when I worked on #532.

> We'd definitely need really good regression tests to catch if something slips through that subtly breaks things if we do end up relying on that.

The intrinsics and stdlib should be covered by the semantic tests. The protocol bindings are tested only for compilation, but we can add litcheck_filecheck::filecheck! checks for the linked MASM protocol functions.

EDIT: Added in 59dc389

LTO can rewrite Rust linker stubs so their compiled bodies no longer expose the same result arity as the underlying MASM ABI procedure. Using the synthesized HIR function to decide how many call results to retain conflates that generated body with the parsed core Wasm signature.

Pass the parsed stub signature into linker-stub lowering and trim ABI results against that source instead. Function-style intrinsics continue to use the stub signature as their import ABI, while VM-op intrinsics still require an exact signature match before they are inlined.
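
The trimming step can be sketched roughly as below. Signature, Ty, and retain_results are hypothetical names, and the sketch assumes that the results to keep are the leading ones, which matches the "suffix-trimmed" wording used in a later commit but is otherwise a guess.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ty {
    Felt,
    I32,
}

/// Hypothetical parsed core Wasm signature of a linker stub.
pub struct Signature {
    pub params: Vec<Ty>,
    pub results: Vec<Ty>,
}

/// Keep only as many call results as the parsed stub signature declares.
/// `call_results` stands in for the values produced by the MASM ABI procedure.
pub fn retain_results<V: Clone>(
    call_results: &[V],
    stub_sig: &Signature,
) -> Result<Vec<V>, String> {
    let wanted = stub_sig.results.len();
    if wanted > call_results.len() {
        return Err(format!(
            "stub declares {} results, callee yields {}",
            wanted,
            call_results.len()
        ));
    }
    // Assumption: trailing results are the ones dropped.
    Ok(call_results[..wanted].to_vec())
}
```

The count comes from the parsed signature rather than from any synthesized function body, which is the distinction the commit message draws.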

Manual rustc builds for the SDK linker stub archives still carried file paths through Location::caller(), so LTO-enabled outputs depended on the checkout path and drifted between CI and local machines.

Compile those stub archives with location-detail=line,column so the opaque stubs retain enough call-site variation without embedding absolute source paths. Update the affected MAST size and execution-cycle expectations for the path-free output.
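
For context, this is roughly how a #[track_caller] function picks up its call site. The helper names are hypothetical; the sketch only shows which parts of the location stay useful once the file path is excluded, and it assumes the flag is passed to rustc as the unstable -Z location-detail=line,column option.

```rust
use std::panic::Location;

/// Hypothetical helper: returns the source location of whoever called it.
#[track_caller]
pub fn call_site() -> &'static Location<'static> {
    Location::caller()
}

/// Describe a call site using only line and column, the two pieces of
/// detail that `location-detail=line,column` keeps in the binary.
pub fn path_free_site(loc: &Location<'_>) -> String {
    format!("{}:{}", loc.line(), loc.column())
}
```

In a default build, call_site().file() would also return the source path as seen by the compiler, which is the part that made outputs differ between checkouts.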

LTO can change or remove Rust stub bodies, so a binding test that only checks for successful compilation does not prove the generated MASM still calls the intended protocol procedure.

Add a shared FileCheck helper for the Rust SDK base binding tests and pass each fixture's expected miden::protocol module/function link name through it. This checks the final MASM exec target directly while preserving the existing generated-output coverage.
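
The kind of check described here can be approximated with plain string matching. This sketch deliberately does not use the litcheck_filecheck::filecheck! macro, whose exact usage is not shown in this thread; masm_execs, the exec.:: spelling of the target, and the module and function names in the usage note are all assumptions for illustration.

```rust
/// Return true if `masm` contains an `exec` instruction whose target is
/// exactly `module::function` (an optional leading `::` is ignored).
pub fn masm_execs(masm: &str, module: &str, function: &str) -> bool {
    let expected = format!("{}::{}", module, function);
    masm.lines()
        .map(str::trim)
        // Only `exec.` instructions count; other call kinds are skipped.
        .filter_map(|line| line.strip_prefix("exec."))
        .any(|target| target.trim_start_matches("::") == expected)
}
```

A fixture would pass its expected link name, for example masm_execs(output, "miden::protocol::active_account", "get_id"), and fail the test when the exec target is missing from the final MASM.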

Verified with cargo make test, cargo make clippy, and cargo make format-rust.

LTO can narrow opaque linker stubs by trimming unused returns, and in the from_u64_unchecked case it can erase the u64 parameter behind the stub_arg helper. Previously recognized intrinsic, stdlib, or Miden ABI stubs could be skipped silently, leaving the frontend to translate opaque stub bodies or declare narrowed imports.

Validate recognized stubs against their canonical signatures, allowing only suffix-trimmed result lists, and use canonical intrinsic ABIs for imported callees. Recover the known zero-param from_u64_unchecked constant case, hard-error recognized names that cannot be lowered, and update affected signatures, expectations, and regression coverage.
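
The validation rule can be sketched as a small check: parameters must match the canonical signature exactly, while results may only lose trailing entries. All names below are hypothetical, and the real compiler may report errors differently.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ty {
    Felt,
    I32,
    I64,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Signature {
    pub params: Vec<Ty>,
    pub results: Vec<Ty>,
}

/// Accept a stub only if its parameters equal the canonical ones and its
/// results are the canonical results with zero or more trailing entries removed.
pub fn validate_stub(name: &str, canonical: &Signature, stub: &Signature) -> Result<(), String> {
    if stub.params != canonical.params {
        return Err(format!("`{}`: parameter list differs from canonical signature", name));
    }
    // A suffix-trimmed result list is a prefix of the canonical result list.
    if !canonical.results.starts_with(&stub.results) {
        return Err(format!("`{}`: results are not a suffix-trimmed canonical list", name));
    }
    Ok(())
}
```

Under this rule a stub whose u64 parameter was erased by LTO is rejected instead of being silently skipped, while a stub that merely dropped its last result is still accepted.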

Tested with cargo make test, cargo make clippy, and cargo make format-rust.

LTO can rewrite intrinsics::felt::from_u64_unchecked to a zero-argument helper when callers only pass constants, which forced the frontend to recover arguments from optimized stub bodies.

Keep the SDK stub address-taken so the emitted Wasm retains the canonical u64 parameter. With the ABI preserved at the source, remove the fragile WAT-body scanner and special compiler fallback, leaving recognized intrinsic parameter mismatches as hard errors. Update the affected size and cycle expectations.

Tested with MIDENC_EMIT=ir=target/emit/hash_words cargo test -p miden-integration-tests rust_masm_tests::abi_transform::stdlib::test_hash_words -- --nocapture, cargo make test, cargo make clippy, and cargo make format-rust.

LTO can specialize linker stub parameters or trim signatures when the optimized program only uses a narrower call shape. That makes recognized intrinsics and Miden ABI bindings harder to validate and can prevent the frontend from lowering a stub to its canonical operation or binding path.

Make each retained generated stub address-taken inside its own body so LTO preserves the stub ABI while unused stubs remain discardable. Remove the old from_u64_unchecked-only marker and delete an unsupported advice stub with no lowering. Update the MAST and cycle expectations for the small amount of retained ABI-preservation code.
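
One way to make a stub address-taken from inside its own body is sketched below. The function name and the u64 signature are hypothetical, and whether this blocks a given LTO pass depends on rustc/LLVM behavior that the sketch does not guarantee.

```rust
use core::hint::black_box;

/// Hypothetical stub whose own address escapes through `black_box`.
/// Because the address is only taken inside the body, a stub that is never
/// called can still be discarded, while a retained one is harder for the
/// optimizer to specialize or to strip parameters from.
#[inline(never)]
pub extern "C" fn example_from_u64_stub(value: u64) -> u64 {
    // Coerce the function item to a function pointer and hide it.
    let this: extern "C" fn(u64) -> u64 = example_from_u64_stub;
    let _ = black_box(this);
    // Keep the argument observable as well so it is not treated as dead.
    black_box(value)
}
```

At run time the stub just returns its argument, so example_from_u64_stub(42) yields 42; the extra black_box call exists only to influence what the optimizer may rewrite.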

Successfully merging this pull request may close these issues.

Fix the generated code when lto=true