## Summary

When the SPV client is in a transient state where the masternode/quorum cache hasn't caught up (e.g. during mid-session wallet import, cold start, network switch, or wake-from-sleep), any DAPI-backed backend task can cause all connected DAPI nodes to be banned in rapid succession. The wallet's DAPI connectivity becomes unusable until bans expire or SPV catches up and proofs can be re-verified.
## Impact
- User-observable: "Platform unavailable" / repeated DAPI timeouts for seconds to minutes after an SPV state transition.
- Amplified by the recent mid-session wallet import handling (PR #830's programmatic `SpvManager::restart()`), because the restart deliberately re-initialises SPV state while the UI stays up and user actions can continue to fire DAPI traffic.
## Root cause

The ban happens upstream in `rs-dapi-client`, not in DET. The chain:
1. An `AppContext::run_backend_task` variant that touches Platform dispatches an SDK call.
2. The SDK receives a proof-carrying response from a DAPI node. To verify the proof it calls `SpvProvider::get_quorum_public_key(...)` (`src/context_provider_spv.rs:89`).
3. `SpvProvider` delegates to `SpvManager::get_quorum_public_key(...)` (`src/spv/manager.rs:987`). If the requested quorum isn't in the in-memory masternode/quorum cache, it returns an error.
4. Today that error is `ContextProviderError::Generic(_)` (`src/context_provider_spv.rs:107`), which bubbles up as `drive_proof_verifier::Error::ContextProviderError(_)` and then `dash_sdk::error::Error::Proof(_)`.
5. rs-sdk's `impl CanRetry for Error` (`platform/packages/rs-sdk/src/error.rs:266-272`) classifies `Error::Proof(_)` as retryable.
6. `update_address_ban_status` in `platform/packages/rs-dapi-client/src/dapi_client.rs:186-218` treats retryable-but-failed responses as bad-node signals and calls `AddressList::ban()`.
The SDK cannot currently distinguish "remote returned a bad proof" (node is misbehaving, ban is correct) from "my local context couldn't verify the proof because my quorum cache isn't ready yet" (node is fine, retry later). Both paths take the same `Error::Proof` branch.
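The net effect is a ban wave: while local verification keeps failing, every request burns one address. The toy simulation below illustrates the dynamic (the `AddressList` here is a hypothetical stand-in, not the real rs-dapi-client type):

```rust
use std::collections::HashSet;

// Toy model of the address pool. While the local quorum cache is empty,
// every proof verification fails and each failure looks like a bad node.
struct AddressList {
    addresses: Vec<&'static str>,
    banned: HashSet<&'static str>,
}

impl AddressList {
    // First address that hasn't been banned yet, if any.
    fn next_live(&self) -> Option<&'static str> {
        self.addresses.iter().copied().find(|a| !self.banned.contains(a))
    }
    fn ban(&mut self, addr: &'static str) {
        self.banned.insert(addr);
    }
}

fn main() {
    let mut list = AddressList {
        addresses: vec!["node-a", "node-b", "node-c"],
        banned: HashSet::new(),
    };
    // SPV never becomes ready during this burst, so verification fails for
    // every node in turn and each is treated as a bad-node signal.
    let mut banned_in_order = Vec::new();
    while let Some(addr) = list.next_live() {
        list.ban(addr);
        banned_in_order.push(addr);
    }
    assert_eq!(banned_in_order, vec!["node-a", "node-b", "node-c"]);
    println!("all {} nodes banned", banned_in_order.len());
}
```

Once `next_live()` returns `None`, DAPI connectivity is gone until bans expire, matching the observed symptom.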
## Reproduction
1. Testnet. Launch DET after SPV data has been cleared, so SPV starts syncing from checkpoint.
2. Import any wallet via GUI or MCP `core_wallet_import`.
3. While SPV is still reconciling quorums for recent heights, trigger any DAPI-backed action (viewing identities, opening Platform Info, balance refresh, etc.).
4. Observe `banned` events in the logs for each connected DAPI node.
5. After a short delay (SPV catches up), DAPI requests start working again.
## Why existing safeguards don't cover this
- MCP tools have `mcp::resolve::ensure_spv_synced` at `src/mcp/resolve.rs:117-137`, which polls `ConnectionStatus::overall_state() == Synced` with a 10-minute ceiling before dispatching any wallet-facing MCP tool. This works: MCP paths don't exhibit the ban behaviour.
- GUI backend-task dispatch has no equivalent gate. `AppContext::run_backend_task` (`src/backend_task/mod.rs:409`) goes straight to the SDK. Every GUI action that spawns a DAPI call during a transient contributes to the ban wave.
## Proposed fix

Two-part, small surface area:

### 1. Primary fix — gate DAPI-using backend tasks on SPV readiness (~1 day)

- Hoist the `ensure_spv_synced` logic out of `src/mcp/resolve.rs` into a shared helper (e.g. `AppContext::await_spv_ready(timeout: Duration) -> Result<(), TaskError>`).
- Call it from `AppContext::run_backend_task` for every DAPI-using variant before the SDK invocation (audit and list the variants in the PR body).
- On timeout, return a new `TaskError::SpvNotReady` variant with a user-facing message per the `CLAUDE.md` error-messaging rules, e.g. "Background sync is catching up. Try again in a moment." The existing `MessageBanner` surfaces this automatically.
- Non-DAPI variants (SPV-only, local DB reads) continue to bypass the gate.
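A minimal sketch of what such a gate could look like. The enum and error names below are illustrative stand-ins for DET's real `ConnectionStatus`/`TaskError` types, and the readiness probe is injected as a closure rather than read from `AppContext`:

```rust
use std::time::{Duration, Instant};

// Hypothetical stand-ins for DET's readiness signal and task error.
#[derive(Debug, PartialEq, Clone, Copy)]
enum OverallState { Syncing, Synced }

#[derive(Debug, PartialEq)]
enum TaskError { SpvNotReady }

// Poll the readiness signal until it reports Synced or the timeout elapses.
fn await_spv_ready(
    mut overall_state: impl FnMut() -> OverallState,
    timeout: Duration,
    poll_interval: Duration,
) -> Result<(), TaskError> {
    let deadline = Instant::now() + timeout;
    loop {
        if overall_state() == OverallState::Synced {
            return Ok(());
        }
        if Instant::now() >= deadline {
            // Surfaced to the user as "Background sync is catching up…".
            return Err(TaskError::SpvNotReady);
        }
        std::thread::sleep(poll_interval);
    }
}

fn main() {
    // Simulate SPV becoming ready on the third poll.
    let mut polls = 0;
    let result = await_spv_ready(
        || {
            polls += 1;
            if polls >= 3 { OverallState::Synced } else { OverallState::Syncing }
        },
        Duration::from_millis(100),
        Duration::from_millis(1),
    );
    assert_eq!(result, Ok(()));
    println!("ready after {polls} polls");
}
```

In DET the real helper would be async and would read `ConnectionStatus::overall_state()`; the shape of the loop and the timeout-to-error mapping are the point here.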
### 2. Secondary fix — distinct error variant for future SDK classification (~30 min)

- At `src/context_provider_spv.rs:107`, return `ContextProviderError::InvalidQuorum(_)` (or a new purpose-fit variant upstream, if one is appropriate) instead of `Generic(_)`.
- No immediate client behaviour change: rs-sdk's `CanRetry` still treats it as retryable. But the typed discriminator lets a future upstream refinement distinguish "retry without banning" from "ban and retry" without string-matching.
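To illustrate why the typed discriminator matters, here is a hedged sketch. The enums are simplified stand-ins, not the real `drive_proof_verifier`/`dash_sdk` types; the point is that a future classifier can pattern-match on the variant instead of string-matching the `Generic` payload:

```rust
// Simplified stand-in for the context-provider error.
#[derive(Debug)]
enum ContextProviderError {
    Generic(String),
    InvalidQuorum(String),
}

#[derive(Debug, PartialEq)]
enum BanDecision { BanAndRetry, RetryWithoutBan }

// A future upstream classifier could distinguish the two failure modes
// structurally, with no fragile message inspection.
fn classify(err: &ContextProviderError) -> BanDecision {
    match err {
        // Local cache not ready: the node did nothing wrong.
        ContextProviderError::InvalidQuorum(_) => BanDecision::RetryWithoutBan,
        // Anything else keeps today's behaviour.
        ContextProviderError::Generic(_) => BanDecision::BanAndRetry,
    }
}

fn main() {
    let not_ready = ContextProviderError::InvalidQuorum("quorum not in cache".into());
    assert_eq!(classify(&not_ready), BanDecision::RetryWithoutBan);

    let bad_proof = ContextProviderError::Generic("proof mismatch".into());
    assert_eq!(classify(&bad_proof), BanDecision::BanAndRetry);
    println!("classification ok");
}
```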
## Rejected alternatives (for the record)
- Pause the SDK during SPV restart — no public pause API in rs-sdk; a multi-day build for no additional ban-reduction benefit over gating at the task level.
- DAPI request queueing shim in DET — duplicates the `AddressList` ordering/retry logic and adds a second source of truth for node health. Poor observability.
- Proof verification bypass (see #283, "Optional proof verification bypass mode") — a user-facing toggle exists there as a dev-mode feature request. Different scope: that issue assumes Core is offline and the user explicitly opts out; this one is about transient readiness during normal operation.
## Upstream follow-up
A clean long-term fix should happen in `platform/packages/rs-sdk`:

- `Error::Proof` carrying a `ContextProviderError::InvalidQuorum` variant should be classified as retry-without-ban rather than retry-with-ban-on-failure; `rs-dapi-client`'s `update_address_ban_status` would honour that classification.
- Consider a new `CanRetry` return value (e.g. `RetryPreservingAddress`), distinct from `Retry`, for this case.

Will file a separate upstream issue against dashpay/platform if the maintainers agree the DET-side gate is insufficient on its own.
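The proposed trichotomy can be sketched as follows. The names mirror the suggestion above, but the real `CanRetry` trait in rs-sdk is shaped differently; this is only a model of the classification the ban logic would consult:

```rust
// Hypothetical three-way retry classification.
#[derive(Debug, PartialEq, Clone, Copy)]
enum RetryClass {
    NoRetry,                // hard failure, give up
    Retry,                  // today's path: retry, ban the address on failure
    RetryPreservingAddress, // proposed: retry elsewhere, keep this node unbanned
}

// What a future update_address_ban_status would consult: only plain Retry
// failures count as bad-node signals.
fn ban_on_failure(class: RetryClass) -> bool {
    class == RetryClass::Retry
}

fn main() {
    assert!(ban_on_failure(RetryClass::Retry));
    assert!(!ban_on_failure(RetryClass::RetryPreservingAddress));
    assert!(!ban_on_failure(RetryClass::NoRetry));
    println!("ok");
}
```

With this split, the quorum-cache-not-ready path maps to `RetryPreservingAddress` and the ban wave cannot start, regardless of any DET-side gating.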
## Related

- PR #830 — surfaces the bug most visibly via the mid-session import restart, but is not the cause.
## Relevant paths

- `src/context_provider_spv.rs:89-108`
- `src/spv/manager.rs:987-1046` — `get_quorum_public_key`; returns an error when the quorum is not cached
- `src/context/connection_status.rs:279-330` — `overall_state()`; the readiness signal
- `src/mcp/resolve.rs:107-137`
- `src/backend_task/mod.rs:409-507`
- `src/backend_task/error.rs` — add the `TaskError::SpvNotReady` variant here
- `src/sdk_wrapper.rs:14-20` — `ban_failed_address: true` SDK configuration

🤖 Co-authored by Claudius the Magnificent AI Agent