Summary
After #110 closed the stale-UTxO window post-reconnect, Antithesis still surfaces both tx_generator_refill_submit_rejected and tx_generator_transact_submit_rejected Always-assertions. The relay's rejection reason — captured verbatim from the report — is not a stale-input rejection; it's a duplicate-submit-after-reconnect:
ConwayMempoolFailure "All inputs are spent. Transaction has probably already been included"
Source: https://cardano.antithesis.com/report/tilehuSggX4cnuy5qyXfwpqI/2ZUJSYUipLqm3Dlbo9R3rjrS7i7dYJ_mc8FwikFqYLg.html — example vtime=182.072s, id=tx_generator_refill_submit_rejected.
Root cause
The race, mapped to the daemon's submit path in lib/Cardano/Node/Client/TxGenerator/Daemon.hs:
| Step | Code | What happens |
| --- | --- | --- |
| 1 | queryUTxOs provider faucetAddr (refill) / queryUTxOs provider srcAddr (transact) | LSQ returns UTxO X: current under the indexer's view, also unspent on the relay's chain at this moment |
| 2 | refillTx / transactTx | Daemon builds Tx1 with X as input |
| 3 | submit submitter signed (i.e. submitTxN2C via the LTxS channel) | Wire write to relay succeeds; relay accepts Tx1 into its mempool |
| 4 | Bearer dies before MsgAcceptTx round-trips back | BlockedIndefinitelyOnSTM → caught in LocalTxSubmission.submitTxN2C, re-raised as ConnectionLost |
| 5 | Daemon arm's E.handle ConnectionLost returns RefillFail/TransactFail IndexNotReady (today) | Composer treats the tick as not-applicable, retries on the next tick |
| 6 | Supervisor reconnects, rsIndexFresh clears, then flips true on the next chain-sync block | Tx1 may or may not have been included in the new chain head by then |
| 7 | Composer fires another refill/transact | Daemon arm runs queryUTxOs again, gets X back (the indexer's UTxO view either hasn't observed Tx1's effect yet, or rolled back through it) |
| 8 | Daemon builds Tx2 with the same input X | Different TxId from Tx1 because of seed/randomness, but same input |
| 9 | Submit Tx2 | Relay's chain has Tx1 included → X is spent → ConwayMempoolFailure "All inputs are spent..." |
The freshness gate (#109/#110) helps at step 7 only when the indexer can re-sync within one block of the prior submission landing. Under aggressive fault injection (4093 disconnect/reconnect cycles in 1h, per the 685fa5e run), bursts of reconnects within a single block window mean the gate alone is not sufficient.
This is the standard on-chain-tx-submission idempotency problem: from the daemon's local perspective, a submit that elicited ConnectionLost is indeterminate — it might have landed, or it might not.
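The race can be reproduced in a few lines with a toy model. This is only an illustrative sketch: TxIn, Tx, SubmitResult, relaySubmit and buildFrom are hypothetical stand-ins, not the daemon's or the ledger's real types, and the relay is reduced to a set of unspent inputs.

```haskell
import Data.Set (Set)
import qualified Data.Set as Set

-- Toy stand-ins for the ledger types (assumptions for illustration only).
newtype TxIn = TxIn String deriving (Eq, Ord, Show)

newtype Tx = Tx { txInputs :: Set TxIn } deriving (Eq, Show)

data SubmitResult = Accepted | RejectedInputsSpent deriving (Eq, Show)

-- | Relay-side model: accept a tx only if all of its inputs are unspent at
-- the tip, and consume them on acceptance.
relaySubmit :: Set TxIn -> Tx -> (SubmitResult, Set TxIn)
relaySubmit utxo tx
  | txInputs tx `Set.isSubsetOf` utxo =
      (Accepted, utxo `Set.difference` txInputs tx)
  | otherwise = (RejectedInputsSpent, utxo)

-- | Daemon-side model: build a tx from whatever the indexer reports.
-- (In the real daemon Tx1 and Tx2 differ by TxId; here only inputs matter.)
buildFrom :: Set TxIn -> Tx
buildFrom indexerView = Tx indexerView

-- | Replay of the race: the indexer view stays stale across the reconnect,
-- so the second build reuses X after the relay has already spent it.
raceOutcome :: (SubmitResult, SubmitResult)
raceOutcome =
  let x = TxIn "X"
      relay0 = Set.singleton x
      staleView = Set.singleton x
      -- Tx1 lands on the relay, but the acknowledgement is lost.
      (r1, relay1) = relaySubmit relay0 (buildFrom staleView)
      -- Tx2 is built from the same stale view and reuses X.
      (r2, _) = relaySubmit relay1 (buildFrom staleView)
  in (r1, r2)
```

In this model the first submit is accepted and the second is rejected, even though the daemon never observed the first result. That mirrors the indeterminate outcome described above.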
Recommended fix (single, minimal-scope path)
Pre-submit chain-tip query: before calling submit submitter signed in both runRefillArm and runTransactArm, verify that the chosen input(s) are still unspent against the relay's current chain tip via LSQ.
Why this is the right first step
- Closes the dominant window: between the prior submit's landing and the next arm's tx-build, the daemon now sees a freshly-landed Tx1's effect on chain and refuses to spend X again.
- Cheap: one extra GetUTxOByTxIn LSQ round-trip per submit attempt. LSQ is already in our N2C plumbing.
- Fail-safe: if the input isn't on the relay's current view, treat it as IndexNotReady and let the composer retry, the same wire-stable response we already use.
- Doesn't require remembering in-flight txs across reconnects, which would otherwise need new persistence.
- Bisect-safe: pure addition; no existing callers change behavior under the happy path.
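The shape of the proposed guard can be sketched as a small combinator. This is a sketch under assumptions, not the daemon's code: guardedSubmit, SubmitOutcome and the toy TxIn are hypothetical names, and the check and submit actions are passed in as parameters so the sketch stays independent of the real Provider and submitter types.

```haskell
import Data.Set (Set)
import qualified Data.Set as Set

-- Toy stand-in for the ledger's TxIn (assumption for illustration only).
newtype TxIn = TxIn String deriving (Eq, Ord, Show)

-- Hypothetical outcome type; the real arms return RefillFail/TransactFail.
data SubmitOutcome = Submitted | IndexNotReady deriving (Eq, Show)

-- | Run the submit action only if the pre-submit check reports that every
-- chosen input is still unspent; otherwise skip the submit entirely.
guardedSubmit
  :: Monad m
  => (Set TxIn -> m Bool) -- ^ pre-submit check (e.g. an LSQ-backed query)
  -> m ()                 -- ^ the existing submit action
  -> Set TxIn             -- ^ inputs chosen for this transaction
  -> m SubmitOutcome
guardedSubmit check submit inputs = do
  stillUnspent <- check inputs
  if stillUnspent
    then submit >> pure Submitted
    else pure IndexNotReady

-- | Example input set: a single faucet input, as on the refill path.
exampleInputs :: Set TxIn
exampleInputs = Set.singleton (TxIn "X")
```

When the check fails, the submit action is never run, so the composer sees the same IndexNotReady response it already handles and retries on a later tick.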
Specific changes
- New helper in lib/Cardano/Node/Client/TxGenerator/Selection.hs (or a dedicated Submit.hs if it grows):

      -- | Verify each input is still unspent at the relay's current
      -- volatile tip. Single LSQ round-trip via @GetUTxOByTxIn@.
      verifyInputsUnspent ::
          Provider IO ->
          Set TxIn ->
          IO Bool

  Returns False if any input is missing from the tip's UTxO set.
- Wire into lib/Cardano/Node/Client/TxGenerator/Daemon.hs in two sites:
  - buildSignSubmit (refill path, around the existing submit submitter signed call): guards the single faucet input.
  - The transact path's submit site: guards the K source inputs.

  On verifyInputsUnspent → False, return RefillFail/TransactFail IndexNotReady without incrementing the next-HD-index. Same retry semantics as the existing IndexNotReady paths.
- Tests:
  - test/Cardano/Node/Client/E2E/TxGeneratorSubmitIdempotenceSpec.hs: boot devnet via withRestartableCardanoNode, drive a refill, restart the relay, drive a second refill that would otherwise re-submit the same input, assert no ApplyTxErr carrying "already been included", and assert the daemon process stays alive.
  - Unit test in test/Cardano/Node/Client/TxGenerator/SelectionSpec.hs: pure check that verifyInputsUnspent returns False when the LSQ stub omits a queried input.
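One possible shape for the helper and its stubbed unit check is sketched below. It is written against a hypothetical StubProvider rather than the real Provider IO, whose fields are not shown in this issue; TxIn, TxOut, queryUTxOByTxIn, allInputsPresent and stubFrom are likewise illustrative names. The only dependency beyond base is the containers package.

```haskell
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import Data.Set (Set)
import qualified Data.Set as Set

-- Toy stand-ins for the ledger types (assumptions for illustration only).
newtype TxIn = TxIn String deriving (Eq, Ord, Show)
newtype TxOut = TxOut Integer deriving (Eq, Show)

-- | Hypothetical stand-in for the real provider: just the one query that
-- returns the UTxO entries found for the requested inputs.
newtype StubProvider = StubProvider
  { queryUTxOByTxIn :: Set TxIn -> IO (Map TxIn TxOut) }

-- | Pure core: every requested input must appear in the query result.
allInputsPresent :: Set TxIn -> Map TxIn TxOut -> Bool
allInputsPresent wanted found = wanted `Set.isSubsetOf` Map.keysSet found

-- | One query round-trip, then the pure check. Returns False if any
-- requested input is missing from the returned UTxO entries.
verifyInputsUnspent :: StubProvider -> Set TxIn -> IO Bool
verifyInputsUnspent provider inputs =
  allInputsPresent inputs <$> queryUTxOByTxIn provider inputs

-- | Stub that answers from a fixed UTxO map, the way an LSQ stub in a
-- unit test might.
stubFrom :: Map TxIn TxOut -> StubProvider
stubFrom utxo = StubProvider $ \ins -> pure (Map.restrictKeys utxo ins)
```

Keeping the subset check pure (allInputsPresent) is one way to make the SelectionSpec case testable without any node: a stub that omits a queried input should drive verifyInputsUnspent to False.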
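Out of scope for this issue
- tx_generator_*_submit_rejected (still tracked at "fix(composer): tx_generator_*_submit_rejected too strict during fault-injection reconnect window", cardano-foundation/cardano-node-antithesis#107).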
Acceptance
A single Antithesis 1h cardano_node_tx_generator run on the downstream bump PR (cardano-foundation/cardano-node-antithesis#98), against a pin that includes this fix, shows:
- 0 tx_generator_refill_submit_rejected Always-assertion failures.
- 0 tx_generator_transact_submit_rejected Always-assertion failures.
- The supervisor still triggers ≥3000 Disconnected/Reconnecting events (i.e. fault injection wasn't softened; we eliminated the false positive at the daemon side, not by reducing chaos).
Plus the new E2E spec passes locally:
nix develop -c cabal test e2e-tests \
--test-options='--match "tx-generator submit idempotence"'
Related