@zekun000 commented Jan 30, 2026

Summary

  • Add max_connections_per_peer config parameter to allow multiple TCP connections between the same two peers for increased throughput and reduced head-of-line blocking
  • Add enable_active_multi_connection_dialing config parameter for safe rollout:
    • When false (default): node accepts incoming connections up to max_connections_per_peer but does not actively dial for additional connections
    • When true: connectivity manager actively dials until max_connections_per_peer is reached
  • PeerManager now tracks connections by ConnectionId with round-robin selection for load balancing across connections
  • When a peer is already at max_connections_per_peer, tie-breaking favors inbound connections (the outbound connection is disconnected)
  • NewPeer is sent only for the first connection to a peer and LostPeer only for the last, to maintain application compatibility
  • Default values: max_connections_per_peer=1, enable_active_multi_connection_dialing=false for backward compatibility
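
To illustrate the connection tracking described above, here is a minimal sketch, not the code in this PR. `PeerConnections`, `Origin`, `Notification`, the `ConnectionId(u32)` stand-in, and all method names are hypothetical. The eviction rule in `outbound_to_drop` is only one possible reading of "favors inbound connections (disconnect outbound)".

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the real connection identifier type.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct ConnectionId(u32);

/// Which side opened the connection (assumed representation).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Origin {
    Inbound,
    Outbound,
}

/// Application-facing events (assumed representation).
#[derive(Debug, PartialEq, Eq)]
enum Notification {
    NewPeer,
    LostPeer,
}

/// All live connections to one remote peer, keyed by connection id.
#[derive(Default)]
struct PeerConnections {
    conns: BTreeMap<ConnectionId, Origin>,
    next: usize, // round-robin cursor
}

impl PeerConnections {
    /// Registers a connection; yields NewPeer only if it is the first one.
    fn add(&mut self, id: ConnectionId, origin: Origin) -> Option<Notification> {
        let was_empty = self.conns.is_empty();
        self.conns.insert(id, origin);
        was_empty.then_some(Notification::NewPeer)
    }

    /// Drops a connection; yields LostPeer only if it was the last one.
    fn remove(&mut self, id: ConnectionId) -> Option<Notification> {
        let removed = self.conns.remove(&id).is_some();
        (removed && self.conns.is_empty()).then_some(Notification::LostPeer)
    }

    /// Picks the next connection in round-robin order (by ascending id).
    fn pick(&mut self) -> Option<ConnectionId> {
        if self.conns.is_empty() {
            return None;
        }
        let idx = self.next % self.conns.len();
        self.next = self.next.wrapping_add(1);
        self.conns.keys().nth(idx).copied()
    }

    /// ASSUMPTION: when already at `max`, an existing outbound connection
    /// is the one to disconnect in favor of a new inbound connection.
    fn outbound_to_drop(&self, max: usize) -> Option<ConnectionId> {
        if self.conns.len() < max {
            return None; // below the limit, nothing needs to be dropped
        }
        self.conns
            .iter()
            .find(|(_, origin)| **origin == Origin::Outbound)
            .map(|(id, _)| *id)
    }
}
```

With two connections registered, successive `pick` calls alternate between them, and the application sees exactly one NewPeer and one LostPeer however many connections come and go in between.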

Safe Rollout Strategy

  1. First phase: Set max_connections_per_peer > 1 with enable_active_multi_connection_dialing=false
    • Nodes will passively accept multiple connections from peers that actively dial
    • Validates that PeerManager correctly handles multiple connections
  2. Second phase: Enable enable_active_multi_connection_dialing=true
    • Connectivity manager will actively establish multiple connections per peer
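
The two phases can be sketched as a pair of decisions driven by the two config fields. This is a sketch under stated assumptions: the field names and defaults come from the summary above, but the `MultiConnConfig` struct and its helper methods are hypothetical and do not reflect the actual config type. Tie-breaking at the limit is deliberately left out.

```rust
/// Hypothetical container for the two parameters named in this PR.
#[derive(Clone, Debug)]
struct MultiConnConfig {
    max_connections_per_peer: usize,
    enable_active_multi_connection_dialing: bool,
}

impl Default for MultiConnConfig {
    /// Backward-compatible defaults as stated in the PR summary.
    fn default() -> Self {
        Self {
            max_connections_per_peer: 1,
            enable_active_multi_connection_dialing: false,
        }
    }
}

impl MultiConnConfig {
    /// Number of connections the node tries to reach by dialing.
    /// With active dialing off, it only ever dials for the first one.
    fn dial_target(&self) -> usize {
        if self.enable_active_multi_connection_dialing {
            self.max_connections_per_peer.max(1)
        } else {
            1
        }
    }

    /// Should the connectivity manager dial again, given `current` connections?
    fn should_dial(&self, current: usize) -> bool {
        current < self.dial_target()
    }

    /// Is there room for one more inbound connection from this peer?
    /// (Ignores tie-breaking, which applies once the limit is reached.)
    fn has_inbound_capacity(&self, current: usize) -> bool {
        current < self.max_connections_per_peer.max(1)
    }
}
```

In phase one (for example `max_connections_per_peer = 4` with dialing disabled), `should_dial(1)` is false while `has_inbound_capacity(1)` is true, so the node only grows its connection count when a peer dials in. Phase two flips the flag, and `should_dial` stays true until four connections exist.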

Test plan

  • Added E2E smoke test test_multi_connection_per_peer_validators - 2 validators with active dialing enabled
  • Added E2E smoke test test_multi_connection_allows_simultaneous_dial - 4 validators with active dialing enabled
  • Added E2E smoke test test_default_single_connection_per_peer - verifies default behavior unchanged
  • Manual testing on devnet/testnet

Generated with Claude Code

Add ability to maintain multiple TCP connections between the same two peers
for increased throughput and reduced head-of-line blocking.

Key changes:
- Add max_connections_per_peer config parameter (default: 1)
- PeerManager now tracks connections by ConnectionId with round-robin selection
- ConnectivityManager actively dials until max_connections_per_peer is reached
- Tie-breaking logic when at max connections favors inbound connections
- NewPeer/LostPeer notifications only sent for first/last connection
- E2E smoke tests added for multi-connection scenarios

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@zekun000 zekun000 added the CICD:run-forge-e2e-perf Run the e2e perf forge only label Jan 30, 2026
Zekun Li and others added 2 commits January 29, 2026 21:15
…rollout

Add a separate config flag to control active dialing for additional connections:
- enable_active_multi_connection_dialing (default: false)
- When false: only accepts incoming connections, doesn't actively dial for more
- When true: actively dials until max_connections_per_peer is reached

This allows safe rollout by first enabling passive multi-connection acceptance,
then later enabling active dialing once stability is confirmed.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
…sting

- Change enable_active_multi_connection_dialing default to true for CI testing
- Fix ConnectivityManager::new calls in test.rs to add missing parameters

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@github-actions

✅ Forge suite realistic_env_max_load success on c5483dab7f854496847f4573b633f8458cc3060c

two traffics test: inner traffic : committed: 13536.82 txn/s, submitted: 13536.93 txn/s, expired: 0.11 txn/s, latency: 2784.28 ms, (p50: 2700 ms, p70: 2900, p90: 3000 ms, p99: 3600 ms), latency samples: 5030460
two traffics test : committed: 100.02 txn/s, latency: 757.53 ms, (p50: 700 ms, p70: 800, p90: 900 ms, p99: 1500 ms), latency samples: 1720
Latency breakdown for phase 0: ["MempoolToBlockCreation: max: 2.292, avg: 2.182", "ConsensusProposalToOrdered: max: 0.169, avg: 0.166", "ConsensusOrderedToCommit: max: 0.046, avg: 0.043", "ConsensusProposalToCommit: max: 0.214, avg: 0.209"]
Max non-epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.51s no progress at version 36812 (avg 0.07s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.30s no progress at version 2461454 (avg 0.30s) [limit 16].
Test Ok
