Feat: multi-transport support with dual P2P and RDMA && enhance bnxt provider #160

Open
jhchouuu wants to merge 6 commits into main from jiahzhou/shmem_multi_transport_support

jhchouuu commented Feb 4, 2026

Motivation

Add multi-transport support with dual P2P and RDMA data paths, and enhance the BNXT provider.

Technical Details

1. Dual Data Path Architecture (P2P + RDMA)

Introduced p2pPeerPtrs alongside existing peerPtrs to maintain separate data paths:

  • p2pPeerPtrs: Always established for same-node peers

    • Self: Contains localPtr
    • Same-node peers: P2P accessible pointers
    • Different-node peers: nullptr
    • Independent of transport type selection
  • peerPtrs: Transport-specific addressing

    • RDMA transport: Remote virtual addresses
    • P2P/SDMA transport: P2P pointers
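
The rules above can be modelled on the host as follows. This is a minimal sketch, not the code from this PR: `TransportType`, `PeerInfo`, `PeerPtrTables`, and `BuildPeerPtrTables` are hypothetical names introduced for illustration, and `0` stands in for `nullptr`. It also assumes that, for P2P/SDMA transports, `peerPtrs` simply mirrors the `p2pPeerPtrs` entry.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical transport kinds; names are illustrative, not taken from the PR.
enum class TransportType { P2P, SDMA, RDMA };

// Hypothetical per-peer description used only by this sketch.
struct PeerInfo {
  bool sameNode;              // true if the peer is on the local node
  TransportType transport;    // transport selected for this peer
  std::uintptr_t p2pAddr;     // P2P-mapped address (meaningful only if sameNode)
  std::uintptr_t remoteAddr;  // remote virtual address (used for RDMA)
};

struct PeerPtrTables {
  std::vector<std::uintptr_t> p2pPeerPtrs;  // 0 means "no P2P path"
  std::vector<std::uintptr_t> peerPtrs;     // transport-specific address
};

PeerPtrTables BuildPeerPtrTables(const std::vector<PeerInfo>& peers, int myPe,
                                 std::uintptr_t localPtr) {
  assert(myPe >= 0 && static_cast<std::size_t>(myPe) < peers.size());
  PeerPtrTables t;
  t.p2pPeerPtrs.assign(peers.size(), 0);
  t.peerPtrs.assign(peers.size(), 0);

  for (std::size_t pe = 0; pe < peers.size(); ++pe) {
    const PeerInfo& p = peers[pe];

    // P2P path: filled in regardless of the selected transport.
    if (static_cast<int>(pe) == myPe) {
      t.p2pPeerPtrs[pe] = localPtr;   // self
    } else if (p.sameNode) {
      t.p2pPeerPtrs[pe] = p.p2pAddr;  // same-node peer
    }                                 // different-node peer stays 0

    // Transport-specific path (assumption: P2P/SDMA reuse the P2P pointer).
    t.peerPtrs[pe] = (p.transport == TransportType::RDMA) ? p.remoteAddr
                                                          : t.p2pPeerPtrs[pe];
  }
  return t;
}
```

Note that a same-node peer reached over RDMA ends up with both a remote virtual address in `peerPtrs` and a usable P2P pointer in `p2pPeerPtrs`, which is the case the dual data path is meant to cover.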

2. More SHMEM Device P2P Pointer APIs

uint64_t ShmemPtrP2p(const uint64_t destPtr, int destPe, const int myPe);
uint64_t ShmemPtrP2p(const uint64_t destPtr, int destPe);
uint64_t ShmemPtrP2p(const SymmMemObjPtr& memObjPtr, size_t offset, int destPe);
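
The PR does not show the bodies of these functions. As a rough host-side model of what the `SymmMemObjPtr` overload might compute, the sketch below assumes that the result is the destination PE's P2P base plus `offset`, and `0` when the PE is not P2P reachable. `SymmMemObjModel` and `ResolveP2pAddr` are hypothetical names, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a symmetric memory object; only the field
// needed by this sketch is modelled.
struct SymmMemObjModel {
  std::vector<std::uint64_t> p2pPeerPtrs;  // 0 means "not P2P reachable"
};

// Assumed behaviour: base + offset for P2P-reachable PEs, 0 otherwise.
std::uint64_t ResolveP2pAddr(const SymmMemObjModel& obj, std::size_t offset,
                             int destPe) {
  assert(destPe >= 0 &&
         static_cast<std::size_t>(destPe) < obj.p2pPeerPtrs.size());
  const std::uint64_t base = obj.p2pPeerPtrs[destPe];
  return base == 0 ? 0 : base + offset;
}
```

Under this assumption, a caller would check for a zero result before dereferencing and fall back to the transport-specific path when no P2P address is available.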

3. BNXT RDMA Provider Fixes

  • Added the MORI_BNXT_ENABLE_UDP_SPORT environment variable for UDP source port (sport) configuration
  • Fixed cleanup bugs that caused double registration/unregistration of the shared UAR when using the default DB region
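
The accepted values of MORI_BNXT_ENABLE_UDP_SPORT are not stated in this PR. The sketch below shows one common way such a switch could be read, assuming that an unset, empty, or `0` value disables the feature and any other value enables it. `ParseEnableFlag` and `BnxtUdpSportEnabled` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Assumed parsing rule (not confirmed by the PR):
// unset, empty, or "0" disables; any other value enables.
bool ParseEnableFlag(const char* value) {
  if (value == nullptr || value[0] == '\0') return false;
  return std::strcmp(value, "0") != 0;
}

// Reads the variable named in the PR from the process environment.
bool BnxtUdpSportEnabled() {
  return ParseEnableFlag(std::getenv("MORI_BNXT_ENABLE_UDP_SPORT"));
}
```

With this rule, launching a job with `MORI_BNXT_ENABLE_UDP_SPORT=1` would turn the option on.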

Test Result

  • Unit test test_p2p_direct_access passed on the Thor2 and AINIC platforms for the 3 shmem modes
  • EP16 benchmark passed on the AINIC platform for the 3 shmem modes

- Add p2pPeerPtrs field to preserve P2P pointers for same-node peers
- Add Context::CanUseP2P() to identify P2P capability independent of transport type
- Update all memory modes (isolation, static heap, VMM) to initialize both pointer arrays