Handling boundary components #25

@christian-pinto

@rherrell @cayton
There are a number of cases where components are at the boundaries of multiple fabrics, or between switches within the same fabric. These components are what we refer to as "Boundary Components".

Take the case of a ComputerSystem (a server) connected to a CXL fabric via a switch, as in the picture below. The CXL switch is under the control of a dedicated Sunfish Hardware Agent, which we will call the CXL Agent for the sake of this example. The host is managed by its own BMC and/or another Sunfish Hardware Agent, which we will call the BMC for the sake of this example.

[Image: ComputerSystem connected to a CXL fabric via a switch]

At system startup, the CXL Agent will advertise its own fabric, including the one switch and the downstream devices. For the upstream ports, the agent can assume very little about what is or will be connected to them, because: 1) there might not be an entity connected to a specific port; 2) an entity is connected but powered off, so no link is detected; 3) the CXL Agent might not be able to identify the object on the other side of the link (i.e. its port).
Similarly for the ComputerSystem, the BMC is not aware of what the host port is connected to, because: 1) a single host might not have full visibility of the entire fabric it is connected to; 2) the host might initially be powered off, so no link is detected on the port.

The only initial assumption that can be made is that the port is the component that sits at the boundary of a physical connection, and is therefore the one to use for resolving the physical connection of boundary components at runtime.

One potential approach for resolving these conflicts is to follow the flow below:

  1. Both parties sharing a boundary component register with Sunfish and report their resources as usual.
  2. Ports that are not connected, or whose connection is not known when the agent registers, are marked as unresolved. This could be done by extending the Sunfish_RM field, which we use in the Oem property of each object to mark it with the agent it belongs to. We could add something like the snippet below.
"Sunfish_RM": {
  "@odata.type": "#SunfishExtensions.v1_0_0.ResourceExtensions",
  "Status": {
    "State": "unresolved"
  }
}
  3. Each agent populates the PortID field (see the Redfish schema guide) with a unique identifier that is fabric specific. Examples are the MAC address for Ethernet, CXL IDs for CXL fabrics, etc. The RemotePortID field is left empty.
  4. When Sunfish scans resources from a system or agent, it caches the unresolved resources in a dedicated data structure for later processing.
  5. Whenever the state of a port changes, i.e., both parties have booted and the host can "read" the unique ID of the port at the other side of the link (e.g., the switch port), the agent updates its own version of the port object by populating RemotePortID with the newly discovered port ID. The agent then sends an event to Sunfish to signal the updated object.
  6. At every event update, Sunfish checks whether the object at hand is one of the unresolved ones. This can be done by using the RemotePortID in the updated port to index the data structure holding all unresolved items. If there is a hit, Sunfish updates both ports with the complete information and marks the objects as resolved.
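
To make the flow above concrete, here is a rough Python sketch of the Sunfish-side bookkeeping. It is only an illustration under several assumptions, not how Sunfish is implemented: `make_port`, `UnresolvedPortCache`, the `@odata.id` paths and the identifier strings are all hypothetical, and the property names use the `PortId` / `RemotePortId` spelling that I believe the Redfish Port schema uses.

```python
from typing import Dict, Optional, Tuple

SUNFISH_TYPE = "#SunfishExtensions.v1_0_0.ResourceExtensions"


def make_port(odata_id: str, port_id: str) -> dict:
    """Port object as an agent might report it at registration (hypothetical helper)."""
    return {
        "@odata.id": odata_id,       # illustrative path, not taken from a real tree
        "PortId": port_id,           # fabric-specific unique identifier
        "RemotePortId": None,        # left empty until the link partner is known
        "Oem": {
            "Sunfish_RM": {
                "@odata.type": SUNFISH_TYPE,
                "Status": {"State": "unresolved"},
            }
        },
    }


def _status(port: dict) -> dict:
    return port["Oem"]["Sunfish_RM"]["Status"]


class UnresolvedPortCache:
    """Hypothetical cache of unresolved ports, indexed by their own PortId."""

    def __init__(self) -> None:
        self._by_port_id: Dict[str, dict] = {}

    def __len__(self) -> int:
        return len(self._by_port_id)

    def register(self, port: dict) -> None:
        # Cache only ports that are still marked as unresolved.
        if _status(port)["State"] == "unresolved":
            self._by_port_id[port["PortId"]] = port

    def on_port_event(self, updated: dict) -> Optional[Tuple[dict, dict]]:
        """Handle an update event; return the resolved pair on a hit, else None."""
        remote_id = updated.get("RemotePortId")
        if not remote_id:
            return None
        # Use the RemotePortId of the updated port to index the cache.
        peer = self._by_port_id.get(remote_id)
        if peer is None:
            # Peer not registered yet: keep the newest version for later.
            self._by_port_id[updated["PortId"]] = updated
            return None
        # Hit: complete both ends of the link and mark them as resolved.
        peer["RemotePortId"] = updated["PortId"]
        for port in (updated, peer):
            _status(port)["State"] = "resolved"
            self._by_port_id.pop(port["PortId"], None)
        return updated, peer


# Usage: both agents register, then the host side discovers the switch port.
cache = UnresolvedPortCache()
switch_port = make_port("/redfish/v1/Fabrics/CXL/Switches/1/Ports/1", "cxl-switch-port-1")
host_port = make_port("/redfish/v1/Systems/1/FabricAdapters/1/Ports/1", "cxl-host-port-0")
cache.register(switch_port)
cache.register(host_port)

host_port["RemotePortId"] = switch_port["PortId"]  # what the BMC would report
cache.on_port_event(host_port)
print(switch_port["RemotePortId"], _status(switch_port)["State"])
```

Indexing the cache by each port's own identifier keeps the lookup in step 6 a single dictionary access. The sketch assumes the identifiers are unique across all fabrics Sunfish manages, which would need to hold for the matching to be safe.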

One drawback I see with this approach is that, by using the PortID to carry the unique identifier, we lose the possibility of physically identifying the port, which is most probably the reason why the PortID field is there.

From the Redfish spec I read:

[Image: screenshot from the Redfish spec]
