Rearrange data in key SST data structures to reduce padding #1372

@stonea

Description

Please describe the new features with any relevant use cases.

I'm trying to scale SST to use massive numbers of components and links. At these scales memory usage becomes a concern, and this issue proposes one thing we could do to save memory.

Specifically, we could try rearranging data members to avoid extraneous padding in data structures. I wouldn't expect this to save a dramatic amount of memory, but it does seem like a relatively easy step to take.

Describe the proposed solution you plan to implement

I propose manually reordering the data members in these classes (for example, from largest to smallest) and then remeasuring the amount of padding. If doing so makes an appreciable difference, I would merge these changes.
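To illustrate the kind of reordering proposed above, here is a minimal sketch. The two structs are hypothetical and not taken from SST; they only show how declaring members from largest to smallest can shrink a type without changing what it stores.

```cpp
#include <cstdint>

// Hypothetical struct (not an SST class): each narrow member is followed by a
// wider one, so the compiler must insert padding to align the wider member.
struct Unordered {
    bool          flag_a; // 1 byte, then padding up to the alignment of id
    std::uint64_t id;     // 8 bytes
    bool          flag_b; // 1 byte, then padding up to the alignment of count
    std::uint32_t count;  // 4 bytes
};

// The same members, declared from largest to smallest.
struct Ordered {
    std::uint64_t id;
    std::uint32_t count;
    bool          flag_a;
    bool          flag_b;
};

// Reordering should never make the struct larger. On common 64-bit ABIs
// (for example x86-64 Linux) the sizes are typically 24 and 16 bytes.
static_assert(sizeof(Ordered) <= sizeof(Unordered),
              "largest-to-smallest ordering should not increase the size");
```

The exact sizes depend on the platform ABI, which is why the sketch asserts only the relative ordering rather than specific byte counts.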

Testing plan

Adding robust testing for this enhancement may be somewhat difficult.

We could add some mechanism to have SST report the sizeof of these classes, and then hard-code the expected sizes for the particular platform(s) SST uses for AutoTesting. However, this would specialize the test to those platforms, which I'm guessing would not be desirable.

Such a test would also have to be updated any time a new field is added, although that may be desirable: it would remind the developer to consider how the placement of the added data affects padding.
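One possible shape for such a check is a compile-time size budget that is only enforced on the platform where the expected size was measured. This is a sketch under assumptions: `ExampleConfig` and `fits_in_budget` are hypothetical names that do not exist in SST, and the 16-byte budget is specific to this made-up struct.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an SST class; the real classes have other members.
struct ExampleConfig {
    std::uint64_t id;
    float         weight;
    std::uint16_t next_sub_id;
    bool          visited;
};

// Hypothetical helper: compilation fails if T is larger than MaxBytes.
template <typename T, std::size_t MaxBytes>
constexpr bool fits_in_budget()
{
    static_assert(sizeof(T) <= MaxBytes,
                  "type exceeds its size budget; check member order and padding");
    return true;
}

// Enforce the budget only on the ABI it was measured on, so the check does
// not break builds on platforms with different sizes or alignments.
#if defined(__x86_64__) && defined(__linux__)
static_assert(fits_in_budget<ExampleConfig, 16>(), "ExampleConfig size budget");
#endif
```

Using an upper bound instead of an exact size means the check only fires when a class grows, which is the case where a developer would want to reconsider member placement.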

I'd like more feedback about what testing should be added for this enhancement, if any.

Additional context

To give a sense of how much padding there is:

On my system, with SST built using Cray Clang 19.0, I estimate the following padding in these data structures:

Class             sizeof(Class)   sum field sizes     padding (difference)
--------------------------------------------------------------------------
Params            112             105                 7
ConfigStatistic   168             146                 15
ConfigComponent   592             538                 25
ConfigLink        216             209                 7
ComponentInfo     280             262                 18
Link              64              64                  0

I computed this by summing the sizeof of each individual member of a class (sum of field sizes) and comparing that against the reported sizeof of the object itself.
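The measurement described above can be sketched at compile time as follows. The names `Sample`, `SumOfSizes`, and `padding_of` are hypothetical and only illustrate the arithmetic; the member types have to be listed by hand because C++ has no built-in way to enumerate the fields of a class.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical example type (not an SST class).
struct Sample {
    bool          visited;
    std::uint64_t id;
    std::uint16_t next_id;
};

// Sum of sizeof over a list of member types (C++11-compatible recursion).
template <typename... Members>
struct SumOfSizes;

template <>
struct SumOfSizes<> {
    static constexpr std::size_t value = 0;
};

template <typename First, typename... Rest>
struct SumOfSizes<First, Rest...> {
    static constexpr std::size_t value = sizeof(First) + SumOfSizes<Rest...>::value;
};

// Padding estimate: sizeof the whole object minus the sum of its member sizes.
template <typename T, typename... Members>
constexpr std::size_t padding_of()
{
    return sizeof(T) - SumOfSizes<Members...>::value;
}

// The member types of Sample, listed manually in declaration order.
constexpr std::size_t sample_padding =
    padding_of<Sample, bool, std::uint64_t, std::uint16_t>();
```

Note that this estimate counts only padding between the direct members of a class. Padding inside a member that is itself a class is included in that member's sizeof, so it does not show up in the difference.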

As a quick experiment, I rearranged the data members in ConfigComponent as shown below. This reduced sizeof(ConfigComponent) from 592 bytes to 568 bytes, a savings of 24 bytes per ConfigComponent.

    ConfigStatistic                      allStatConfig;
    Params                               params;           /*!< Set of Parameters */
    RankInfo                             rank;             /*!< Parallel Rank for this component */
    std::map<std::string, StatisticId_t> enabledStatNames;
    std::vector<ConfigComponent*>        subComponents;    /*!< List of subcomponents */
    std::vector<LinkId_t>                links;            /*!< List of links connected */
    // std::vector<ConfigStatistic>  enabledStatistics; /*!< List of subcomponents */
    std::vector<double>                  coords;
    std::string                          name;             /*!< Name of this component, or slot name for subcomp */
    std::string                          type;             /*!< Type of this component */
    ComponentId_t                        id;               /*!< Unique ID of this component */
    ConfigGraph*                         graph;            /*!< Graph that this component belongs to */
    float                                weight;           /*!< Partitioning weight for this component */
    int                                  slot_num;         /*!< Slot number.  Only valid for subcomponents */
    uint16_t                             nextSubID;        /*!< Next subID to use for children, if component, if subcomponent, subid of parent */
    uint16_t                             nextStatID;       /*!< Next statID to use for children */
    uint8_t                              statLoadLevel;    /*!< Statistic load level for this component */
    bool                                 visited;          /*! Used when traversing graph to indicate component was visited already */
    bool                                 enabledAllStats;

An alternative we could consider: various compilers have directives to indicate that data should be packed so that there is no padding between members. For example, gcc has __attribute__((packed)). This may save some additional memory, but at the cost of performance. Furthermore, these attributes are compiler-specific. We could add code to detect the compiler in use and adjust accordingly, but this adds complexity, and the enhancement would only apply to the compilers we add support for.
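For reference, the compiler-detection approach mentioned above might look roughly like the sketch below. The macro `EXAMPLE_PACKED` and both structs are hypothetical and not part of SST. Only the GCC/Clang attribute is handled here; on any other compiler the macro expands to nothing and the struct keeps its natural layout.

```cpp
#include <cstdint>

// Hypothetical macro: use the GCC/Clang attribute where available and fall
// back to the natural (padded) layout everywhere else.
#if defined(__GNUC__) || defined(__clang__)
#define EXAMPLE_PACKED __attribute__((packed))
#else
#define EXAMPLE_PACKED
#endif

// Natural layout: padding is inserted after level so that id is aligned.
struct Natural {
    std::uint8_t  level;
    std::uint64_t id;
};

// Packed layout: with GCC or Clang, id directly follows level, so accesses
// to id may be unaligned.
struct EXAMPLE_PACKED Packed {
    std::uint8_t  level;
    std::uint64_t id;
};

static_assert(sizeof(Packed) <= sizeof(Natural),
              "packing should not increase the size");
```

The unaligned member accesses in the packed layout are the source of the performance cost, and the fallback branch shows why the savings would only apply to compilers that are explicitly supported.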


I've heard that there have been some recent efforts that may impact which fields are in SST's core data structures. As such, I'll defer working on this until I hear that things have stabilized. In the meantime, I wanted to get this feature documented as an issue; I'd be happy to pick it up whenever the time is right.
