Skip to content

Latest commit

 

History

History
252 lines (188 loc) · 9.08 KB

File metadata and controls

252 lines (188 loc) · 9.08 KB

Preamble

SEP: 0054
Title: Ledger Metadata Storage
Author: Tamir Sen <@tamirms>
Status: Draft
Created: 2025-03-11
Updated: 2026-01-07
Version: 0.2.0
Discussion: https://github.com/orgs/stellar/discussions/1678

Simple Summary

A standard for how LedgerCloseMeta objects can be stored so that ledgers can be easily and efficiently ingested by downstream systems.

Dependencies

None.

Motivation

galexie is a service which publishes LedgerCloseMeta XDR objects to a GCS (Google Cloud Storage) bucket. However, the data format and layout of the XDR objects are not formally documented. This SEP aims to provide a comprehensive specification for storing LedgerCloseMeta objects, enabling third-party developers to build compatible data stores and clients for retrieving ledger metadata.

Specification

The data store is a key-value store where:

  • Keys are strings following a specific hierarchical format.
  • Values are binary blobs representing compressed LedgerCloseMetaBatch XDR values. The one exception is the value representing the JSON config which is described later on in more detail.

The key-value store must support:

  • Efficient random access lookups on arbitrary keys.
  • Listing keys in lexicographic order, optionally filtered by a prefix.

Examples of compatible key-value stores include Google Cloud Storage (GCS) and Amazon S3.


Value Format

Each value in the key-value store is the Zstandard compressed binary encoding of the following XDR structure:

// Batch of ledgers along with their transaction metadata
struct LedgerCloseMetaBatch
{
    // starting ledger sequence number in the batch
    uint32 startSequence;

    // ending ledger sequence number in the batch
    uint32 endSequence;

    // Ledger close meta for each ledger within the batch
    LedgerCloseMeta ledgerCloseMetas<>;
};
  • A LedgerCloseMetaBatch represents a contiguous range of one or more consecutive ledgers.

  • All batches in a data store instance contain the same number of ledgers.

  • Currently only Zstandard compression is supported but it is possible to extend the SEP in the future to allow other compression algorithms.


Key Format

Keys follow a hierarchical directory structure, effectively acting as file paths within the data store. All the ledgers are stored under a configurable /<ledgers-path>. Within the /<ledgers-path> directory there are subdirectories which represent partitions. Each partition contains a fixed number of batches:

/<ledgers-path>/<partition>/<batch>.xdr.zst

If the partition size is 1, the partition is omitted, resulting in:

/<ledgers-path>/<batch>.xdr.zst

Partition Format:

fmt.Sprintf("%08X--%d-%d/", math.MaxUint32-partitionStartLedgerSequence, partitionStartLedgerSequence, partitionEndLedgerSequence)

Example for partitionStartLedgerSequence=0 and partitionEndLedgerSequence=15: FFFFFFFF--0-15

Batch Format:

 fmt.Sprintf("%08X--%d-%d.xdr.zst", math.MaxUint32-batchStartLedgerSequence, batchStartLedgerSequence, batchEndLedgerSequence)

Example for batchStartLedgerSequence=2 and batchEndLedgerSequence=3: FFFFFFFD--2-3.xdr.zst

If the batch size is 1, the format simplifies to:

 fmt.Sprintf("%08X--%d.xdr.zst", math.MaxUint32-batchStartLedgerSequence, batchStartLedgerSequence)

Example for batchStartLedgerSequence=2: FFFFFFFD--2.xdr.zst

Note the .zst suffix is the filename extension defined in the Zstandard RFC. If this SEP is extended to support another compression algorithm then the standard filename extension for the given compression algorithm will be used as a suffix in the batch name.


Configuration File

The data store includes a configuration JSON object stored under the key /<ledgers-path>/.config.json. This file contains the following properties:

  • networkPassphrase - (string) the passphrase for the Stellar network associated with the ledgers.
  • version - (string) the version of the configuration file schema which aligns with this SEP.
  • compression - (string) the compression algorithm used to compress ledger objects (currently only zstd is supported).
  • ledgersPerBatch - (integer) the number of ledgers bundled into each LedgerCloseMetaBatch.
  • batchesPerPartition - (integer) the number of batches in a partition.

Example Configuration:

{
  "networkPassphrase": "Public Global Stellar Network ; September 2015",
  "version": "0.1.0",
  "compression": "zstd",
  "ledgersPerBatch": 2,
  "batchesPerPartition": 8
}

Example Key Structure

Below is an example list of keys (with /<ledgers-path> set to /stellar/pubnet) for ledger batches based on the above example configuration:

/stellar/pubnet/.config.json
/stellar/pubnet/FFFFFFEF--16-31/FFFFFFED--18-19.xdr.zst
/stellar/pubnet/FFFFFFEF--16-31/FFFFFFEF--16-17.xdr.zst
/stellar/pubnet/FFFFFFFF--0-15/FFFFFFF1--14-15.xdr.zst
/stellar/pubnet/FFFFFFFF--0-15/FFFFFFF3--12-13.xdr.zst
/stellar/pubnet/FFFFFFFF--0-15/FFFFFFF5--10-11.xdr.zst
/stellar/pubnet/FFFFFFFF--0-15/FFFFFFF7--8-9.xdr.zst
/stellar/pubnet/FFFFFFFF--0-15/FFFFFFF9--6-7.xdr.zst
/stellar/pubnet/FFFFFFFF--0-15/FFFFFFFB--4-5.xdr.zst
/stellar/pubnet/FFFFFFFF--0-15/FFFFFFFD--2-3.xdr.zst

Note: The genesis ledger starts at sequence number 2, so the oldest batch must have a batchStartLedgerSequence of 2.


Metadata

Every LedgerCloseMetaBatch value also has the following metadata associated with it:

  • start-ledger - (string) the starting ledger of the range of ledgers enclosed in the LedgerCloseMetaBatch value.
  • end-ledger - (string) the last ledger of the range of ledgers enclosed in the LedgerCloseMetaBatch value.
  • start-ledger-close-time - (string) the unix time stamp of the close time for the starting ledger.
  • end-ledger-close-time - (string) the unix time stamp of the close time for the last ledger.
  • protocol-version - (string) the protocol version of the last ledger.
  • core-version - (string) the version of stellar-core which produced the ledgers enclosed in the LedgerCloseMetaBatch value.
  • network-passphrase - (string) the network passphrase for the Stellar network which produced the ledgers.
  • compression-type - (string) the compression algorithm used to compress the LedgerCloseMetaBatch value (currently only zstd is supported).
  • version - (string) the version of galexie used to insert the LedgerCloseMetaBatch value in the data store.

These metadata key-value pairs can be implemented as user-defined object metadata in GCS or S3. Metadata is optional for data store implementations that do not support user-defined object metadata, such as file systems and other non-S3, non-GCS data stores.

Design Rationale

Key Encoding (Reversed Ledger Sequence)

  • Lexicographic Order: Many key-value stores (e.g., GCS, S3) optimize for listing keys in lexicographic order. By encoding the most recent ledgers first, clients can efficiently retrieve the latest data without scanning the entire dataset.
  • Reversed Sequence: Using math.MaxUint32 - startLedger ensures that newer ledgers (with higher sequence numbers) appear before older ones when sorted lexicographically. This avoids the need for additional metadata or indexes to determine the latest ledger.

Compression Algorithm

  • zstd was chosen after evaluating zstd, lz4, and gzip. It provides the best balance between compression ratio and decompression speed.

Security Concerns

Verifying the validity of the ledgers contained within the data store is outside the scope of this SEP. In otherwords, this SEP does not provide any mechanism for validating that the ledgers obtained from a data store have not been altered.

Changelog

  • v0.2.0: Make metadata optional for data stores that don't support it.
  • v0.1.0: Initial draft