Ledger Metadata Storage #1678
tamirms started this conversation in Stellar Ecosystem Proposals
@urvisavla pointed out the following differences between the spec and what we have implemented in galexie:
The reason I did not include the Regarding the root directory, I thought it would be useful to have a ledgers directory to separate the ledger keys from the config key. But I am open to feedback on both these points.
Discussion for #1677
Simple Summary
A standard for how LedgerCloseMeta objects can be stored so that ledgers can be easily and efficiently ingested by downstream systems.
Dependencies
None.
Motivation
galexie is a service which publishes LedgerCloseMeta XDR objects to a Google Cloud Storage (GCS) bucket. However, the data format and layout of the XDR objects are not formally documented. This
SEP aims to provide a comprehensive specification for storing LedgerCloseMeta objects, enabling third-party developers
to build compatible data stores and clients for retrieving ledger metadata.
Specification
The data store is a key-value store where the values are compressed LedgerCloseMetaBatch XDR values.
The key-value store must support:
Examples of compatible key-value stores include Google Cloud Storage (GCS) and Amazon S3.
Value Format
Each value in the key-value store is the Zstandard compressed binary encoding of
the following XDR structure:
A LedgerCloseMetaBatch represents a contiguous range of one or more consecutive ledgers.
All batches in a data store instance contain the same number of ledgers.
Currently only Zstandard compression is supported but it is possible to extend
the SEP in the future to allow other compression algorithms.
Key Format
Keys follow a hierarchical directory structure. The root directory is /ledgers, and subdirectories represent partitions. Each partition contains a fixed number of batches:
If the partition size is 1, the partition is omitted, resulting in:
Partition Format:
Batch Format:
If the batch size is 1, the format simplifies to:
Note the .zst suffix is the filename extension defined in the Zstandard RFC. If this SEP is extended to support another compression algorithm
then the standard filename extension for the given compression algorithm will be used as a suffix in the batch name.
Configuration File
The data store includes a configuration JSON object stored under the key /config.json. This file contains the following properties:
networkPassphrase - (string) the passphrase for the Stellar network associated with the ledgers.
compression - (string) the compression algorithm used to compress ledger objects (currently only zstd is supported).
ledgersPerBatch - (integer) the number of ledgers bundled into each LedgerCloseMetaBatch.
batchesPerPartition - (integer) the number of batches in a partition.
Example Configuration:
Example Key Structure
Below is an example list of keys for ledger batches based on the configuration above:
Note: The genesis ledger starts at sequence number 2, so the oldest batch must have a batchStartLedgerSequence of 2.
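As an illustration of how a client might consume the configuration properties listed under Configuration File, here is a minimal Go sketch. The `Config` type, the `parseConfig` helper, the validation rules, and the sample values are all hypothetical and are not taken from the proposal or from galexie; only the four property names come from the text above.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Config mirrors the four properties described for /config.json.
// The Go type itself is an assumption made for this sketch.
type Config struct {
	NetworkPassphrase   string `json:"networkPassphrase"`
	Compression         string `json:"compression"`
	LedgersPerBatch     uint32 `json:"ledgersPerBatch"`
	BatchesPerPartition uint32 `json:"batchesPerPartition"`
}

// parseConfig decodes a configuration object and applies a few sanity
// checks. The checks are illustrative assumptions, not rules from the SEP.
func parseConfig(raw []byte) (Config, error) {
	var cfg Config
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return Config{}, fmt.Errorf("decoding config: %w", err)
	}
	if cfg.Compression != "zstd" {
		return Config{}, fmt.Errorf("unsupported compression %q", cfg.Compression)
	}
	if cfg.LedgersPerBatch == 0 || cfg.BatchesPerPartition == 0 {
		return Config{}, errors.New("ledgersPerBatch and batchesPerPartition must be at least 1")
	}
	return cfg, nil
}

func main() {
	// Hypothetical sample values, chosen only for this example.
	raw := []byte(`{
		"networkPassphrase": "Test SDF Network ; September 2015",
		"compression": "zstd",
		"ledgersPerBatch": 64,
		"batchesPerPartition": 10
	}`)
	cfg, err := parseConfig(raw)
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	// A partition holds batchesPerPartition batches of ledgersPerBatch ledgers each.
	fmt.Printf("%d ledgers per partition\n", cfg.LedgersPerBatch*cfg.BatchesPerPartition)
}
```

Reading the configuration first lets a client learn the batch and partition sizes before it attempts to derive any ledger keys.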
Design Rationale
Key Encoding (Reversed Ledger Sequence)
By encoding the most recent ledgers first, clients can efficiently retrieve the latest data without scanning the entire
dataset.
Using math.MaxUint32 - startLedger ensures that newer ledgers (with higher sequence numbers) appear before older ones when sorted lexicographically. This avoids the need for additional metadata or indexes to
determine the latest ledger.
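The ordering property can be checked with a short Go sketch. It assumes the reversed value is rendered as eight upper-case hexadecimal digits; the rationale above only states the subtraction, so the fixed-width hex rendering, along with the helper names `reversedPrefix` and `newestFirst`, is an assumption of this example. A fixed width matters because lexicographic order only agrees with numeric order when all strings have the same length.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// reversedPrefix renders math.MaxUint32 - startLedger as 8 upper-case hex
// digits. The fixed-width hex format is an assumption made for this sketch.
func reversedPrefix(startLedger uint32) string {
	return fmt.Sprintf("%08X", math.MaxUint32-startLedger)
}

// newestFirst returns the reversed prefixes for the given start ledgers in
// plain lexicographic order, which is the order a key listing would use.
func newestFirst(starts []uint32) []string {
	prefixes := make([]string, 0, len(starts))
	for _, s := range starts {
		prefixes = append(prefixes, reversedPrefix(s))
	}
	sort.Strings(prefixes)
	return prefixes
}

func main() {
	// Hypothetical batch start sequences, oldest to newest.
	for _, p := range newestFirst([]uint32{2, 64, 128, 64000}) {
		fmt.Println(p)
	}
	// Output (newest batch first):
	// FFFF05FF
	// FFFFFF7F
	// FFFFFFBF
	// FFFFFFFD
}
```

Under these assumptions, the first key returned by a lexicographic listing corresponds to the batch with the highest start ledger, so a client can find the latest data with a single listing request.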
Compression Algorithm
zstd was chosen after evaluating zstd, lz4, and gzip. It provides the best balance between compression ratio and decompression speed.
Security Concerns
Verifying the validity of the ledgers contained within the data store is outside the scope of this SEP. In other words,
this SEP does not provide any mechanism for validating that the ledgers obtained from a data store have not been
altered.