Commit 5558f57

SIMD-0307: Add Block Header
1 parent 43fd004 commit 5558f57

1 file changed

+309
-0
lines changed

proposals/0307-add-block-header.md

Lines changed: 309 additions & 0 deletions
@@ -0,0 +1,309 @@
1+
---
2+
simd: '0307'
3+
title: Add Client Info to Block Header
4+
authors:
5+
- jherrera-jump (Firedancer)
6+
category: Standard
7+
type: Core
8+
status: Review
9+
created: 2025-06-17
10+
feature: <pubkey>
11+
development:
12+
- Anza - TBD
13+
---
14+
15+
## Summary
16+
17+
Add a block header to Solana blocks and expose header fields in the
18+
`getBlock` RPC endpoint.
19+
20+
## Motivation
21+
22+
For the purposes of historical monitoring, development, and auditing, it is
23+
important to know exactly who produced a block and when it was produced.
24+
Currently, this information can be partially inferred from Gossip and from vote
25+
timestamps. Unfortunately there are some problems with the current approach:
26+
- The information from gossip is ephemeral. Currently a peer needs to record
27+
and persist it. This may cause synchronization issues when matching client
28+
updates in gossip with the correct slot.
29+
- Gossip lacks important information that may be useful for monitoring (e.g.
30+
scheduler used, mods, configuration settings, etc).
31+
- Vote timestamps have a granularity of 1 second, so they cannot be used to
32+
estimate block duration.
33+
- Vote timestamps will be removed with Alpenglow.
34+
35+
This SIMD solves these issues by including relevant information in a static
36+
block header.
37+
38+
## New Terminology
39+
40+
No new terms, but the following definitions are given for clarity:
41+
42+
- Client - The software run by leaders to interface with a Solana cluster
43+
(e.g. `agave` or `frankendancer`).
44+
- Block Producer - The client that produced a given block.
45+
- Scheduler - The system responsible for processing incoming transactions and
46+
ordering them for block construction.
47+
- Forward Error Correction set (FEC set) - A collection of shreds. At a high
48+
level, this is a construct that leverages Reed-Solomon encoding to overcome
49+
the problem of data loss from packet drops.
50+
- Shred - A fixed-size chunk of encoded raw block data.
51+
- Entry Batch - An array of entries.
52+
- Entry - An array of transactions.
53+
54+
## Detailed Design
55+
56+
### Data Layout
57+
58+
Solana blocks are organized in abstraction layers not entirely unlike the
59+
arrangement of a typical network packet (e.g. MAC -> IP -> TCP -> HTTP). At the
60+
highest layer a block consists of some number (~100+) of FEC sets. A single FEC
61+
set contains a handful of shreds (~32). Once sufficient shreds are available
62+
the raw block data is reconstructed and reinterpreted as an array of entry
63+
batches. Entry batches do not cross shred boundaries.
64+
65+
This SIMD adds the following header at the beginning of the raw block data. This
66+
puts it on the same abstraction layer as serialized entry batch data. Put
67+
differently, the serialized header will be prepended to the first serialized
68+
entry batch in the block.
69+
70+
```
71+
< -- 64 bits -->
72+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
73+
| block_header_flag |
74+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
75+
| version |
76+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
77+
| header_length |
78+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
79+
| block_producer_time_nanos |
80+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
81+
| block_user_agent |
82+
| |
83+
⋮ +30 ⋮
84+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
85+
| ... future fields ... |
86+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
87+
```
88+
89+
- `block_header_flag: u64` will always be zero. The first 8 bytes of an entry
90+
batch are always a positive number (the number of entries in the batch), so
91+
this flag allows parsers to differentiate between a normal entry batch and one
92+
with a header prepended. Though not strictly necessary, this may facilitate
93+
parsing block data, and allows us to make the header optional if we ever need
94+
to.
95+
96+
- `version: u64` is a positive integer which changes anytime a change is made to the
97+
header. The initial version will be 1.
98+
99+
- `header_length: u64` is the length of the rest of the header in bytes (i.e. not
100+
including the `block_header_flag`, `version`, and `header_length` fields).
101+
102+
- `block_producer_time_nanos: u64` is a nanosecond UNIX timestamp representing the
103+
time when the block producer became leader and started constructing the block.
104+
105+
- `block_user_agent: [u8; 256]` is a string that provides identifying information about the
106+
block producer.
107+
108+
- `future fields` any other fields that are deemed necessary in the future may be
109+
added with a corresponding change to `version` / `header_length`. For example, SIMD
110+
[0298](https://github.com/solana-foundation/solana-improvement-documents/pull/298)
111+
proposes a header field, which could be added in a subsequent SIMD (or folded into this one).
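
The layout above can be read back with a small parser. The following is a minimal sketch, not a reference implementation: it assumes little-endian integers (the byte order is not stated in this SIMD), and the names `BlockHeaderV1`, `read_u64`, and `parse_block_header` are hypothetical. It uses `block_header_flag` to detect whether a header is present and relies on `header_length` to skip any future fields it does not understand.

```rust
use std::convert::{TryFrom, TryInto};

/// Bytes occupied by block_header_flag, version and header_length.
const PREFIX_LEN: usize = 24;
const USER_AGENT_LEN: usize = 256;

#[derive(Debug, PartialEq)]
struct BlockHeaderV1 {
    version: u64,
    block_producer_time_nanos: u64,
    block_user_agent: [u8; USER_AGENT_LEN],
}

/// Reads a u64 at `offset`. ASSUMPTION: little-endian encoding.
fn read_u64(data: &[u8], offset: usize) -> Option<u64> {
    let bytes: [u8; 8] = data.get(offset..offset + 8)?.try_into().ok()?;
    Some(u64::from_le_bytes(bytes))
}

/// Returns the header (if one is present) and the offset at which the
/// first serialized entry batch begins. Returns None on truncated input.
fn parse_block_header(data: &[u8]) -> Option<(Option<BlockHeaderV1>, usize)> {
    // A non-zero first word is an entry count, so no header was prepended.
    if read_u64(data, 0)? != 0 {
        return Some((None, 0));
    }
    let version = read_u64(data, 8)?;
    let header_length = usize::try_from(read_u64(data, 16)?).ok()?;
    let end = PREFIX_LEN.checked_add(header_length)?;
    let body = data.get(PREFIX_LEN..end)?;

    // Version 1 body: 8-byte timestamp followed by the 256-byte user agent.
    let block_producer_time_nanos = read_u64(body, 0)?;
    let block_user_agent: [u8; USER_AGENT_LEN] =
        body.get(8..8 + USER_AGENT_LEN)?.try_into().ok()?;

    // Any bytes between the known fields and `end` are future fields; skip them.
    let header = BlockHeaderV1 {
        version,
        block_producer_time_nanos,
        block_user_agent,
    };
    Some((Some(header), end))
}
```

Under these assumptions a version 1 header has `header_length = 264` (8 + 256), so entry batch data would begin at byte offset 288.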
112+
113+
### Header Field Specification
114+
115+
Header fields will be unilaterally populated by their respective block producer
116+
without any enforced constraint on their contents. This SIMD includes the
117+
following fields in the header:
118+
119+
- `block_producer_time_nanos`: u64
120+
- `block_user_agent`: [u8; 256]
121+
122+
Because it is desirable to maintain cluster-wide diagnostics this SIMD provides
123+
a suggested format for the `block_user_agent` string which includes basic
124+
information about the block producer. This should be a UTF-8 encoded,
125+
null-terminated string. The null character should terminate valid UTF-8 data. Any
126+
data following the null character is ignored by parsers and may contain
127+
arbitrary information. It is expected that all producers use this format,
128+
though this will not be enforced. Clients that choose to opt out of the
129+
suggested format should set the first byte of the field to 0 (i.e. the null
130+
character). The format is loosely based on the HTTP `User-Agent` header
131+
format:
132+
133+
```
134+
<product>/<product-version> <comment>
135+
```
136+
137+
The first entry will always be the software client.
138+
139+
```
140+
client/client_version <client_details>
141+
```
142+
143+
Options for `client` currently include:
144+
- `agave`
145+
- `frankendancer`
146+
- `firedancer`
147+
148+
`client_version` should be consistent with the information stored on-chain (in
149+
`ConfigProgram`). Software forks (e.g. `jito-agave`) should name one of
150+
the three base clients and can specify details about the fork in the comment.
151+
152+
The comment should be in parentheses and contain a semicolon-separated
153+
list of flags. A flag has an unrestricted format, but should represent a
154+
feature that is contained and enabled in the client it describes.
155+
156+
e.g.
157+
158+
```
159+
agave/v2.2.15 (jito; doublezero; some-mod/v1.2.3)
160+
```
161+
162+
Sometimes there may be software that coexists or runs alongside a validator
163+
client. For example, current client development aims to make the transaction
164+
scheduler modular, which would allow the transaction scheduler to be developed
165+
independently from the client codebase. Validator clients that use
166+
complementary software like this should add additional
167+
`<product>/<product-version> <comment>` entries in the user agent string.
168+
169+
For example:
170+
171+
```
172+
agave/v3.0.0 (doublezero) greedy-scheduler/v3 (mode:perf; another-flag)
173+
```
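
A monitoring tool might split such a string into its entries along the following lines. This is a sketch under stated assumptions: the `AgentEntry` type and the `parse_user_agent` function are hypothetical names, the parser is deliberately lenient, and it treats the field as opted out when the first byte is null. Because the format is not enforced, callers should expect `None` for some blocks.

```rust
/// One `<product>/<product-version> <comment>` entry (hypothetical type).
#[derive(Debug, PartialEq)]
struct AgentEntry {
    product: String,
    version: String,
    flags: Vec<String>,
}

/// Parses the suggested user agent format from the raw 256-byte field.
/// Returns None if the producer opted out, or if the text is not valid
/// UTF-8 or does not follow the suggested format.
fn parse_user_agent(field: &[u8]) -> Option<Vec<AgentEntry>> {
    // Everything after the first null byte is ignored.
    let end = field.iter().position(|&b| b == 0).unwrap_or(field.len());
    let text = std::str::from_utf8(&field[..end]).ok()?;
    let mut rest = text.trim();
    if rest.is_empty() {
        return None; // first byte was null: opted out of the format
    }

    let mut entries = Vec::new();
    while !rest.is_empty() {
        // The product token runs up to the next space or the end of the text.
        let (token, tail) = match rest.find(' ') {
            Some(i) => (&rest[..i], rest[i..].trim_start()),
            None => (rest, ""),
        };
        let (product, version) = token.split_once('/')?;

        // An optional parenthesized comment holds semicolon-separated flags.
        let (flags, tail) = if let Some(inner) = tail.strip_prefix('(') {
            let close = inner.find(')')?;
            let flags: Vec<String> = inner[..close]
                .split(';')
                .map(|f| f.trim().to_string())
                .filter(|f| !f.is_empty())
                .collect();
            (flags, inner[close + 1..].trim_start())
        } else {
            (Vec::new(), tail)
        };

        entries.push(AgentEntry {
            product: product.to_string(),
            version: version.to_string(),
            flags,
        });
        rest = tail;
    }
    Some(entries)
}
```

For the example string above, this sketch yields two entries: `agave` at `v3.0.0` with the flag `doublezero`, and `greedy-scheduler` at `v3` with the flags `mode:perf` and `another-flag`.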
174+
175+
### RPC Protocol Changes
176+
177+
The `getBlock` RPC response will be extended to, optionally, include all header
178+
fields. The request will be extended with the `header` parameter, which lets
179+
the client signal that they want the header fields in the response. By default,
180+
header fields will be included in the response.
181+
182+
Sample Request Payload
183+
```json
184+
{
185+
"jsonrpc": "2.0",
186+
"id": 1,
187+
"method": "getBlock",
188+
"params": [
189+
378967388,
190+
{
191+
"encoding": "json",
192+
"maxSupportedTransactionVersion": 0,
193+
"transactionDetails": "full",
194+
"rewards": false,
195+
"header": true
196+
}
197+
]
198+
}
199+
```
200+
201+
Sample Response Payload
202+
```json
203+
{
204+
"jsonrpc": "2.0",
205+
"result": {
206+
"blockHeight": 428,
207+
"blockTime": null,
208+
"blockhash": "3Eq21vXNB5s86c62bVuUfTeaMif1N2kUqRPBmGRJhyTA",
209+
"parentSlot": 429,
210+
"previousBlockhash": "mfcyqEXB3DnHXki6KjjmZck6YjmZLvpAByy2fj4nh6B",
211+
"header": {
212+
"blockProducerTimeNanos": 1750176982899968023,
213+
"blockUserAgent": "agave/v3.0.0 (doublezero) greedy-scheduler/v3 (mode:perf; another-flag)"
214+
},
215+
"transactions": [
216+
{
217+
"meta": {
218+
"err": null,
219+
"fee": 5000,
220+
"innerInstructions": [],
221+
"logMessages": [],
222+
"postBalances": [499998932500, 26858640, 1, 1, 1],
223+
"postTokenBalances": [],
224+
"preBalances": [499998937500, 26858640, 1, 1, 1],
225+
"preTokenBalances": [],
226+
"rewards": null,
227+
"status": {
228+
"Ok": null
229+
}
230+
},
231+
"transaction": {
232+
"message": {
233+
"accountKeys": [
234+
"3UVYmECPPMZSCqWKfENfuoTv51fTDTWicX9xmBD2euKe",
235+
"AjozzgE83A3x1sHNUR64hfH7zaEBWeMaFuAN9kQgujrc",
236+
"SysvarS1otHashes111111111111111111111111111",
237+
"SysvarC1ock11111111111111111111111111111111",
238+
"Vote111111111111111111111111111111111111111"
239+
],
240+
"header": {
241+
"numReadonlySignedAccounts": 0,
242+
"numReadonlyUnsignedAccounts": 3,
243+
"numRequiredSignatures": 1
244+
},
245+
"instructions": [
246+
{
247+
"accounts": [1, 2, 3, 0],
248+
"data": "37u9WtQpcm6ULa3WRQHmj49EPs4if7o9f1jSRVZpm2dvihR9C8jY4NqEwXUbLwx15HBSNcP1",
249+
"programIdIndex": 4
250+
}
251+
],
252+
"recentBlockhash": "mfcyqEXB3DnHXki6KjjmZck6YjmZLvpAByy2fj4nh6B"
253+
},
254+
"signatures": [
255+
"2nBhEBYYvfaAe16UMNqRHre4YNSskvuYgx3M6E4JP1oDYvZEJHvoPzyUidNgNX5r9sTyN1J9UxtbCXy2rqYcuyuv"
256+
]
257+
}
258+
}
259+
]
260+
},
261+
"id": 1
262+
}
263+
```
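
Consumers of `blockProducerTimeNanos` should note that the sample value is larger than 2^53, so JSON parsers that map numbers to 64-bit floats cannot represent it exactly. The sketch below assumes the value has already been deserialized into a `u64`; the helper names `split_nanos` and `start_gap` are hypothetical. It shows how the timestamp splits into seconds and nanoseconds, and how two self-reported timestamps could be compared. Since the field is untrusted, the result is only an estimate.

```rust
use std::time::Duration;

/// Splits a nanosecond UNIX timestamp into whole seconds and the
/// remaining nanoseconds.
fn split_nanos(time_nanos: u64) -> (u64, u32) {
    let d = Duration::from_nanos(time_nanos);
    (d.as_secs(), d.subsec_nanos())
}

/// Time between two producers starting their blocks, based on their
/// self-reported timestamps. Returns None if the later timestamp is
/// smaller than the earlier one, which can happen with skewed clocks.
fn start_gap(earlier_nanos: u64, later_nanos: u64) -> Option<Duration> {
    later_nanos
        .checked_sub(earlier_nanos)
        .map(Duration::from_nanos)
}
```

With the sample value, `split_nanos(1750176982899968023)` returns `(1750176982, 899968023)`.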
264+
265+
## Alternatives Considered
266+
267+
- Do nothing
268+
- We can't estimate block time / duration with sufficient granularity. We
269+
won't be able to estimate at all when votes are changed in Alpenglow.
270+
- We will continue to have an incomplete, ephemeral record of who produced
271+
blocks.
272+
- Derive timestamp header field from consensus and enforce user agent format
273+
- This can and probably should be implemented as a future SIMD. Meanwhile,
274+
these fields are still useful since
275+
1. most of the cluster is expected to
276+
be honest, so monitoring tools may still use them for cluster-wide
277+
analytics and
278+
2. block producers still use these fields to self-monitor
279+
their performance.
280+
- Send block producer information via gossip instead
281+
- The information is short-lived and depends on physical network availability.
282+
- Update this information in an on-chain account instead (e.g. ConfigProgram)
283+
- Same issue as above, the information is short-lived.
284+
285+
## Impact
286+
287+
This change will enable more reliable monitoring and benchmarking for operators
288+
and for the community. Clients and indexers will need to extend both in-memory
289+
and long-term block storage to be aware of the new columns added to the block
290+
header. The client RPC engine will need to change to support the new fields.
291+
292+
## Security Considerations
293+
294+
- The header fields are untrusted and purely informational. Tools that expose
295+
these fields to external users should clearly communicate their untrusted
296+
nature.
297+
298+
## Drawbacks
299+
300+
- No expected drawbacks beyond minimal resource overhead.
301+
302+
## Backwards Compatibility
303+
304+
- RPC responses for old slots should return a suitable, documented
305+
default value (e.g. None) for the header fields.
306+
- Clients that don't implement this SIMD will reject new blocks because they
307+
will fail to parse the new header.
308+
- Because this header is mandatory, a leader that produces a block without a
309+
header will have its slot skipped.
