Commit 911787c

SIMD-0307: Add Block Header
1 parent 43fd004 commit 911787c

1 file changed

+316
-0
lines changed

proposals/0307-add-block-header.md

Lines changed: 316 additions & 0 deletions
@@ -0,0 +1,316 @@
1+
---
2+
simd: '0307'
3+
title: Add Client Info to Block Header
4+
authors:
5+
- jherrera-jump (Firedancer)
6+
category: Standard
7+
type: Core
8+
status: Review
9+
created: 2025-06-17
10+
feature: <pubkey>
11+
development:
12+
- Anza - TBD
13+
---
14+
15+
## Summary
16+
17+
Add a block header to Solana blocks and expose header fields in the
18+
`getBlock` RPC endpoint.
19+
20+
## Motivation
21+
22+
For the purposes of historical monitoring, development, and auditing, it is
23+
important to know exactly who produced a block and when it was produced.
24+
Currently, this information can be partially inferred from Gossip and from vote
25+
timestamps. Unfortunately there are some problems with the current approach:
26+
27+
- The information from gossip is ephemeral. Currently a peer needs to record
28+
and persist it. This may cause synchronization issues when matching client
29+
updates in gossip with the correct slot.
30+
- Gossip lacks important information that may be useful for monitoring (e.g.
31+
scheduler used, mods, configuration settings, etc).
32+
- Vote timestamps have a granularity of 1-second, so they cannot be used to
33+
estimate block duration.
34+
- Vote timestamps will be removed with Alpenglow.
35+
36+
This SIMD solves these issues by including relevant information in a static
37+
block header.
38+
39+
## New Terminology
40+
41+
No new terms, but the following definitions are given for clarity:
42+
43+
- Client - The software run by leaders to interface with a Solana cluster.
44+
(e.g. `agave` or `frankendancer`)
45+
- Block Producer - The client that produced a given block.
46+
- Scheduler - The system responsible for processing incoming transactions and
47+
ordering them for block construction.
48+
- Forward Error Correction set (FEC set) - A collection of shreds. At a high
49+
level, this is a construct that leverages Reed-Solomon encoding to overcome
50+
the problem of data loss from packet drops.
51+
- Shred - A fixed-size chunk of encoded raw block data.
52+
- Entry Batch - An array of entries.
53+
- Entry - An array of transactions.
54+
55+
## Detailed Design
56+
57+
### Data Layout
58+
59+
Solana blocks are organized in abstraction layers not entirely unlike the
60+
arrangement of a typical network packet (e.g. MAC -> IP -> TCP -> HTTP). At the
61+
highest layer a block consists of some number (~100+) of FEC sets. A single FEC
62+
set contains a handful of shreds (~32). Once sufficient shreds are available
63+
the raw block data is reconstructed and reinterpreted as an array of entry
64+
batches. Entry batches do not cross shred boundaries.
65+
66+
This SIMD adds the following header at the beginning of the raw block data. This
67+
puts it on the same abstraction layer as serialized entry batch data. Put
68+
differently, the serialized header will be prepended to the first serialized
69+
entry batch in the block.
70+
71+
```
72+
< -- 64 bits -->
73+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
74+
| block_header_flag |
75+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
76+
| version |
77+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
78+
| header_length |
79+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
80+
| block_producer_time_nanos |
81+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
82+
| block_user_agent |
83+
| |
84+
⋮ +30 ⋮
85+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
86+
| ... future fields ... |
87+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
88+
```
89+
90+
- `block_header_flag: u64` will always be zero. The first 8 bytes of an entry
91+
batch are always a positive number (the number of entries in the batch), so
92+
this flag allows parsers to differentiate between a normal entry batch and one
93+
with a header prepended. Though not strictly necessary, this may facilitate
94+
parsing block data, and allows us to make the header optional if we ever need
95+
to.
96+
97+
- `version: u64` is a positive integer which changes anytime a change is made to
98+
the header. The initial version will be 1.
99+
100+
- `header_length: u64` is the length of the rest of the header in bytes (i.e.
101+
not including the `block_header_flag`, `version`, and `header_length` fields).
102+
103+
- `block_producer_time_nanos: u64` is a nanosecond UNIX timestamp representing
104+
the time when the block producer became leader and started constructing the
105+
block.
106+
107+
- `block_user_agent: [u8; 256]` is a string that provides identifying
108+
information about the block producer.
109+
110+
- `future fields` any other fields that are deemed necessary in the future may
111+
be added with a corresponding change to `version` / `header_length`. For
112+
example, SIMD
113+
[0298](https://github.com/solana-foundation/solana-improvement-documents/pull/298)
114+
proposes a header field, which could be added in a subsequent SIMD (or folded
115+
into this one).
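
The field descriptions above can be sketched as a small parser. This is only an illustration under stated assumptions: the SIMD does not specify a byte order, so little-endian is assumed here, and the names `BlockHeader` and `split_header` are hypothetical. The sketch uses `block_header_flag` to decide whether a header is present and uses `header_length` to skip any fields a later version might append.

```python
import struct
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumption: all integers are little-endian; the SIMD does not state a byte order.
_PREFIX = struct.Struct("<QQQ")     # block_header_flag, version, header_length
_V1_BODY = struct.Struct("<Q256s")  # block_producer_time_nanos, block_user_agent


@dataclass
class BlockHeader:
    version: int
    producer_time_nanos: int
    user_agent_raw: bytes  # the full 256-byte field, not yet decoded


def split_header(block_data: bytes) -> Tuple[Optional[BlockHeader], bytes]:
    """Return (header or None, remaining entry batch bytes)."""
    if len(block_data) < 8:
        raise ValueError("block data too short")
    (flag,) = struct.unpack_from("<Q", block_data, 0)
    if flag != 0:
        # A non-zero first u64 is an entry count, so no header is present.
        return None, block_data
    if len(block_data) < _PREFIX.size:
        raise ValueError("truncated header prefix")
    _, version, header_length = _PREFIX.unpack_from(block_data, 0)
    body_start = _PREFIX.size
    body_end = body_start + header_length
    if header_length < _V1_BODY.size or body_end > len(block_data):
        raise ValueError("malformed header_length")
    nanos, agent = _V1_BODY.unpack_from(block_data, body_start)
    # Skip by header_length rather than the v1 size, so fields appended by
    # later versions are ignored instead of being misread as entry data.
    return BlockHeader(version, nanos, agent), block_data[body_end:]
```

Skipping by `header_length` is a design choice of this sketch: it lets a version 1 reader tolerate longer headers, provided later versions only append fields.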
116+
117+
### Header Field Specification
118+
119+
Header fields will be unilaterally populated by their respective block producer
120+
without any enforced constraint on their contents. This SIMD includes the
121+
following fields in the header:
122+
123+
- `block_producer_time_nanos`: u64
124+
- `block_user_agent`: [u8; 256]
125+
126+
Because it is desirable to maintain cluster-wide diagnostics this SIMD provides
127+
a suggested format for the `block_user_agent` string which includes basic
128+
information about the block producer. This should be a UTF-8 encoded, null
129+
terminated string. The null character should terminate valid UTF-8 data. Any
130+
data following the null character is ignored by parsers and may contain
131+
arbitrary information. It is expected that all producers use this format,
132+
though this will not be enforced. Clients that choose to opt out of the
133+
suggested format should set the first byte of the field to 0 (i.e. the null
134+
character). The format is loosely based on the HTTP `User-Agent` header format
135+
specification:
136+
137+
```
138+
<product>/<product-version> <comment>
139+
```
140+
141+
The first entry will always be the software client.
142+
143+
```
144+
client/client_version <client_details>
145+
```
146+
147+
Options for `client` currently include:
148+
149+
- `agave`
150+
- `frankendancer`
151+
- `firedancer`
152+
153+
`client_version` should be consistent with the information stored on-chain (in
154+
`ConfigProgram`). Software forks (e.g. `jito-agave`) should name one of
155+
the 3 base clients and can specify details about the fork in the comment.
156+
157+
The comment should be in parentheses and contain a semicolon-separated
158+
list of flags. A flag has an unrestricted format, but should represent a
159+
feature that is contained and enabled in the client it describes.
160+
161+
e.g.
162+
163+
```
164+
agave/v2.2.15 (jito; doublezero; some-mod/v1.2.3)
165+
```
166+
167+
Sometimes there may be software that coexists or runs alongside a validator
168+
client. For example, current client development aims to make the transaction
169+
scheduler modular, which would allow the transaction scheduler to be developed
170+
independently from the client codebase. Validator clients that use
171+
complementary software like this should add additional
172+
`<product>/<product-version> <comment>` entries in the user agent string.
173+
174+
For example:
175+
176+
```
177+
agave/v3.0.0 (doublezero) greedy-scheduler/v3 (mode:perf; another-flag)
178+
```
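
As an illustration of the suggested format, the following sketch encodes a user agent string into the 256-byte field and decodes it back into `<product>/<product-version> <comment>` entries. The names `encode_user_agent`, `decode_user_agent`, and `Product` are hypothetical, and the regular expression is one possible reading of the loosely specified grammar, not a normative parser. Since the format is not enforced, the decoder returns `None` for anything it cannot parse instead of raising.

```python
import re
from typing import List, NamedTuple, Optional

FIELD_LEN = 256  # size of block_user_agent in bytes


class Product(NamedTuple):
    name: str
    version: str
    flags: List[str]


# One `<product>/<product-version>` token with an optional `(comment)`.
_ENTRY = re.compile(r"\s*([^\s/()]+)/([^\s()]+)(?:\s*\(([^)]*)\))?")


def encode_user_agent(text: str) -> bytes:
    """Pad a UTF-8 string with null bytes to fill the fixed-size field."""
    raw = text.encode("utf-8")
    if b"\x00" in raw or len(raw) >= FIELD_LEN:
        raise ValueError("must be under 256 bytes and contain no null byte")
    return raw.ljust(FIELD_LEN, b"\x00")


def decode_user_agent(field: bytes) -> Optional[List[Product]]:
    """Parse the field; return None if the producer opted out or parsing fails."""
    text_bytes = field.split(b"\x00", 1)[0]  # data after the first null is ignored
    if not text_bytes:
        return None  # first byte is null: producer opted out of the format
    try:
        text = text_bytes.decode("utf-8").strip()
    except UnicodeDecodeError:
        return None
    products: List[Product] = []
    pos = 0
    while pos < len(text):
        match = _ENTRY.match(text, pos)
        if match is None:
            return None
        comment = match.group(3) or ""
        flags = [flag.strip() for flag in comment.split(";") if flag.strip()]
        products.append(Product(match.group(1), match.group(2), flags))
        pos = match.end()
    return products or None
```

A monitoring tool could call `decode_user_agent` on the raw field and treat a `None` result as "unknown client", which matches the untrusted, informational nature of the field.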
179+
180+
### RPC Protocol Changes
181+
182+
The `getBlock` RPC response will be extended to optionally include all header
183+
fields. The request will be extended with the `header` parameter, which lets
184+
the client signal that they want the header fields in the response. By default,
185+
header fields will be included in the response.
186+
187+
Sample Request Payload
188+
189+
```json
190+
{
191+
"jsonrpc": "2.0",
192+
"id": 1,
193+
"method": "getBlock",
194+
"params": [
195+
378967388,
196+
{
197+
"encoding": "json",
198+
"maxSupportedTransactionVersion": 0,
199+
"transactionDetails": "full",
200+
"rewards": false,
201+
"header": true
202+
}
203+
]
204+
}
205+
```
206+
207+
Sample Response Payload
208+
209+
```json
210+
{
211+
"jsonrpc": "2.0",
212+
"result": {
213+
"blockHeight": 428,
214+
"blockTime": null,
215+
"blockhash": "3Eq21vXNB5s86c62bVuUfTeaMif1N2kUqRPBmGRJhyTA",
216+
"parentSlot": 429,
217+
"previousBlockhash": "mfcyqEXB3DnHXki6KjjmZck6YjmZLvpAByy2fj4nh6B",
218+
"header": {
219+
"blockProducerTimeNanos": 1750176982899968023,
220+
"blockUserAgent": "agave/v3.0.0 (doublezero) greedy-scheduler/v3 (mode:perf; another-flag)"
221+
},
222+
"transactions": [
223+
{
224+
"meta": {
225+
"err": null,
226+
"fee": 5000,
227+
"innerInstructions": [],
228+
"logMessages": [],
229+
"postBalances": [499998932500, 26858640, 1, 1, 1],
230+
"postTokenBalances": [],
231+
"preBalances": [499998937500, 26858640, 1, 1, 1],
232+
"preTokenBalances": [],
233+
"rewards": null,
234+
"status": {
235+
"Ok": null
236+
}
237+
},
238+
"transaction": {
239+
"message": {
240+
"accountKeys": [
241+
"3UVYmECPPMZSCqWKfENfuoTv51fTDTWicX9xmBD2euKe",
242+
"AjozzgE83A3x1sHNUR64hfH7zaEBWeMaFuAN9kQgujrc",
243+
"SysvarS1otHashes111111111111111111111111111",
244+
"SysvarC1ock11111111111111111111111111111111",
245+
"Vote111111111111111111111111111111111111111"
246+
],
247+
"header": {
248+
"numReadonlySignedAccounts": 0,
249+
"numReadonlyUnsignedAccounts": 3,
250+
"numRequiredSignatures": 1
251+
},
252+
"instructions": [
253+
{
254+
"accounts": [1, 2, 3, 0],
255+
"data": "37u9WtQpcm6ULa3WRQHmj49EPs4if7o9f1jSRVZpm2dvihR9C8jY4NqEwXUbLwx15HBSNcP1",
256+
"programIdIndex": 4
257+
}
258+
],
259+
"recentBlockhash": "mfcyqEXB3DnHXki6KjjmZck6YjmZLvpAByy2fj4nh6B"
260+
},
261+
"signatures": [
262+
"2nBhEBYYvfaAe16UMNqRHre4YNSskvuYgx3M6E4JP1oDYvZEJHvoPzyUidNgNX5r9sTyN1J9UxtbCXy2rqYcuyuv"
263+
]
264+
}
265+
}
266+
]
267+
},
268+
"id": 1
269+
}
270+
```
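
A client consuming these payloads might look like the sketch below. It builds the request body with the proposed `header` parameter and reads `blockProducerTimeNanos` from the response, returning `None` when the header is absent (for example, for slots produced before this SIMD). The function names are hypothetical, the sketch performs no network I/O, and the field names are taken from the sample payloads above.

```python
import json
from datetime import datetime, timezone
from typing import Optional

NANOS_PER_SECOND = 1_000_000_000


def build_get_block_request(slot: int, request_id: int = 1) -> str:
    """Serialize a getBlock request that asks for the header fields."""
    options = {
        "encoding": "json",
        "maxSupportedTransactionVersion": 0,
        "transactionDetails": "full",
        "rewards": False,
        "header": True,  # parameter proposed by this SIMD
    }
    body = {"jsonrpc": "2.0", "id": request_id, "method": "getBlock",
            "params": [slot, options]}
    return json.dumps(body)


def producer_time_nanos(response: dict) -> Optional[int]:
    """Return the producer timestamp, or None if the block has no header."""
    header = (response.get("result") or {}).get("header")
    if not header:
        return None
    return header.get("blockProducerTimeNanos")


def producer_time_utc(response: dict) -> Optional[datetime]:
    """Convert the timestamp to a datetime (truncated to microseconds)."""
    nanos = producer_time_nanos(response)
    if nanos is None:
        return None
    seconds, remainder = divmod(nanos, NANOS_PER_SECOND)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.replace(microsecond=remainder // 1000)


def start_gap_nanos(previous: dict, current: dict) -> Optional[int]:
    """Difference between two blocks' producer timestamps, if both exist."""
    a, b = producer_time_nanos(previous), producer_time_nanos(current)
    return None if a is None or b is None else b - a
```

Because the timestamps are self-reported by block producers, a value such as the one returned by `start_gap_nanos` is an estimate for monitoring purposes and should not be treated as authoritative.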
271+
272+
## Alternatives Considered
273+
274+
- Do nothing
275+
- We can't estimate block time / duration with sufficient granularity. We
276+
won't be able to estimate at all when votes are changed in Alpenglow.
277+
- We will continue to have an incomplete, ephemeral record of who produced
278+
blocks.
279+
- Derive the timestamp header field from consensus and enforce the user agent format
280+
- This can and probably should be implemented as a future SIMD. Meanwhile,
281+
these fields are still useful since
282+
1. most of the cluster is expected to
283+
be honest, so monitoring tools may still use them for cluster-wide
284+
analytics and
285+
2. block producers still use these fields to self-monitor
286+
their performance.
287+
- Send block producer information via gossip instead
288+
- The information is short-lived and depends on physical network availability.
289+
- Update this information in an on-chain account instead (e.g. ConfigProgram)
290+
- Same issue as above: the information is short-lived.
291+
292+
## Impact
293+
294+
This change will enable more reliable monitoring and benchmarking for operators
295+
and for the community. Clients and indexers will need to extend both in-memory
296+
and long-term block storage to be aware of the new columns added to the block
297+
header. The client rpc engine will need to change to support the new fields.
298+
299+
## Security Considerations
300+
301+
- The header fields are untrusted and purely informational. Tools that expose
302+
these fields to external users should clearly communicate their untrusted
303+
nature.
304+
305+
## Drawbacks
306+
307+
- No expected drawbacks beyond minimal resource overhead.
308+
309+
## Backwards Compatibility
310+
311+
- RPC requests for old slots should properly document and return a suitable
312+
default value (e.g. None).
313+
- Clients that don't implement this SIMD will reject new blocks because they
314+
will fail to parse the new header.
315+
- Because this header is mandatory, blocks produced without a header will be
316+
rejected by the cluster, so the producing leader's slot will be skipped.
