Description
While implementing endpoints for schema and version retrieval, I noticed that the current specification lacks pagination. Since a schema registry can be large, pagination is essential to ensure efficient and scalable responses.
This raises a few design questions, outlined below.
Schemas Response
Current specification
Currently, the response does not include pagination, so every schema in a namespace is returned in a single response regardless of how many there are.
{
"namespace": "bioinformatics-pipeline",
"schemas": [
{
"schema_name": "sequencing-metadata",
"latest_released_version": "2.0.1",
"maintainer": [
"Fatima Al-Farsi",
"Miguel Santos"
],
"maturity_level": "trial_use"
}
]
}
Proposed Response
To maintain consistency across all API responses, I propose structuring responses with two top-level keys:
pagination – Contains metadata about the response, such as the current page, page size, and total entries.
results – A list of schemas, ensuring uniformity with other paginated endpoints.
Additionally, in this design, the namespace key is included within each schema entry rather than at the top level. This approach keeps responses standardized while making them more explicit.
{
"pagination": {
"page": 0,
"page_size": 100,
"total": 2
},
"results": [
{
"namespace": "namespace1",
"schema_name": "bedmaker",
"description": "",
"maintainers": "Marko",
"lifecycle_stage": "",
"latest_released_version": "1.0.0",
"last_update_date": "2025-03-18T20:27:01.669912Z"
},
{
"namespace": "namespace1",
"schema_name": "pep",
"description": "",
"maintainers": "Donald",
"lifecycle_stage": "",
"latest_released_version": "2.1.0",
"last_update_date": "2025-03-18T20:27:01.602433Z"
}
]
}
This structure ensures that all endpoints follow a consistent format, making it easier to work with the API. Although some information (such as namespace) is repeated, each entry is self-describing, so a client can process any item in the results list without referring back to a parent object.
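To make the proposed envelope concrete, here is a minimal sketch of how a server could build it. The helper name `paginate` and the in-memory slicing are assumptions for illustration only; a real implementation would more likely apply the limit and offset in the database query. It assumes `page` is zero-based, as in the example above.

```python
from typing import Any


def paginate(items: list[dict[str, Any]], page: int = 0, page_size: int = 100) -> dict[str, Any]:
    """Wrap one page of `items` in the proposed envelope.

    Hypothetical helper: `page` is zero-based and `total` counts all
    entries, not just the ones on the returned page.
    """
    if page < 0 or page_size < 1:
        raise ValueError("page must be >= 0 and page_size must be >= 1")
    start = page * page_size
    return {
        "pagination": {"page": page, "page_size": page_size, "total": len(items)},
        # Slicing past the end of the list yields an empty page, not an error.
        "results": items[start:start + page_size],
    }


schemas = [
    {"namespace": "namespace1", "schema_name": "bedmaker", "latest_released_version": "1.0.0"},
    {"namespace": "namespace1", "schema_name": "pep", "latest_released_version": "2.1.0"},
]
first_page = paginate(schemas, page=0, page_size=1)
```

Because the envelope is the same for every list endpoint, the same helper could in principle serve the Versions endpoint as well.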
Alternative Approach
A slightly more concise alternative avoids repeating namespace in each entry by grouping schemas under it. However, this format introduces an inconsistency with other endpoints that return paginated lists, making parsing less uniform across the API.
{
"pagination": {
"page": 0,
"page_size": 100,
"total": 2
},
"results": {
"namespace": "namespace1",
"schemas": [
{
"schema_name": "bedmaker",
"description": "",
"maintainers": "Marko",
"lifecycle_stage": "",
"latest_released_version": "1.0.0",
"last_update_date": "2025-03-18T20:27:01.669912Z"
},
{
"schema_name": "pep",
"description": "",
"maintainers": "Donald",
"lifecycle_stage": "",
"latest_released_version": "2.1.0",
"last_update_date": "2025-03-18T20:27:01.602433Z"
}
]
}
}
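The parsing cost of the grouped shape can be seen in a small client-side sketch. The function name `flatten_grouped` is hypothetical; it shows the extra step a client would need to turn the alternative shape into the flat list that other paginated endpoints would return directly.

```python
from typing import Any


def flatten_grouped(response: dict[str, Any]) -> dict[str, Any]:
    """Convert the grouped (alternative) shape into the flat (proposed) shape."""
    group = response["results"]  # an object here, not a list
    return {
        "pagination": response["pagination"],
        # Copy the shared namespace back into every schema entry.
        "results": [{"namespace": group["namespace"], **schema} for schema in group["schemas"]],
    }


grouped = {
    "pagination": {"page": 0, "page_size": 100, "total": 2},
    "results": {
        "namespace": "namespace1",
        "schemas": [{"schema_name": "bedmaker"}, {"schema_name": "pep"}],
    },
}
flat = flatten_grouped(grouped)
```

With the flat shape, generic client code that iterates over `results` works unchanged on every endpoint; with the grouped shape, it needs an endpoint-specific step like the one above.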
The Versions endpoint has a similar issue. If a solution is implemented for this, I believe the Versions endpoint should be addressed at the same time.
Would love to hear any thoughts on this.
Thank you!