Add pagination #11

Open
@khoroshevskyi

While implementing endpoints for schema and version retrieval, I noticed that the current specification lacks pagination. Since a schema registry can be large, pagination is essential to ensure efficient and scalable responses.

There are a few key questions regarding this.

Schemas response

Current specification

The response does not include pagination, so all schemas in a namespace are returned in a single payload, which is inefficient for large registries.

{
    "namespace": "bioinformatics-pipeline",
    "schemas": [
        {
            "schema_name": "sequencing-metadata",
            "latest_released_version": "2.0.1",
            "maintainer": [
                "Fatima Al-Farsi",
                "Miguel Santos"
            ],
            "maturity_level": "trial_use"
        }
    ]
}

Proposed Response

To maintain consistency across all API responses, I propose structuring responses with two top-level keys:

  1. pagination – Contains metadata about the response, such as the current page, page size, and total entries.
  2. results – A list of schemas, ensuring uniformity with other paginated endpoints.

Additionally, in this design, the namespace key is included within each schema entry rather than at the top level. This approach keeps responses standardized while making them more explicit.

{
  "pagination": {
    "page": 0,
    "page_size": 100,
    "total": 2
  },
  "results": [
    {
      "namespace": "namespace1",
      "schema_name": "bedmaker",
      "description": "",
      "maintainers": "Marko",
      "lifecycle_stage": "",
      "latest_released_version": "1.0.0",
      "last_update_date": "2025-03-18T20:27:01.669912Z"
    },
    {
      "namespace": "namespace1",
      "schema_name": "pep",
      "description": "",
      "maintainers": "Donald",
      "lifecycle_stage": "",
      "latest_released_version": "2.1.0",
      "last_update_date": "2025-03-18T20:27:01.602433Z"
    }
  ]
}

This structure ensures that all endpoints follow a consistent format, making it easier to work with the API. Even though some information (like namespace) might be repeated, this redundancy improves clarity when working with large datasets.
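
As a rough illustration of the proposed envelope, here is a minimal Python sketch. The `paginate` helper and its parameter names are hypothetical and not part of the specification. It assumes zero-based page numbers (matching `"page": 0` in the example above) and that `total` counts all entries across every page, not just the entries on the current one.

```python
from typing import Any


def paginate(
    items: list[dict[str, Any]], page: int = 0, page_size: int = 100
) -> dict[str, Any]:
    """Wrap one page of `items` in a pagination/results envelope.

    Assumption: pages are zero-based and `total` is the full item count.
    """
    if page < 0 or page_size < 1:
        raise ValueError("page must be >= 0 and page_size must be >= 1")
    start = page * page_size
    return {
        "pagination": {"page": page, "page_size": page_size, "total": len(items)},
        # Slicing past the end of the list yields an empty page, not an error.
        "results": items[start:start + page_size],
    }


schemas = [
    {"namespace": "namespace1", "schema_name": "bedmaker", "latest_released_version": "1.0.0"},
    {"namespace": "namespace1", "schema_name": "pep", "latest_released_version": "2.1.0"},
]
response = paginate(schemas, page=0, page_size=100)
```

Because the envelope does not depend on what the items are, the same helper could in principle wrap a list of versions as well as a list of schemas.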

Alternative Approach

A slightly more concise alternative avoids repeating namespace in each entry by grouping schemas under it. However, this format introduces an inconsistency with other endpoints that return paginated lists, making parsing less uniform across the API.

{
  "pagination": {
    "page": 0,
    "page_size": 100,
    "total": 2
  },
  "results": {
    "namespace": "namespace1",
    "schemas": [
      {
        "schema_name": "bedmaker",
        "description": "",
        "maintainers": "Marko",
        "lifecycle_stage": "",
        "latest_released_version": "1.0.0",
        "last_update_date": "2025-03-18T20:27:01.669912Z"
      },
      {
        "schema_name": "pep",
        "description": "",
        "maintainers": "Donald",
        "lifecycle_stage": "",
        "latest_released_version": "2.1.0",
        "last_update_date": "2025-03-18T20:27:01.602433Z"
      }
    ]
  }
}
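
To make the parsing inconsistency concrete: a client written for the flat `results` list would need an extra step to consume the grouped form. The sketch below shows one such step in Python. The `flatten_grouped` helper is purely illustrative and not part of the specification.

```python
from typing import Any


def flatten_grouped(response: dict[str, Any]) -> dict[str, Any]:
    """Convert the grouped alternative into the flat pagination/results shape."""
    grouped = response["results"]
    namespace = grouped["namespace"]
    return {
        "pagination": response["pagination"],
        # Copy the shared namespace back into every schema entry.
        "results": [{"namespace": namespace, **schema} for schema in grouped["schemas"]],
    }


grouped_response = {
    "pagination": {"page": 0, "page_size": 100, "total": 2},
    "results": {
        "namespace": "namespace1",
        "schemas": [{"schema_name": "bedmaker"}, {"schema_name": "pep"}],
    },
}
flat_response = flatten_grouped(grouped_response)
```

With the flat format proposed earlier, no such conversion is needed, and the same iteration code works for every paginated endpoint.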

The Versions endpoint has the same issue: its response is not paginated either. Whichever solution is adopted for schemas, I believe the Versions endpoint should be updated at the same time.

Would love to hear any thoughts on this.
Thank you!
