Add data source ClickHouse #646

Merged
goldmedal merged 3 commits into main from feature/clickhouse on Jul 3, 2024

Conversation

@grieve54706 (Contributor) commented Jul 2, 2024

Description

Add ClickHouse as a supported data source for POST /v2/ibis/{data_source}/query.

Request body

  • host: string, required
  • port: integer, required
  • database: string, required
  • user: string, required
  • password: string, required
{
  "sql": "select * from \"Orders\" limit 1",
  "manifestStr": "eyJjYXRhbG9nIjoibXlfY2F0YWxvZyIsInNjaGVtYSI6Im15X3...",
  "connectionInfo": {
    "host": "localhost",
    "port": 1433,
    "database": "dbname",
    "user": "username",
    "password": "password"
  }
}

or

  • connectionUrl: string, required
{
  "sql": "select * from \"Orders\" limit 1",
  "manifestStr": "eyJjYXRhbG9nIjoibXlfY2F0YWxvZyIsInNjaGVtYSI6Im15X3...",
  "connectionInfo": {
    "connectionUrl": "clickhouse://username:password@localhost:1433/dbname"
  }
}
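
A minimal sketch of how the two request bodies above could be assembled in Python. It assumes manifestStr is base64-encoded compact JSON (the visible prefix of the sample value decodes to {"catalog":"my_catalog","schema":"my_). The helper names and the two-field manifest are hypothetical, and no request is actually sent.

```python
import base64
import json


def encode_manifest(manifest: dict) -> str:
    """Serialize the manifest as compact JSON, then base64-encode it."""
    raw = json.dumps(manifest, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def build_query_body(sql: str, manifest: dict, connection_info: dict) -> dict:
    """Assemble the JSON body for POST /v2/ibis/clickhouse/query."""
    return {
        "sql": sql,
        "manifestStr": encode_manifest(manifest),
        "connectionInfo": connection_info,
    }


# Hypothetical minimal manifest; a real one carries more fields.
manifest = {"catalog": "my_catalog", "schema": "my_schema"}
sql = 'select * from "Orders" limit 1'

# Variant 1: discrete connection fields.
body_with_fields = build_query_body(
    sql,
    manifest,
    {
        "host": "localhost",
        "port": 1433,
        "database": "dbname",
        "user": "username",
        "password": "password",
    },
)

# Variant 2: a single connection URL.
body_with_url = build_query_body(
    sql,
    manifest,
    {"connectionUrl": "clickhouse://username:password@localhost:1433/dbname"},
)

print(json.dumps(body_with_url, indent=2))
```

Either dictionary can then be passed as the JSON payload of whatever HTTP client is in use.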

Response body

  • columns
  • data
  • dtypes: the type of each column as reported by pandas
{
    "columns": [
        "o_orderkey",
        "o_custkey",
        "o_orderstatus",
        "o_totalprice",
        "o_orderdate",
        "o_orderpriority",
        "o_clerk",
        "o_shippriority",
        "o_comment"
    ],
    "data": [
        [
            1,
            370,
            "O",
            172799.49,
            820540800000,
            "5-LOW",
            "Clerk#000000951",
            0,
            "nstructions sleep furiously among "
        ]
    ],
    "dtypes": {
        "o_orderkey": "int32",
        "o_custkey": "int32",
        "o_orderstatus": "object",
        "o_totalprice": "object",
        "o_orderdate": "object",
        "o_orderpriority": "object",
        "o_clerk": "object",
        "o_shippriority": "int32",
        "o_comment": "object"
    }
}
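
As an illustration of consuming this response, the sketch below zips columns with each row of data to get one dictionary per row. It assumes that o_orderdate is an epoch timestamp in milliseconds (UTC), which is a reading of the sample value rather than something the description states. The function names are hypothetical and the response is trimmed to four columns.

```python
from datetime import date, datetime, timezone

# Trimmed copy of the sample response above.
response = {
    "columns": ["o_orderkey", "o_orderstatus", "o_totalprice", "o_orderdate"],
    "data": [[1, "O", 172799.49, 820540800000]],
    "dtypes": {
        "o_orderkey": "int32",
        "o_orderstatus": "object",
        "o_totalprice": "object",
        "o_orderdate": "object",
    },
}


def rows_as_dicts(body: dict) -> list[dict]:
    """Pair every row in "data" with the names in "columns"."""
    return [dict(zip(body["columns"], row)) for row in body["data"]]


def epoch_ms_to_date(value: int) -> date:
    """Interpret an integer as milliseconds since the Unix epoch (assumed UTC)."""
    return datetime.fromtimestamp(value / 1000, tz=timezone.utc).date()


rows = rows_as_dicts(response)
order_date = epoch_ms_to_date(rows[0]["o_orderdate"])
print(rows[0]["o_orderkey"], order_date.isoformat())  # prints: 1 1996-01-02
```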

Additional information

We serialize results with orjson instead of the ujson used inside pandas, which gives us more control over how values are serialized.

This also resolves #638.
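
To illustrate what control over serialization means here, the sketch below uses a default callback to decide how non-JSON types are written. It deliberately uses the standard library json module as a stand-in for orjson so that it runs without extra dependencies; both accept a default callable for types they do not serialize themselves. The conversion rules are assumptions for illustration and are not taken from this PR.

```python
import json
import uuid
from datetime import date, datetime
from decimal import Decimal


def default(obj):
    """Fallback called for values the encoder cannot serialize on its own."""
    if isinstance(obj, uuid.UUID):
        return str(obj)  # assumed rule: UUIDs become canonical strings
    if isinstance(obj, Decimal):
        return float(obj)  # assumed rule: decimals become JSON numbers
    if isinstance(obj, (datetime, date)):
        return obj.isoformat()  # assumed rule: ISO 8601 text
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


row = {
    "id": uuid.UUID("12345678-1234-5678-1234-567812345678"),
    "price": Decimal("172799.49"),
    "day": date(1996, 1, 2),
}

encoded = json.dumps(row, default=default)
print(encoded)
```

Without the callback, json.dumps raises TypeError on the UUID value, which is the kind of failure a custom handler avoids.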

@goldmedal (Contributor) left a comment

Thanks @grieve54706 Looks nice to me.

Comment on lines +50 to +57
json_obj = orjson.loads(
    orjson.dumps(
        data,
        option=orjson.OPT_SERIALIZE_NUMPY
        | orjson.OPT_PASSTHROUGH_DATETIME
        | orjson.OPT_SERIALIZE_UUID,
        default=default,
    )
Contributor

Looks nice. I guess it can also fix #638?

Contributor Author

Yes, I will add the "resolve" keyword to the description to link #638.

        assert response.status_code == 422
        assert response.text is not None

    def test_validate_with_unknown_rule(self, clickhouse: ClickHouseContainer):
Contributor

I noticed that the test cases are duplicated across data sources. Maybe we could have one abstract test function shared by the different data sources. WDYT?

Contributor Author

Each database takes different parameters, so the tests cannot be folded into one abstract function.
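
For context on this exchange, here is one hedged sketch of what a shared check could look like if the per-database parameters were passed in as data. It is not how the tests in this PR are written. Everything in it is hypothetical: the fake client, the assumption that a body without "sql" is rejected with 422, and the postgres connection fields.

```python
from dataclasses import dataclass


@dataclass
class FakeResponse:
    status_code: int
    text: str


def fake_post(url: str, json: dict) -> FakeResponse:
    """Stand-in for an HTTP test client; rejects bodies that lack "sql"."""
    if "sql" not in json:
        return FakeResponse(422, "sql: field required")
    return FakeResponse(200, "{}")


# Hypothetical per-source settings; real tests would read them from containers.
CONNECTION_INFO = {
    "clickhouse": {
        "connectionUrl": "clickhouse://username:password@localhost:1433/dbname"
    },
    "postgres": {
        "host": "localhost",
        "port": 5432,
        "database": "dbname",
        "user": "username",
        "password": "password",
    },
}


def check_query_without_sql(post, data_source: str, connection_info: dict) -> None:
    """Shared check: only the data source name and connection info vary."""
    response = post(
        url=f"/v2/ibis/{data_source}/query",
        json={"manifestStr": "<base64 manifest>", "connectionInfo": connection_info},
    )
    assert response.status_code == 422
    assert response.text is not None


for name, info in CONNECTION_INFO.items():
    check_query_without_sql(fake_post, name, info)
```

Whether this is worthwhile depends on how much of each test is setup specific to one database, which is the concern raised in the reply above.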

    return ch


@pytest.mark.postgres
Contributor

Is this correct? I guess it should be clickhouse.

Suggested change
- @pytest.mark.postgres
+ @pytest.mark.clickhouse

@goldmedal merged commit a137bb5 into main on Jul 3, 2024
4 checks passed
@goldmedal deleted the feature/clickhouse branch on July 3, 2024 03:19

@goldmedal (Contributor) commented

Thanks @grieve54706

grieve54706 added a commit that referenced this pull request Dec 13, 2024
* Add data source ClickHouse

* Make mysql support connection url

* Fix marker
Development

Successfully merging this pull request may close these issues.

Query UUID of PostgreSQL failed
2 participants