feat(go/adbc/driver/databricks): implement Databricks ADBC driver with comprehensive test suite #2998
Conversation
reader, rowsAffected, err := s.rowsToRecordReader(ctx, rows)
Need to figure out a way to avoid the row-to-Arrow conversion.
Can Databricks return Arrow directly?
It appears there is one internally: https://github.com/databricks/databricks-sql-go/blob/main/internal/rows/arrowbased/arrowRecordIterator.go
If the driver could use these lower-level facilities instead of just wrapping database/sql, I think it would be much more compelling. Otherwise I agree with Matt that a generic adapter would make more sense if we're going to wrap database/sql.
Yes, I am trying to do so, but there is the arrow-go v12 vs. v18 version issue that I am trying to resolve. Do you have any suggestions?
My suggestion would be to update the Databricks module to use arrow-go v18. Since we've split out to the separate repo instead of the monorepo, major version updates are much less likely, and I try to avoid them as much as possible.
If you're concerned, you could expose an io.Reader of an Arrow IPC record batch stream (which is what we did for Snowflake).
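For context, a minimal sketch (not this PR's actual code) of what consuming such a stream could look like with arrow-go v18's ipc package; the io.Reader source here (a file named batches.arrow) is just a stand-in for whatever the driver exposes:

```go
// Minimal sketch: reading an Arrow IPC record batch stream from an io.Reader.
package main

import (
	"fmt"
	"io"
	"os"

	"github.com/apache/arrow-go/v18/arrow/ipc"
)

// readIPCStream walks every record batch in the stream and prints its row count.
func readIPCStream(r io.Reader) error {
	rdr, err := ipc.NewReader(r)
	if err != nil {
		return err
	}
	defer rdr.Release()

	for rdr.Next() {
		rec := rdr.Record() // valid until the next call to Next()
		fmt.Println("rows in batch:", rec.NumRows())
	}
	return rdr.Err()
}

func main() {
	// "batches.arrow" is a placeholder; any io.Reader carrying an IPC stream works.
	f, err := os.Open("batches.arrow")
	if err != nil {
		panic(err)
	}
	defer f.Close()
	if err := readIPCStream(f); err != nil {
		panic(err)
	}
}
```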
I changed the implementation to use an io.Reader, and now there is no row conversion anymore.
I'll give this a full review tomorrow, but it looks like you're wrapping something that uses the database/sql interface.
I can do that if possible, but we will likely need some extension, because it seems database/sql is not Arrow-based. To make this a performant driver, it's better to use Arrow directly; maybe extend database/sql to have Arrow functionality. I am not a Go expert, suggestions welcome.
Because
Thanks, I will double-check later whether we can use database/sql plus some interface defined in the ADBC repo together to make this happen. After that, drivers for other databases can just implement database/sql plus this interface to use this.
We already have https://pkg.go.dev/github.com/apache/arrow-adbc/go/[email protected]/sqldriver which is a wrapper around the ADBC interface that will provide a database/sql interface.
So it has row-to-Arrow conversion?
Other way around, it does Arrow-to-row conversion. The use case is as an adapter on top of any ADBC driver to get a row-oriented database/sql interface, so you only have to provide the Arrow-based API. The preferred result here is still to have the driver use Arrow directly.
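For illustration, registering an ADBC driver through that wrapper looks roughly like this (shown here with the FlightSQL driver; the registration name and DSN are assumptions for the sketch, not part of this PR):

```go
// Rough sketch: exposing an Arrow-native ADBC driver through database/sql
// using the arrow-adbc sqldriver adapter. Driver choice, registration name,
// and DSN format are illustrative assumptions.
package main

import (
	"database/sql"
	"fmt"

	"github.com/apache/arrow-adbc/go/adbc/driver/flightsql"
	"github.com/apache/arrow-adbc/go/adbc/sqldriver"
	"github.com/apache/arrow-go/v18/arrow/memory"
)

func main() {
	// Wrap the ADBC driver in the database/sql adapter.
	sql.Register("adbc-flightsql", sqldriver.Driver{Driver: flightsql.NewDriver(memory.DefaultAllocator)})

	db, err := sql.Open("adbc-flightsql", "uri=grpc://localhost:32010")
	if err != nil {
		panic(err)
	}
	defer db.Close()

	var n int64
	if err := db.QueryRow("SELECT 1").Scan(&n); err != nil {
		panic(err)
	}
	fmt.Println(n)
}
```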
Ah, I am mainly thinking of a wrapper for any database/sql driver to act as an ADBC driver, with some extra interface to avoid Arrow data conversion.
You could always have your driver implement the ADBC interfaces that are defined in the adbc package directly. Alternately, you could add extra QueryContext-style functions that return Arrow streams and Arrow schemas etc. to the driver?
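As a purely hypothetical sketch of that second option, such an extension interface could look like the following; ArrowQueryer and QueryArrowContext are invented names and do not exist in database/sql or the ADBC repo:

```go
// Hypothetical: an optional interface a database/sql driver could implement
// so an ADBC wrapper can fetch Arrow data without any row conversion.
package arrowext

import (
	"context"

	"github.com/apache/arrow-go/v18/arrow"
	"github.com/apache/arrow-go/v18/arrow/array"
)

// ArrowQueryer mirrors driver.QueryerContext but yields Arrow batches.
type ArrowQueryer interface {
	// QueryArrowContext executes the query and returns the result schema
	// plus a stream of Arrow records.
	QueryArrowContext(ctx context.Context, query string, args []any) (*arrow.Schema, array.RecordReader, error)
}
```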
Summary
This PR introduces a new Databricks ADBC driver for Go that provides Arrow-native database connectivity to Databricks SQL warehouses. The driver is built as a wrapper around the databricks-sql-go library and implements all required ADBC interfaces.
Changes
Core Implementation
- driver.go: Entry point with version tracking and configuration options
- database.go: Connection lifecycle management with comprehensive validation
- connection.go: Core connection implementation with metadata operations
- statement.go: SQL query execution with Arrow result conversion
Key Features
- Implements the ADBC Database, Connection, and Statement interfaces (a usage sketch follows this list)
- Arrow-native results for efficient data processing
- Configuration options (hostname, HTTP path, tokens, catalogs, schemas, timeouts)
- Error handling
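To make the intended usage concrete, here is a hedged sketch of exercising the driver through the generic ADBC Go interfaces; the databricks.NewDriver constructor and the option keys are assumptions based on the file descriptions above, not a confirmed public API:

```go
// Hypothetical usage of the new driver via the generic ADBC Go interfaces.
package main

import (
	"context"
	"fmt"

	databricks "github.com/apache/arrow-adbc/go/adbc/driver/databricks"
	"github.com/apache/arrow-go/v18/arrow/memory"
)

func main() {
	ctx := context.Background()

	drv := databricks.NewDriver(memory.DefaultAllocator) // assumed constructor
	db, err := drv.NewDatabase(map[string]string{
		// Option keys below are illustrative; see database.go in this PR for the real names.
		"adbc.databricks.server_hostname": "dbc-xxxx.cloud.databricks.com",
		"adbc.databricks.http_path":       "/sql/1.0/warehouses/xxxx",
		"adbc.databricks.access_token":    "dapi-xxxx",
	})
	if err != nil {
		panic(err)
	}

	cnxn, err := db.Open(ctx)
	if err != nil {
		panic(err)
	}
	defer cnxn.Close()

	stmt, err := cnxn.NewStatement()
	if err != nil {
		panic(err)
	}
	defer stmt.Close()

	if err := stmt.SetSqlQuery("SELECT 1"); err != nil {
		panic(err)
	}

	// ExecuteQuery returns an Arrow RecordReader, so results stay columnar.
	reader, _, err := stmt.ExecuteQuery(ctx)
	if err != nil {
		panic(err)
	}
	defer reader.Release()

	for reader.Next() {
		fmt.Println("rows in batch:", reader.Record().NumRows())
	}
}
```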
Test Organization
- Tests moved into a test/ subdirectory for better organization
- Tests use the databricks_test package with proper imports
Performance & Verification
- Verified against the taxi dataset
Code Quality
- All checks pass
- Code formatted with gofmt (project conventions compliant)
Testing
The driver has been thoroughly tested; all tests pass and demonstrate full functionality for production use.
Breaking Changes
None - this is a new driver implementation.