Adding Array Support including Unit Tests #81

Merged
merged 2 commits into from
Apr 15, 2025

Conversation

akuzin1
Contributor

@akuzin1 akuzin1 commented May 23, 2023

Description

Even though there is no dedicated array field type in OpenSearch, one can still pass an array of values into any field. However, currently there is no way to take advantage of that feature. Therefore, I added Array support to the driver, so that the user can call getArray(<column-index>) to retrieve the array object stored as a field from the returned ResultSet.
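As a rough illustration of the usage described above, the sketch below reads an array-valued column through the standard `ResultSet.getArray(int)` call and copies its elements into a `List`. The class name `GetArraySketch`, the helpers `readArrayColumn` and `toList`, and the proxy-based `stubArray` are all hypothetical names introduced for this example; they are not part of the driver. The stub stands in for a real result so the sketch runs without an OpenSearch cluster.

```java
import java.lang.reflect.Proxy;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; not part of the OpenSearch JDBC driver.
final class GetArraySketch {

    // Reads the array stored in the given column of the current row.
    public static List<Object> readArrayColumn(ResultSet rs, int columnIndex) throws SQLException {
        return toList(rs.getArray(columnIndex));
    }

    // Copies the contents of a java.sql.Array into a List.
    // Array.getArray() may return an Object[] or a primitive array,
    // so the elements are read reflectively.
    public static List<Object> toList(java.sql.Array sqlArray) throws SQLException {
        List<Object> values = new ArrayList<>();
        if (sqlArray == null) {
            return values; // SQL NULL: treat as an empty list in this sketch
        }
        Object raw = sqlArray.getArray();
        int length = java.lang.reflect.Array.getLength(raw);
        for (int i = 0; i < length; i++) {
            values.add(java.lang.reflect.Array.get(raw, i));
        }
        return values;
    }

    // Assumption for demonstration only: a stub java.sql.Array that answers
    // just the no-argument getArray() call, so no live connection is needed.
    public static java.sql.Array stubArray(Object[] elements) {
        return (java.sql.Array) Proxy.newProxyInstance(
                GetArraySketch.class.getClassLoader(),
                new Class<?>[] { java.sql.Array.class },
                (proxy, method, args) -> {
                    if (method.getName().equals("getArray") && method.getParameterCount() == 0) {
                        return elements;
                    }
                    throw new UnsupportedOperationException(method.getName());
                });
    }

    public static void main(String[] args) throws SQLException {
        java.sql.Array ints = stubArray(new Object[] { 1, 2, 3 });
        System.out.println(toList(ints)); // prints [1, 2, 3]
    }
}
```

Against a real connection, one would presumably call `readArrayColumn(rs, 1)` after `rs.next()`. The Java types of the elements depend on how the driver maps each field, which this sketch does not assume.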

Issues Resolved

[List any issues this PR will resolve]

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Collaborator

@Yury-Fridlyand Yury-Fridlyand left a comment

A few small comments. I'm going to test this manually to verify.
For documentation purposes, can you add screenshots to the PR description showing how it looks with and without this fix? Please also add your sample data and mapping so that others can verify the test.
Thanks

@@ -61,15 +63,17 @@ public enum OpenSearchType {
STRING(JDBCType.VARCHAR, String.class, Integer.MAX_VALUE, 0, false),
IP(JDBCType.VARCHAR, String.class, 15, 0, false),
NESTED(JDBCType.STRUCT, null, 0, 0, false),
Collaborator

Maybe nested should be mapped to array

Contributor Author

Could you explain why? I may not be following.

Collaborator

Nested is mapped to ARRAY in the SQL plugin. I'm curious why it is different here.
OpenSearchDataType.java#L39

Contributor Author

The SQL plugin always returns only the first element; it doesn't actually treat the field as an array...
Screenshot 2023-06-02 at 2 07 35 PM

Collaborator

A bug in the workbench?

@akuzin1
Contributor Author

akuzin1 commented Jun 2, 2023

The sample data I used to populate the OpenSearch index:

POST sample_array_data/_doc
{
  "int_array": [1, 2, 3],
  "string_array": ["abc", "cde", "fgh"],
  "double_array": [1.25, 5.99, 12.32],
  "boolean_array": [true, false, true]
}

POST sample_array_data/_doc
{
  "int_array": [4, 5, 6],
  "string_array": ["qrs", "tuv", "xyz"],
  "double_array": [7.23, 10.50, 99.99],
  "boolean_array": [false, true, false]
}

Before:
Screenshot 2023-06-02 at 1 05 14 PM

java.lang.RuntimeException: java.sql.SQLFeatureNotSupportedException: Array is not supported
	at com.amazonaws.athena.connector.lambda.data.S3BlockSpiller.writeRows(S3BlockSpiller.java:196)
	at com.amazonaws.athena.connectors.jdbc.manager.JdbcRecordHandler.readWithConstraint(JdbcRecordHandler.java:185)
	at com.amazonaws.athena.connector.lambda.handlers.RecordHandler.doReadRecords(RecordHandler.java:197)
	at com.amazonaws.athena.connector.lambda.handlers.RecordHandler.doHandleRequest(RecordHandler.java:163)
	at com.amazonaws.athena.connector.lambda.handlers.CompositeHandler.handleRequest(CompositeHandler.java:147)
	at com.amazonaws.athena.connector.lambda.handlers.CompositeHandler.handleRequest(CompositeHandler.java:112)
Caused by: java.sql.SQLFeatureNotSupportedException: Array is not supported
	at org.opensearch.jdbc.ResultSetImpl.getArray(ResultSetImpl.java:1053)
	at com.amazonaws.athena.connectors.jdbc.manager.JdbcRecordHandler.lambda$makeFactory$1(JdbcRecordHandler.java:206)
	at com.amazonaws.athena.connector.lambda.data.writers.GeneratedRowWriter.writeRow(GeneratedRowWriter.java:119)
	at com.amazonaws.athena.connectors.jdbc.manager.JdbcRecordHandler.lambda$readWithConstraint$0(JdbcRecordHandler.java:185)
	at com.amazonaws.athena.connector.lambda.data.S3BlockSpiller.writeRows(S3BlockSpiller.java:193)
	... 5 more

After:
Screenshot 2023-06-02 at 12 32 31 PM

@Yury-Fridlyand Yury-Fridlyand marked this pull request as draft June 7, 2023 20:25
@Yury-Fridlyand
Collaborator

I suspended this PR because it is blocked by opensearch-project/sql#1300. We need to implement this feature in the SQL plugin before making any changes in JDBC.

Yury-Fridlyand added a commit to Bit-Quill/sql-jdbc that referenced this pull request Jun 8, 2023
Signed-off-by: Yury-Fridlyand <[email protected]>
@seankao-az
Collaborator

Hi @akuzin1,

I noticed this draft PR has been open for a while without recent activity. I'm doing some repository maintenance and wanted to check if you're still interested in completing this work?

If you plan to continue working on this:

  • Please let us know your timeline
  • Update the PR with the latest main branch
  • Let us know if you need any help or guidance

If you no longer plan to work on this:

  • Feel free to close this PR
  • Or let us know and we can close it

If we don't hear back within 2 weeks, we'll close this PR to keep our PR list clean, but feel free to reopen it when you're ready to continue.

Thank you for your contribution!

@akuzin1
Contributor Author

akuzin1 commented Apr 14, 2025

Hi @seankao-az, the PR was initially "suspended, because it is blocked by opensearch-project/sql#1300. A feature needed to be implemented in the SQL plugin prior to making any changes in JDBC."
It appears that the issue has since been resolved, so it would be great if the PR could be reviewed once more. I can then address any comments and rebase as necessary. Thank you!

@seankao-az seankao-az marked this pull request as ready for review April 15, 2025 18:01
@Swiddis
Collaborator

Swiddis commented Apr 15, 2025

Don't see any clear issues and it works locally, lgtm

@Swiddis Swiddis merged commit 9746f56 into opensearch-project:main Apr 15, 2025
5 participants