-
Notifications
You must be signed in to change notification settings - Fork 158
Support for OpenSearch alias type #3246
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Heng Qian <[email protected]> (cherry picked from commit 2f475af)
Signed-off-by: Heng Qian <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice - thanks !
opensearch/src/main/java/org/opensearch/sql/opensearch/data/type/OpenSearchAliasType.java
Show resolved
Hide resolved
# Conflicts: # integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java # integ-test/src/test/java/org/opensearch/sql/legacy/TestUtils.java # integ-test/src/test/java/org/opensearch/sql/legacy/TestsConstants.java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx!
The backport to
To backport manually, run these commands in your terminal: # Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.x
# Create a new branch
git switch --create backport/backport-3246-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 1284c121e0abeef514b35941e5722d906515ff2b
# Push it to GitHub
git push --set-upstream origin backport/backport-3246-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.x Then, create a pull request where the |
Signed-off-by: Heng Qian <[email protected]> Signed-off-by: xinyual <[email protected]>
Description
Add Support for OpenSearch alias type, so PPL can query fields of alias type correctly.
Implementation:
OpenSearchAliasType
OpenSearchDataType::parseMapping
ReferenceExpression
. For now, only alias type will have different original path and type._source
inOpenSearchRequestBuilder::pushDownProjects
, use path instead of attr since they are different for alias type. The former one is more accurate since OpenSearch don't accept fields of alias type in_source
.e.g. After this PR:
Related Issues
Resolves #3069
Options for this feature
Option1(The PR for now):
Add compounded datatype named OpenSearchAliasType in TypeEnvironment, and resolve ident of alias field inner ReferenceExpression. OpenSearchAliasType will record both path and original basic datatype inner it.
Option2:
Add a new hashmap in IndexMapping to record the alias field mapping to its referred path, it may has structure like:
Map<String, String> aliasMapping
and also add a new hashmap in SymbleTable to record the alias field, it may has structure like:
Map<Namespace, NavigableMap<String, String>> aliasByNamespace
When resolving ident, we need to lookup aliasByNamespace first to identify if this field is an alias, then lookup the original tableByNamespace to reslve type. It makes more sense to not maintaining a wired data type of alias, but will have drawback of making this analyzing process more complex than option1
Option3:
Parsing the data type of alias fields to be the data type of its referred field directly.
But in OpenSearchIndexScan, it should be aware of the mapping of opensearch index, and then populating ExprValue with alias fields filled with the value of their referred fields. It aims to resolve the alias fields from the source data.
That’s said, since we’ve resolved the alias fields from the soruce, they are no longer alias fields in the following processing. Assuming we have update on its referred fields after scanning, the alias field won’t be udpated as its path has no relation with the value on updated path. The functionality may not align with the original purpose of opensearch alias type in some cases.
Option for moving to Calcite in future
In Calcite, there is a special data type named SingleColumnAliasRelDataType, which may satisfy our requirements. This RelDataType maintains single column but can resolve 2 kinds of idents with name in original or alias. But I’m afraid it cannot handle the corner cases that more than one alias field refer to the same path, which is rare but actually allowed in OpenSearch. Unless we can extend that RelDataType to remove the limitation and handle multiple fields in alias.
Check List
--signoff
.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.