-
Notifications
You must be signed in to change notification settings - Fork 154
Add take() aggregation function in PPL #949
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Joshua Li <[email protected]>
Signed-off-by: Joshua Li <[email protected]>
Signed-off-by: Joshua Li <[email protected]>
Signed-off-by: Joshua Li <[email protected]>
Signed-off-by: Joshua Li <[email protected]>
Signed-off-by: Joshua Li <[email protected]>
Signed-off-by: Joshua Li <[email protected]>
Signed-off-by: Joshua Li <[email protected]>
Signed-off-by: Joshua Li <[email protected]>
Signed-off-by: Joshua Li <[email protected]>
Signed-off-by: Joshua Li <[email protected]>
Signed-off-by: Joshua Li <[email protected]>
docs/user/ppl/cmd/stats.rst
Outdated
|
||
* field: mandatory. The field must be a text field. | ||
* size: optional number. The number of values should be returned. Default is 10. | ||
* from: optional number. The number of values should be skipped. Default is 0. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just realized from
doesn't make much sense without sort, will remove it for now
Codecov Report
@@ Coverage Diff @@
## 2.x #949 +/- ##
=============================================
- Coverage 95.21% 62.76% -32.45%
=============================================
Files 313 10 -303
Lines 8441 658 -7783
Branches 620 119 -501
=============================================
- Hits 8037 413 -7624
+ Misses 350 192 -158
+ Partials 54 53 -1
Flags with carried forward coverage won't be shown. Click here to find out more. Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
Signed-off-by: Joshua Li <[email protected]>
Do you have a test for non-text fields? |
That should be responsibilities of function resolver and expression analyzer? i didn't modify those logic, i think UT already covers those. If it happens error will say
or
|
Signed-off-by: Joshua Li <[email protected]>
...earch/src/main/java/org/opensearch/sql/opensearch/data/value/OpenSearchExprValueFactory.java
Show resolved
Hide resolved
|
||
Example:: | ||
|
||
os> source=accounts | stats take(firstname); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you consider adding more examples with groupBy, size and may be patterns if possible?
or add a complex Example which covers all.
will you be including ITs in this PR or other PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will add one with groupBy and ITs. I don't know if we should put patterns
here since this is the stats doc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we push down to OS, what is the default sort applied? Will it be consistent with our in-memory implementation? |
|
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstExpressionBuilder.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Joshua Li <[email protected]>
Signed-off-by: Joshua Li <[email protected]>
Signed-off-by: Joshua Li <[email protected]>
...earch/src/main/java/org/opensearch/sql/opensearch/data/value/OpenSearchExprValueFactory.java
Show resolved
Hide resolved
...a/org/opensearch/sql/opensearch/storage/script/aggregation/dsl/MetricAggregationBuilder.java
Show resolved
Hide resolved
Signed-off-by: Joshua Li <[email protected]>
Signed-off-by: Joshua Li <[email protected]>
Description
take(field)
returns some documents of a field per eachstats
response. It is implemented using top hits metrics aggregation when pushed down to opensearch. In memory it works similar tohead
command.A sample use case would be aggregating logs by a category,
take
allows user to see a sample log per each category:stats take(log_string, 1) by category
.This returns a list of patterns with the count and one sample message.
Currently
take
only supports text fields. it's not very useful on other types, and array implementation might need to be improved. Support for other types can be extended in the future if needed.Issues Resolved
Linking issue: opensearch-project/observability#1172
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.