-
Notifications
You must be signed in to change notification settings - Fork 2.7k
[pkg/ottl]: Add Sort converter #34283
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 4 commits
07215d1
04366f5
e2b90e1
73ecb89
dab7c86
2e36574
d43e95f
447df3e
028744e
1eb08f3
a0d9791
d676f1a
4568ca0
d166048
524e546
8f2acd1
a13bdbc
8c0e371
1bb3d86
2894981
9c22e38
660f4fc
3325267
0a9b019
b2f8062
6e0e66d
f2145f3
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
# Use this changelog template to create an entry for release notes. | ||
|
||
# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix' | ||
change_type: enhancement | ||
|
||
# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver) | ||
component: pkg/ottl | ||
|
||
# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`). | ||
note: Add `Sort` function to sort array to ascending order or descending order | ||
|
||
# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists. | ||
issues: [34200] | ||
|
||
# (Optional) One or more lines of additional information to render under the primary note. | ||
# These lines will be padded with 2 spaces and then inserted directly into the document. | ||
# Use pipe (|) for multiline entries. | ||
subtext: | ||
|
||
# If your change doesn't affect end users or the exported elements of any package, | ||
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label. | ||
# Optional: The change log or logs in which this entry should be included. | ||
# e.g. '[user]' or '[user, api]' | ||
# Include 'user' if the change is relevant to end users. | ||
# Include 'api' if there is a change to a library API. | ||
# Default: '[user]' | ||
change_logs: [user] |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -446,6 +446,7 @@ Available Converters: | |
- [Seconds](#seconds) | ||
- [SHA1](#sha1) | ||
- [SHA256](#sha256) | ||
- [Sort](#sort) | ||
- [SpanID](#spanid) | ||
- [Split](#split) | ||
- [String](#string) | ||
|
@@ -1208,6 +1209,30 @@ Examples: | |
|
||
**Note:** According to the National Institute of Standards and Technology (NIST), SHA256 is no longer a recommended hash function. It should be avoided except when required for compatibility. New uses should prefer FNV whenever possible. | ||
|
||
### Sort | ||
|
||
`Sort(target, Optional[order])` | ||
|
||
The `Sort` Converter sorts the `target` array in ascending or descending order. | ||
|
||
`target` is a `pcommon.Slice` type field. `order` is a string that must be one of `asc` or `desc`. The default `order` is `asc`. | ||
|
||
If elements in `target` are | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What are your thoughts on permanently converting the types of values in heterogeneous arrays vs. just converting the types for comparison? I think it would be fairly straightforward to only convert the types within the sort function and leave the actual type in the array. I can see arguments for both sides here, but want to make sure we're considering our options. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see changing the types permanently and sorting a list as two distinct intentions. Both are reasonable actions in data processing. Here, we provide better flexibility to users. If they want to change the types permanently, they can simply overwrite the attributes by using There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That makes sense, thanks. If we see changing the types and sorting as separate actions, then shouldn't we keep the original types of the array elements when sorting? Right now we do:
When I think it would be possible to do:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see your point. Yes, preserving the types may be more intuitive. I thought keeping the sort result as a string to make it clear that the comparison is done as a string. It sounds good to maintain the original type. |
||
|
||
- Integers: Returns a sorted array of integers. | ||
- Doubles: Returns a sorted array of doubles. | ||
- Integers and doubles: Converts to doubles and returns a sorted array of doubles. | ||
- Strings: Returns a sorted array of strings. | ||
- Booleans: Converts to strings and returns a sorted array of strings. | ||
- Integers, doubles, booleans, and strings: Converts to strings and returns a sorted array of strings. | ||
- Other types: Returns the `target` unchanged. | ||
|
||
Examples: | ||
|
||
- `Sort(attributes["device.tags"])` | ||
|
||
- `Sort(attributes["device.tags"], "desc")` | ||
|
||
### SpanID | ||
|
||
`SpanID(bytes)` | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,219 @@ | ||
// Copyright The OpenTelemetry Authors | ||
// SPDX-License-Identifier: Apache-2.0 | ||
|
||
package ottlfuncs // import "github.com/open-telemetry/opentelemetry-collector-contrib/pkg/ottl/ottlfuncs" | ||
|
||
import ( | ||
"cmp" | ||
"context" | ||
"fmt" | ||
"slices" | ||
"strconv" | ||
|
||
"go.opentelemetry.io/collector/pdata/pcommon" | ||
|
||
"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/ottl" | ||
) | ||
|
||
const ( | ||
sortAsc = "asc" | ||
sortDesc = "desc" | ||
) | ||
|
||
type SortArguments[K any] struct { | ||
Target ottl.Getter[K] | ||
Order ottl.Optional[string] | ||
} | ||
|
||
func NewSortFactory[K any]() ottl.Factory[K] { | ||
return ottl.NewFactory("Sort", &SortArguments[K]{}, createSortFunction[K]) | ||
} | ||
|
||
func createSortFunction[K any](_ ottl.FunctionContext, oArgs ottl.Arguments) (ottl.ExprFunc[K], error) { | ||
args, ok := oArgs.(*SortArguments[K]) | ||
|
||
if !ok { | ||
return nil, fmt.Errorf("SortFactory args must be of type *SortArguments[K]") | ||
} | ||
|
||
order := sortAsc | ||
if !args.Order.IsEmpty() { | ||
o := args.Order.Get() | ||
switch o { | ||
case sortAsc, sortDesc: | ||
order = o | ||
default: | ||
return nil, fmt.Errorf("invalid arguments: %s. Order should be either \"%s\" or \"%s\"", o, sortAsc, sortDesc) | ||
} | ||
} | ||
|
||
return Sort(args.Target, order) | ||
} | ||
|
||
func Sort[K any](target ottl.Getter[K], order string) (ottl.ExprFunc[K], error) { | ||
kaisecheng marked this conversation as resolved.
Show resolved
Hide resolved
|
||
return func(ctx context.Context, tCtx K) (any, error) { | ||
val, err := target.Get(ctx, tCtx) | ||
if err != nil { | ||
return nil, err | ||
} | ||
|
||
if order != sortAsc && order != sortDesc { | ||
return nil, fmt.Errorf("invalid arguments: %s. Order should be either \"%s\" or \"%s\"", order, sortAsc, sortDesc) | ||
} | ||
TylerHelmuth marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
switch v := val.(type) { | ||
case pcommon.Slice: | ||
evan-bradley marked this conversation as resolved.
Show resolved
Hide resolved
|
||
return sortSlice(v, order) | ||
case []any: | ||
// handle Sort([1,2,3]) | ||
slice := pcommon.NewValueSlice().SetEmptySlice() | ||
if err := slice.FromRaw(v); err != nil { | ||
return v, nil | ||
} | ||
return sortSlice(slice, order) | ||
case []string: | ||
// handle value from Split() | ||
kaisecheng marked this conversation as resolved.
Show resolved
Hide resolved
|
||
dup := makeCopy(v) | ||
return sortTypedSlice(dup, order), nil | ||
case []int64: | ||
dup := makeCopy(v) | ||
return sortTypedSlice(dup, order), nil | ||
case []float64: | ||
dup := makeCopy(v) | ||
return sortTypedSlice(dup, order), nil | ||
case []bool: | ||
var arr []string | ||
for _, b := range v { | ||
arr = append(arr, strconv.FormatBool(b)) | ||
} | ||
return sortTypedSlice(arr, order), nil | ||
default: | ||
return v, nil | ||
TylerHelmuth marked this conversation as resolved.
Show resolved
Hide resolved
|
||
} | ||
}, nil | ||
} | ||
|
||
// sortSlice sorts a pcommon.Slice based on the specified order. | ||
// It gets the common type for all elements in the slice and converts all elements to this common type, creating a new copy | ||
// Parameters: | ||
// - slice: The pcommon.Slice to be sorted | ||
// - order: The sort order. "asc" for ascending, "desc" for descending | ||
// | ||
// Returns: | ||
// - A sorted slice as []any or the original pcommon.Slice | ||
// - An error if an unsupported type is encountered | ||
func sortSlice(slice pcommon.Slice, order string) (any, error) { | ||
length := slice.Len() | ||
if length == 0 { | ||
return slice, nil | ||
} | ||
|
||
commonType, ok := findCommonValueType(slice) | ||
if !ok { | ||
return slice, nil | ||
} | ||
|
||
switch commonType { | ||
case pcommon.ValueTypeInt: | ||
arr := makeTypedCopy(length, func(idx int) int64 { | ||
return slice.At(idx).Int() | ||
}) | ||
return sortTypedSlice(arr, order), nil | ||
case pcommon.ValueTypeDouble: | ||
arr := makeTypedCopy(length, func(idx int) float64 { | ||
s := slice.At(idx) | ||
if s.Type() == pcommon.ValueTypeInt { | ||
return float64(s.Int()) | ||
} | ||
|
||
return s.Double() | ||
}) | ||
return sortTypedSlice(arr, order), nil | ||
case pcommon.ValueTypeStr: | ||
arr := makeTypedCopy(length, func(idx int) string { | ||
return slice.At(idx).AsString() | ||
}) | ||
return sortTypedSlice(arr, order), nil | ||
default: | ||
return nil, fmt.Errorf("sort with unsupported type: '%T'", commonType) | ||
} | ||
} | ||
|
||
type TargetType interface { | ||
~int64 | ~float64 | ~string | ||
} | ||
|
||
// findCommonValueType determines the most appropriate common type for all elements in a pcommon.Slice. | ||
// It returns two values: | ||
// - A pcommon.ValueType representing the desired common type for all elements. | ||
// Mixed Numeric types return ValueTypeDouble. Integer type returns ValueTypeInt. Double type returns ValueTypeDouble. | ||
// String, Bool, Empty and mixed of the mentioned types return ValueTypeStr, as they require string conversion for comparison. | ||
// - A boolean indicating whether a common type could be determined (true) or not (false). | ||
// returns false for ValueTypeMap, ValueTypeSlice and ValueTypeBytes. They are unsupported types for sort. | ||
func findCommonValueType(slice pcommon.Slice) (pcommon.ValueType, bool) { | ||
length := slice.Len() | ||
if length == 0 { | ||
return pcommon.ValueTypeEmpty, false | ||
} | ||
|
||
wantType := slice.At(0).Type() | ||
wantStr := false | ||
wantDouble := false | ||
|
||
for i := 0; i < length; i++ { | ||
value := slice.At(i) | ||
currType := value.Type() | ||
|
||
switch currType { | ||
case pcommon.ValueTypeInt: | ||
if wantType == pcommon.ValueTypeDouble { | ||
wantDouble = true | ||
} | ||
case pcommon.ValueTypeDouble: | ||
if wantType == pcommon.ValueTypeInt { | ||
wantDouble = true | ||
} | ||
case pcommon.ValueTypeStr, pcommon.ValueTypeBool, pcommon.ValueTypeEmpty: | ||
wantStr = true | ||
default: | ||
return pcommon.ValueTypeEmpty, false | ||
} | ||
} | ||
|
||
if wantStr { | ||
wantType = pcommon.ValueTypeStr | ||
} else if wantDouble { | ||
wantType = pcommon.ValueTypeDouble | ||
} | ||
|
||
return wantType, true | ||
} | ||
|
||
func makeTypedCopy[T TargetType](length int, converter func(idx int) T) []T { | ||
var arr []T | ||
for i := 0; i < length; i++ { | ||
arr = append(arr, converter(i)) | ||
} | ||
return arr | ||
} | ||
|
||
func makeCopy[T TargetType](src []T) []T { | ||
dup := make([]T, len(src)) | ||
copy(dup, src) | ||
return dup | ||
} | ||
|
||
func sortTypedSlice[T TargetType](arr []T, order string) []T { | ||
if len(arr) == 0 { | ||
return arr | ||
} | ||
|
||
slices.SortFunc(arr, func(a, b T) int { | ||
if order == sortDesc { | ||
return cmp.Compare(b, a) | ||
} | ||
return cmp.Compare(a, b) | ||
}) | ||
|
||
return arr | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a test for a mixed-type slice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a pretty complex function, I'd like to see more e2e tests that cover the different type scenarios it supports
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added more e2e test covering unit types and mixed types