-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ES][v2] Embed tagDotReplacement
in ToDBModel
#6946
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #6946 +/- ##
==========================================
+ Coverage 95.95% 96.12% +0.17%
==========================================
Files 346 347 +1
Lines 20407 20378 -29
==========================================
+ Hits 19581 19588 +7
+ Misses 622 595 -27
+ Partials 204 195 -9
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
} | ||
return dest | ||
} | ||
|
||
func attributeToDbTag(key string, attr pcommon.Value) dbmodel.KeyValue { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to tagAppender.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
Logs: spanEventsToDbSpanLogs(span.Events()), | ||
Process: process, | ||
} | ||
} | ||
|
||
func getDbSpanTags(span ptrace.Span, scope pcommon.InstrumentationScope) []dbmodel.KeyValue { | ||
var spanKindTag, statusCodeTag, statusMsgTag dbmodel.KeyValue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason to do this is a trial to clean the code a bit! Also to embed tagDotReplacement
}, true | ||
} | ||
|
||
func getTagFromStatusCode(statusCode ptrace.StatusCode) (dbmodel.KeyValue, bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This from L282-L347 is a part of tag appender now!
@@ -255,96 +219,6 @@ func spanEventsToDbSpanLogs(events ptrace.SpanEventSlice) []dbmodel.Log { | |||
return logs | |||
} | |||
|
|||
func getTagFromSpanKind(spanKind ptrace.SpanKind) (dbmodel.KeyValue, bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also moved tagAppender.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
none of these changes are necessary or beneficial. Tag appender (bad name) should have just a single function - take a collection of already converted tags (converted in this translator) and materialize all/some of them to the top-level fields.
@yurishkuro This is a try to separate tag appending from the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the example where snapshot testing would be very helpful, maybe spend a day to build those tests first.
@yurishkuro I am able to generate or reuse the db spans fixtures but I don't have any idea to serialize-deserialize ptrace from json fixtures. Is there anything provided by OTEL? Or we need to use txt fixtures which are used in goldendataset
|
Signed-off-by: Manik2708 <[email protected]>
Signed-off-by: Manik2708 <[email protected]>
Signed-off-by: Manik2708 <[email protected]>
Signed-off-by: Manik2708 <[email protected]>
expectedTd, err := unmarshaller.UnmarshalTraces(tracesData) | ||
require.NoError(t, err) | ||
dotReplacement := "#" | ||
toDb := NewToDBModel(tt.allTagsAsFields, tt.tagKeysAsFields, dotReplacement) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently I have skipped roundtrip because tagDotReplacement
in FromDBModel
will be implemented in next PR!
Signed-off-by: Manik2708 <[email protected]>
Signed-off-by: Manik2708 <[email protected]>
@yurishkuro Please review this PR, after this PR, similar modification will be done in |
tracesData := loadTraces(t, 1) | ||
unmarshaller := ptrace.JSONUnmarshaler{} | ||
expectedTd, err := unmarshaller.UnmarshalTraces(tracesData) | ||
require.NoError(t, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you have a function called loadTraces
why doesn't it unmarshal and return ptrace model directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried doing this but I think it is better to return []bytes
rather than marshalling because we are asserting bytes not the objects. Asserting objects needs more complexity as we need to pass a copy of spans to to FromModel
because it might manipulate the span (in fact it is manupilating). So loadTraces
is a part of loadFixtures
, for which we need byte data not the objects.
internal/storage/v2/elasticsearch/tracestore/to_dbmodel_test.go
Outdated
Show resolved
Hide resolved
Signed-off-by: Manik2708 <[email protected]>
@yurishkuro Can you please review this PR as it is a blocker for the next PR (embedding |
toDb := newToDBModel(false, nil, ".") | ||
modelSpan := toDb.spanToDbSpan(span, spanScope, dbmodel.Process{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this calling private methods instead of the main entry point?
"github.com/jaegertracing/jaeger/internal/storage/elasticsearch/dbmodel" | ||
) | ||
|
||
// tagAppender append tags to dbmodel KeyValue slice and tagsMap by replacing dots with tagDotReplacement |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
first, this comment is practically useless, it tells what the actual implementation code is doing (which I can see by looking at the code) instead of telling what business function it performs and why.
second, the struct is doing way more than that, which is why I asked you to write a definition in the comment. Its primary objective is to convert tags into top-level object fields in ES span, so that they get indexed more efficiently (that is what the definition is supposed to say). This objective has nothing to do with understanding all the different flavors of the tags like span kind, status code, etc. - all of that is meant to be handled by the transformer. In the v1 code the whole concept of "tag appender" used to be like a single function (it wasn't even a struct), so why is it necessary to mix it up with so much unrelated functionality?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got your point on mixing span kind, status code etc but the difference in v1 and v2 is that there is more conversion logic present in v2. I also thought initially to create a method of toDBModel
but then we had to pass map[string]any
and []dbmodel.KeyValue
everytime when we call the method along with key and value. This is what is happening in v1 but in v2 we had to use this method more frequently, take an example:
tag := make(map[string]any)
tags := make([]dbmodel.KeyValue)
kindStr := convertToStringSpanKind(span.Kind())
if kindStr != "" {
t.appendTag(model.SpanKindKey, pcommon.NewStrValue(kindStr), tag, tags)
}
// Similar with other tags
return tag, tags
To reduce this redundant passing of parameter, I wrapped it inside a struct. We could move to the v1 way also, would require your suggestion here!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code that serves different problems needs to be kept separately. You already have implementation of translating tags / attributes between OTLP and DB, which is part of the overall converter - logical organization, why change that? Separately there's an additional capability that can materialize nested tags into top-level tags in ES object, for improved indexing in ES. That's a separate functionality that doesn't need to know how the tags are translated between OTLP and DB, it just needs to apply after that translation and materialize some of the tags
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can seperate the tag conversion and materialization once the confusion stated in #6946 (comment) is resolved! If we have to move by accepting any
value (which might lead to "1234" instead of 1234 when spans are converted to OTEL traces) then we can do it but if not (means we have to follow the v1 constraint) then we have to differentiate that is we should be materializing along with conversion otherwise every value in tag map will be a string or we have to do an exra conversion to convert them to their original type!
@@ -0,0 +1,82 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the convention in the fixtures is that es_01 corresponds to otel_01, es_02 to otel_02, etc. Don't break this convention by introducing arbitrary naming scheme. If you have two different flavors of ES span (e.g. one with materialized fields another without) then they should still be named es_01_{suffix}
// Returns slice of translated DB Spans and error if translation failed. | ||
func ToDBModel(td ptrace.Traces) []dbmodel.Span { | ||
func (t toDBModel) convertToDBModel(td ptrace.Traces) []dbmodel.Span { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func (t toDBModel) convertToDBModel(td ptrace.Traces) []dbmodel.Span { | |
func (t *toDBModel) convertToDBModel(td ptrace.Traces) []dbmodel.Span { |
func newToDBModel(allTagsAsObject bool, tagKeysAsFields map[string]bool, tagDotReplacement string) toDBModel { | ||
return toDBModel{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func newToDBModel(allTagsAsObject bool, tagKeysAsFields map[string]bool, tagDotReplacement string) toDBModel { | |
return toDBModel{ | |
func newToDBModel(allTagsAsObject bool, tagKeysAsFields map[string]bool, tagDotReplacement string) *toDBModel { | |
return &toDBModel{ |
} | ||
return dest | ||
} | ||
|
||
func attributeToDbTag(key string, attr pcommon.Value) dbmodel.KeyValue { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
@@ -255,96 +219,6 @@ func spanEventsToDbSpanLogs(events ptrace.SpanEventSlice) []dbmodel.Log { | |||
return logs | |||
} | |||
|
|||
func getTagFromSpanKind(spanKind ptrace.SpanKind) (dbmodel.KeyValue, bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
none of these changes are necessary or beneficial. Tag appender (bad name) should have just a single function - take a collection of already converted tags (converted in this translator) and materialize all/some of them to the top-level fields.
#6946 (comment) @yurishkuro There is a problem in this approach, if we will firstly convert all the tags to db tag and then materialize them to top-level fields then the value of dbmodel.KeyValue is always put as string (this is what is being done in v1) whereas in tag map it can be anything. For example when 25 is saved in |
in the DB model we have
The Value here is not string, similar to |
@yurishkuro That exactly what even I thought but in v1, every value is string in snapshot tests, also please see this:
Every value which is stored in db tag is string but in tag map, it is like this:
So how should we proceed in v2? As we have to think of backward-compatibilty also as when converting db spans back to OTEL traces, we have to think whether the value is string or any! |
I don't know why v1 converter uses AsString(), I believe I even added a TODO asking about that specifically. But we don't have to blindly replicate v1 converter behavior, I think it should be capturing Value() regardless of where the tag is stored. It's especially important for numeric fields as storing raw value means ES queries can be made against such numeric field, e.g. computing some stats. The only price we'd pay for using Value() is that the reverse translation (db->otlp) will have to be able to deal with the stored value being either a raw value or a string. BTW, using Value() directly is not always correct because if the type is Binary it should be encoded somehow. There may be other limitations of using raw values, e.g. whole numbers in JS are limited to 53bits, so we already have some special handing for those when returning to UI to avoid losing precision. |
Which problem is this PR solving?
Description of the changes
tagDotReplacement
inToDBModel
How was this change tested?
Checklist
jaeger
:make lint test
jaeger-ui
:npm run lint
andnpm run test