[ES][v2] Embed `tagDotReplacement` in `ToDBModel` #6946

Manik2708 · 2025-03-30T13:58:03Z

Which problem is this PR solving?

Fixes a part of: Upgrade Storage Backends to V2 Storage API #6458

Description of the changes

Embed tagDotReplacement in ToDBModel

How was this change tested?

Unit Tests

Checklist

I have read https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md
I have signed all commits
I have added unit tests for the new functionality
I have run lint and test steps successfully
- for jaeger: make lint test
- for jaeger-ui: npm run lint and npm run test

codecov · 2025-03-30T14:05:00Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.12%. Comparing base (bc586e3) to head (3ad31dc).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #6946      +/-   ##
==========================================
+ Coverage   95.95%   96.12%   +0.17%     
==========================================
  Files         346      347       +1     
  Lines       20407    20378      -29     
==========================================
+ Hits        19581    19588       +7     
+ Misses        622      595      -27     
+ Partials      204      195       -9

Flag	Coverage Δ
badger_v1	`10.53% <ø> (ø)`
badger_v2	`2.18% <ø> (ø)`
cassandra-4.x-v1-manual	`15.84% <ø> (ø)`
cassandra-4.x-v2-auto	`2.17% <ø> (ø)`
cassandra-4.x-v2-manual	`2.17% <ø> (ø)`
cassandra-5.x-v1-manual	`15.84% <ø> (ø)`
cassandra-5.x-v2-auto	`2.17% <ø> (ø)`
cassandra-5.x-v2-manual	`2.17% <ø> (ø)`
elasticsearch-6.x-v1	`20.76% <ø> (ø)`
elasticsearch-7.x-v1	`20.84% <ø> (ø)`
elasticsearch-8.x-v1	`21.02% <ø> (ø)`
elasticsearch-8.x-v2	`2.18% <ø> (-0.12%)`	⬇️
grpc_v1	`11.61% <ø> (ø)`
grpc_v2	`8.47% <ø> (ø)`
kafka-3.x-v1	`10.82% <ø> (ø)`
kafka-3.x-v2	`2.18% <ø> (ø)`
memory_v2	`2.18% <ø> (ø)`
opensearch-1.x-v1	`20.89% <ø> (ø)`
opensearch-2.x-v1	`20.89% <ø> (ø)`
opensearch-2.x-v2	`2.18% <ø> (ø)`
tailsampling-processor	`0.59% <ø> (ø)`
unittests	`94.91% <100.00%> (+0.16%)`	⬆️

Manik2708 · 2025-03-30T14:07:29Z

internal/storage/v2/elasticsearch/tracestore/to_dbmodel.go

 	}
 	return dest
 }

-func attributeToDbTag(key string, attr pcommon.Value) dbmodel.KeyValue {


Moved to tagAppender.go

Manik2708 · 2025-03-30T14:08:46Z

internal/storage/v2/elasticsearch/tracestore/to_dbmodel.go

 		Logs:          spanEventsToDbSpanLogs(span.Events()),
 		Process:       process,
 	}
 }

-func getDbSpanTags(span ptrace.Span, scope pcommon.InstrumentationScope) []dbmodel.KeyValue {
-	var spanKindTag, statusCodeTag, statusMsgTag dbmodel.KeyValue


The reason for this change is to clean up the code a bit, and also to embed tagDotReplacement.

Manik2708 · 2025-03-30T14:09:47Z

internal/storage/v2/elasticsearch/tracestore/to_dbmodel.go

-	}, true
-}
-
-func getTagFromStatusCode(statusCode ptrace.StatusCode) (dbmodel.KeyValue, bool) {


The code from L282-L347 is now part of the tag appender.

Manik2708 · 2025-03-30T14:10:02Z

internal/storage/v2/elasticsearch/tracestore/to_dbmodel.go

@@ -255,96 +219,6 @@ func spanEventsToDbSpanLogs(events ptrace.SpanEventSlice) []dbmodel.Log {
 	return logs
 }

-func getTagFromSpanKind(spanKind ptrace.SpanKind) (dbmodel.KeyValue, bool) {


This has also moved to tagAppender.go.

Manik2708 · 2025-03-30T14:12:11Z

@yurishkuro This is an attempt to separate tag appending from ToDBModel and to embed tagDotReplacement in ToDBModel. Please review!

internal/storage/v2/elasticsearch/tracestore/tag_appender.go

yurishkuro

this is the example where snapshot testing would be very helpful, maybe spend a day to build those tests first.

Manik2708 · 2025-03-31T06:29:32Z

> this is the example where snapshot testing would be very helpful, maybe spend a day to build those tests first.

@yurishkuro I am able to generate or reuse the DB span fixtures, but I don't know how to serialize and deserialize ptrace from JSON fixtures. Is there anything provided by OTEL, or do we need to use the txt fixtures that are used in goldendataset?
I have tested the existing fixture in the following way:

func TestToDbModel_Fixtures(t *testing.T) {
	// Load the existing ES span fixture and decode it into the DB model.
	dbStr, err := os.ReadFile("../../../elasticsearch/dbmodel/fixtures/es_01.json")
	require.NoError(t, err)
	var span dbmodel.Span
	err = json.Unmarshal(dbStr, &span)
	require.NoError(t, err)
	// Round trip: DB span -> ptrace -> DB span should give back the original.
	td, err := FromDBModel([]dbmodel.Span{span})
	require.NoError(t, err)
	spans := ToDBModel(td)
	assert.Equal(t, span, spans[0])
}
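
For reference, the snapshot-style comparison discussed here can be sketched as a small helper that serializes the converted value and compares it byte-for-byte against a stored fixture. This is only an illustration: `snapshotMatches` is a hypothetical name, not something in the Jaeger codebase, and it assumes the fixture was written with two-space indentation.

```go
package main

import (
	"bytes"
	"encoding/json"
)

// snapshotMatches is a hypothetical helper (not part of Jaeger).
// It serializes actual as indented JSON and reports whether the result
// equals the golden fixture bytes, ignoring leading/trailing whitespace.
// Assumption: the fixture uses the same two-space indentation.
func snapshotMatches(actual any, golden []byte) (bool, error) {
	got, err := json.MarshalIndent(actual, "", "  ")
	if err != nil {
		return false, err
	}
	return bytes.Equal(bytes.TrimSpace(got), bytes.TrimSpace(golden)), nil
}
```

Comparing serialized bytes rather than in-memory objects avoids having to deep-copy the input when the converter under test mutates it.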

Manik2708 · 2025-04-04T09:35:31Z

internal/storage/v2/elasticsearch/tracestore/to_dbmodel_test.go

+			expectedTd, err := unmarshaller.UnmarshalTraces(tracesData)
+			require.NoError(t, err)
+			dotReplacement := "#"
+			toDb := NewToDBModel(tt.allTagsAsFields, tt.tagKeysAsFields, dotReplacement)


I have skipped the round trip for now because tagDotReplacement in FromDBModel will be implemented in the next PR.

Manik2708 · 2025-04-04T15:57:27Z

@yurishkuro Please review this PR. After it, a similar modification will be made in FromDBModel and we will get rid of es_01.json.

yurishkuro · 2025-04-05T15:36:54Z

internal/storage/v2/elasticsearch/tracestore/to_dbmodel_test.go

+			tracesData := loadTraces(t, 1)
+			unmarshaller := ptrace.JSONUnmarshaler{}
+			expectedTd, err := unmarshaller.UnmarshalTraces(tracesData)
+			require.NoError(t, err)


if you have a function called loadTraces why doesn't it unmarshal and return ptrace model directly?

I tried doing this, but I think it is better to return []byte rather than unmarshalling, because we are asserting on bytes, not on objects. Asserting on objects adds complexity: we would need to pass a copy of the spans to FromModel because it might mutate the span (in fact, it does mutate it). So loadTraces is part of loadFixtures, for which we need the byte data, not the objects.

Manik2708 · 2025-04-06T19:33:27Z

@yurishkuro Can you please review this PR as it is a blocker for the next PR (embedding tagDotReplacement in FromDBModel)

yurishkuro · 2025-04-06T20:07:31Z

internal/storage/v2/elasticsearch/tracestore/to_dbmodel_test.go

+	toDb := newToDBModel(false, nil, ".")
+	modelSpan := toDb.spanToDbSpan(span, spanScope, dbmodel.Process{})


why is this calling private methods instead of the main entry point?

yurishkuro · 2025-04-06T20:14:15Z

internal/storage/v2/elasticsearch/tracestore/tag_appender.go

+	"github.com/jaegertracing/jaeger/internal/storage/elasticsearch/dbmodel"
+)
+
+// tagAppender append tags to dbmodel KeyValue slice and tagsMap by replacing dots with tagDotReplacement


first, this comment is practically useless, it tells what the actual implementation code is doing (which I can see by looking at the code) instead of telling what business function it performs and why.

second, the struct is doing way more than that, which is why I asked you to write a definition in the comment. Its primary objective is to convert tags into top-level object fields in ES span, so that they get indexed more efficiently (that is what the definition is supposed to say). This objective has nothing to do with understanding all the different flavors of the tags like span kind, status code, etc. - all of that is meant to be handled by the transformer. In the v1 code the whole concept of "tag appender" used to be like a single function (it wasn't even a struct), so why is it necessary to mix it up with so much unrelated functionality?

Got your point about mixing in span kind, status code, etc., but the difference between v1 and v2 is that v2 has more conversion logic. Initially I also thought of creating a method on toDBModel, but then we would have to pass map[string]any and []dbmodel.KeyValue every time we call the method, along with the key and value. This is what happens in v1, but in v2 we would have to call this method more frequently. For example:

tag := make(map[string]any)
tags := make([]dbmodel.KeyValue, 0)
kindStr := convertToStringSpanKind(span.Kind())
if kindStr != "" {
	t.appendTag(model.SpanKindKey, pcommon.NewValueStr(kindStr), tag, tags)
}
// Similar with other tags
return tag, tags

To avoid passing these parameters repeatedly, I wrapped them inside a struct. We could also go the v1 way; I would need your suggestion here.

Code that serves different problems needs to be kept separately. You already have implementation of translating tags / attributes between OTLP and DB, which is part of the overall converter - logical organization, why change that? Separately there's an additional capability that can materialize nested tags into top-level tags in ES object, for improved indexing in ES. That's a separate functionality that doesn't need to know how the tags are translated between OTLP and DB, it just needs to apply after that translation and materialize some of the tags

We can separate tag conversion from materialization once the confusion described in #6946 (comment) is resolved. If we go with accepting any value (which might lead to "1234" instead of 1234 when spans are converted to OTEL traces), then we can do it. If not (meaning we have to follow the v1 constraint), then we have to materialize along with conversion; otherwise every value in the tag map will be a string, or we will need an extra conversion to restore the original types.
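
The separation proposed in this thread (translate tags first, then materialize some of them into top-level fields) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the PR's code: `materializeTags` is a hypothetical name, `KeyValue` is a simplified local copy of the DB model struct, and it assumes `Value` already holds the raw (non-string) value.

```go
package main

import "strings"

// KeyValue is a simplified stand-in for dbmodel.KeyValue
// (the Type field is a plain string here).
type KeyValue struct {
	Key   string
	Type  string
	Value any
}

// materializeTags is a hypothetical helper. It takes tags that the
// translator has already converted and splits them into nested tags
// (returned as a slice) and top-level fields (returned as a map whose
// keys have "." replaced by dotReplacement). It does not need to know
// how span kind, status code, etc. were turned into tags.
func materializeTags(
	tags []KeyValue,
	allAsFields bool,
	keysAsFields map[string]bool,
	dotReplacement string,
) ([]KeyValue, map[string]any) {
	var nested []KeyValue
	fields := make(map[string]any)
	for _, kv := range tags {
		if allAsFields || keysAsFields[kv.Key] {
			fields[strings.ReplaceAll(kv.Key, ".", dotReplacement)] = kv.Value
			continue
		}
		nested = append(nested, kv)
	}
	return nested, fields
}
```

With this shape, the translator would call the helper once per span (and once per process) after all tags are built, so the map and slice do not have to be threaded through every conversion step.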

yurishkuro · 2025-04-06T20:15:53Z

internal/storage/v2/elasticsearch/tracestore/fixtures/es_02.json

@@ -0,0 +1,82 @@
+{


the convention in the fixtures is that es_01 corresponds to otel_01, es_02 to otel_02, etc. Don't break this convention by introducing arbitrary naming scheme. If you have two different flavors of ES span (e.g. one with materialized fields another without) then they should still be named es_01_{suffix}

yurishkuro · 2025-04-06T20:17:00Z

internal/storage/v2/elasticsearch/tracestore/to_dbmodel.go

 // Returns slice of translated DB Spans and error if translation failed.
-func ToDBModel(td ptrace.Traces) []dbmodel.Span {
+func (t toDBModel) convertToDBModel(td ptrace.Traces) []dbmodel.Span {


Suggested change

-func (t toDBModel) convertToDBModel(td ptrace.Traces) []dbmodel.Span {
+func (t *toDBModel) convertToDBModel(td ptrace.Traces) []dbmodel.Span {

yurishkuro · 2025-04-06T20:17:18Z

internal/storage/v2/elasticsearch/tracestore/to_dbmodel.go

+func newToDBModel(allTagsAsObject bool, tagKeysAsFields map[string]bool, tagDotReplacement string) toDBModel {
+	return toDBModel{


Suggested change

-func newToDBModel(allTagsAsObject bool, tagKeysAsFields map[string]bool, tagDotReplacement string) toDBModel {
-	return toDBModel{
+func newToDBModel(allTagsAsObject bool, tagKeysAsFields map[string]bool, tagDotReplacement string) *toDBModel {
+	return &toDBModel{

yurishkuro · 2025-04-06T20:18:14Z

internal/storage/v2/elasticsearch/tracestore/to_dbmodel.go

 	}
 	return dest
 }

-func attributeToDbTag(key string, attr pcommon.Value) dbmodel.KeyValue {


yurishkuro · 2025-04-06T20:19:53Z

internal/storage/v2/elasticsearch/tracestore/to_dbmodel.go

@@ -255,96 +219,6 @@ func spanEventsToDbSpanLogs(events ptrace.SpanEventSlice) []dbmodel.Log {
 	return logs
 }

-func getTagFromSpanKind(spanKind ptrace.SpanKind) (dbmodel.KeyValue, bool) {


none of these changes are necessary or beneficial. Tag appender (bad name) should have just a single function - take a collection of already converted tags (converted in this translator) and materialize all/some of them to the top-level fields.

Manik2708 · 2025-04-06T20:46:12Z

#6946 (comment) @yurishkuro There is a problem with this approach. If we first convert all the tags to DB tags and then materialize them to top-level fields, the value in dbmodel.KeyValue is always stored as a string (this is what v1 does), whereas in the tag map it can be anything. For example, 25 is stored as "25" in tags but as 25 in the tag map (I did this here because we don't want to change the DB-level span in v2). With this approach we would need to convert the strings back to their original values before putting them into the tag map.

yurishkuro · 2025-04-06T20:52:13Z

in the DB model we have Tags []KeyValue and

type KeyValue struct {
        Key   string    `json:"key"`
        Type  ValueType `json:"type,omitempty"`
        Value any       `json:"value"`
}

The Value here is not string, similar to Tag map[string]any. So I don't see a discrepancy - materializing Tags -> Tag should be lossless.
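
To make the difference concrete, here is a small sketch of how the two choices serialize. It uses a local copy of the struct quoted above with a simplified `ValueType`; `encodeTag` is a hypothetical helper for illustration only, and nothing here models Elasticsearch itself.

```go
package main

import "encoding/json"

// ValueType is a simplified stand-in for the DB model's value type.
type ValueType string

// KeyValue is a local copy of the struct quoted above.
type KeyValue struct {
	Key   string    `json:"key"`
	Type  ValueType `json:"type,omitempty"`
	Value any       `json:"value"`
}

// encodeTag is a hypothetical helper that returns the JSON form of a tag.
func encodeTag(kv KeyValue) (string, error) {
	b, err := json.Marshal(kv)
	return string(b), err
}
```

Encoding `KeyValue{Key: "retries", Type: "int64", Value: int64(25)}` yields `"value":25`, while passing the string `"25"` as the value yields `"value":"25"`. Only the former keeps the number as a JSON number.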

Manik2708 · 2025-04-06T21:01:24Z

> in the DB model we have Tags []KeyValue and
>
> type KeyValue struct {
>         Key   string    `json:"key"`
>         Type  ValueType `json:"type,omitempty"`
>         Value any       `json:"value"`
> }
>
> The Value here is not string, similar to Tag map[string]any. So I don't see a discrepancy - materializing Tags -> Tag should be lossless.

@yurishkuro That is exactly what I thought too, but in v1 every value is a string in the snapshot tests. Please also see this:

jaeger/internal/storage/v1/elasticsearch/spanstore/from_domain.go

Line 125 in 81c9ed9

Value: kv.AsString(),

Every value stored in a DB tag is a string, but in the tag map it is stored like this:

jaeger/internal/storage/v1/elasticsearch/spanstore/from_domain.go

Line 86 in 81c9ed9

tagsMap[strings.ReplaceAll(kv.Key, ".", fd.tagDotReplacement)] = kv.Value()

So how should we proceed in v2? We also have to think about backward compatibility: when converting DB spans back to OTEL traces, we have to consider whether the value is a string or any.

yurishkuro · 2025-04-06T22:47:39Z

I don't know why v1 converter uses AsString(), I believe I even added a TODO asking about that specifically. But we don't have to blindly replicate v1 converter behavior, I think it should be capturing Value() regardless of where the tag is stored. It's especially important for numeric fields as storing raw value means ES queries can be made against such numeric field, e.g. computing some stats.

The only price we'd pay for using Value() is that the reverse translation (db->otlp) will have to be able to deal with the stored value being either a raw value or a string.

BTW, using Value() directly is not always correct because if the type is Binary it should be encoded somehow. There may be other limitations of using raw values, e.g. whole numbers in JS are limited to 53 bits, so we already have some special handling for those when returning to UI to avoid losing precision.
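
As a rough illustration of what the reverse translation (db->otlp) would need for an integer tag, the sketch below accepts either a raw value or a string. `valueAsInt64` is a hypothetical helper, not existing Jaeger code. It assumes the stored value arrives as one of the Go types that `encoding/json` can produce (`float64` by default, `json.Number` when the decoder uses `UseNumber`), or as a legacy string.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
	"strconv"
)

// maxSafeFloatInt is 2^53. Whole numbers up to this magnitude are
// exactly representable as float64.
const maxSafeFloatInt = 1 << 53

// valueAsInt64 is a hypothetical helper that reads an integer tag value
// stored either as a raw number or as a string (the v1 behavior).
func valueAsInt64(v any) (int64, error) {
	switch x := v.(type) {
	case int64:
		return x, nil
	case json.Number:
		// Produced when decoding with json.Decoder.UseNumber; keeps full precision.
		return x.Int64()
	case float64:
		// Default type for JSON numbers; reject fractions and values
		// beyond the exactly representable integer range.
		if x != math.Trunc(x) || math.Abs(x) > maxSafeFloatInt {
			return 0, fmt.Errorf("float %v is not a safely representable integer", x)
		}
		return int64(x), nil
	case string:
		return strconv.ParseInt(x, 10, 64)
	default:
		return 0, fmt.Errorf("unsupported tag value type %T", v)
	}
}
```

Decoding with `UseNumber` is one way to sidestep the precision limit for large integers, at the cost of handling `json.Number` explicitly as shown.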

Manik2708 requested a review from a team as a code owner March 30, 2025 13:58

Manik2708 requested a review from jkowall March 30, 2025 13:58

dosubot bot added storage/elasticsearch v2 labels Mar 30, 2025

Manik2708 added 2 commits April 4, 2025 12:41

conflicts (4bc50f1)
snapshots (0da23a7)

Manik2708 force-pushed the tag branch from ac32943 to 0da23a7 April 4, 2025 08:19

Manik2708 added 2 commits April 4, 2025 13:52

docs (da0a958)
cleanup (5bb4cab)

Manik2708 requested a review from yurishkuro April 4, 2025 08:56

Manik2708 added 2 commits April 4, 2025 18:46

cleanup (b06591a)
test fix (04a8089)

Manik2708 added 2 commits April 6, 2025 02:19

cleanup (50e1d6b)
Merge branch 'main' into tag (3ad31dc)

Manik2708 requested a review from yurishkuro April 5, 2025 21:17

Manik2708 mentioned this pull request Apr 7, 2025

[ES][v2] Change the DB Tag value from string to any type #6994

Open

4 tasks
