Enforce last-value-wins semantics in AttributesMap without performance regression by EvgeniiR · Pull Request #8548 · open-telemetry/opentelemetry-java

EvgeniiR · 2026-06-26T21:59:32Z

Fixes the Issue #7897

TLDR; Current AttributesMap does not fully compy with that part of spec:

“Setting an attribute with the same key as an existing attribute SHOULD overwrite the existing attribute’s value.”
https://opentelemetry.io/docs/specs/otel/trace/api/#set-attributes

Problem

AttributesMap extended HashMap<AttributeKey<?>, Object>, using AttributeKey as the map
key. Because AttributeKey.equals() includes the attribute type, two attributes with the same
name but different types (e.g. stringKey("http.method") and longKey("http.method")) were
stored as separate entries.
This violated the OpenTelemetry specification, which requires that
attribute name alone determines identity — last write wins regardless of type.

Solution

Correctness fix

Replace the HashMap<AttributeKey<?>, Object> backing store with a string-keyed map so that
put("http.method", String) followed by put("http.method", Long) results in exactly one entry
(the Long value), consuming only one capacity slot.

The first implementations were based on HashMap/LinkedHashMap (2nd performed better once forEach was included in the benchmarks), but they introduced another issue — the fixed AttributesMap performed ~40-80% worse than baseline.
Therefore, I started looking for a better solution that would preserve the required last-value-wins semantics without introducing a performance regression.

LinkedHashMap implementation can be observed in the previous commit — https://github.com/EvgeniiR/opentelemetry-java/blob/d7df58af76e693aa1fe897d2757e2bdb50ab9798/sdk/common/src/main/java/io/opentelemetry/sdk/common/internal/AttributesMap.java

Final solution is described below.

Performance optimization

Instead of LinkedHashMap<String, AttributeEntry>, use parallel arrays with an open-addressing
int[] hash table (linear probing, load factor ≤ 0.5):

int[]           hashTable   — slot → entryIndex+1 (0 = empty, JVM zero-init)
String[]        entryNames  — for hash lookup
AttributeKey[]  entryKeys   — for get() type check and forEach()
Object[]        entryValues — attribute values

forEach becomes a tight sequential array loop with no pointer chasing, directly benefiting the export.

Benchmark results (avgt ns/op, lower is better)

putThenForEach — N unique puts + 1 forEach (dominant production path)

n	baseline	LHM fix	PA	PA vs base	PA vs LHM
4	35.8	50.2	32.6	−9%	−35%
16	133.2	231.9	128.2	−4%	−45%
20	151.8	275.6	204.7	+35%*	−26%
32	292.6	486.2	297.0	+2%	−39%
128	1281.9	2053.3	1861.0	+45%	−9%

* This is a worst-case for the new implementation. Performance for spans with 16–32 attributes could be improved by raising the initial array size from 16 to 32, at the cost of extra memory per map. I'm not sure where the right tradeoff sits. Currently our implementation uses less memory than the baseline for most cases:

AttributesMapBenchmark — putThenForEach memory allocation (gc.alloc.rate.norm, B/op, lower is better)

n	baseline	PA fix	delta
4	272	144	−47%
16	800	408	−49%
20	928	992	+7%
32	1584	1136	−28%
128	6224	5104	−18%

PA matches the baseline at n=4 and n=16 while being spec-correct. Given that, I think that further performance optimizations(besides may be changing init size) are out of scope of this PR.

…ixes open-telemetry#7897) AttributesMap previously extended HashMap<AttributeKey<?>, Object>, where AttributeKey.equals() includes the AttributeType. This caused attributes with the same string name but different types to coexist as separate entries, violating the OTel spec last-value-wins rule. Replace the HashMap backing with LinkedHashMap<String, AttributeEntry> keyed by raw attribute name. Overwrites with a different type now update the existing entry in place, so size() stays correct and capacity limits are not consumed. Also eliminates the double hash-probe in put() (containsKey + get → single get).

…n AttributesMap

codecov · 2026-06-26T22:12:22Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.99%. Comparing base (8f41449) to head (5615eac).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #8548      +/-   ##
============================================
+ Coverage     90.96%   90.99%   +0.03%     
- Complexity    10206    10227      +21     
============================================
  Files          1013     1013              
  Lines         27166    27227      +61     
  Branches       3182     3190       +8     
============================================
+ Hits          24712    24776      +64     
+ Misses         1730     1729       -1     
+ Partials        724      722       -2

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

EvgeniiR · 2026-06-26T22:15:28Z

Benchmark results ... putThenForEach — N unique puts + 1 forEach

This PR also includes benchmarks for individual read/write operations (uniqueKeys, getHit, getTypeMiss, forEachAll). The project already has SpanRecordBenchmark and FillSpanBenchmark that cover the write path at the span level. I will publish some additional benchmarks under cut, if for some reason you want to check them out too.

More benchmarks

AttributesMapBenchmark — write (avg ns/op, lower is better)

Benchmark	n	baseline	LHM fix	PA	PA vs base	PA vs LHM
uniqueKeys	4	41.8	48.9	38.2	−3.6 (−9%)	−10.7 (−22%)
uniqueKeys	16	134.9	231.0	134.3	−0.6 (0%)	−96.7 (−42%)
uniqueKeys	128	1309.7	2199.7	1950.2	+640.5 (+49%)	−249.5 (−11%)
sameKeySameType	4	46.1	33.7	35.0	−11.1 (−24%)	+1.3 (+4%)
sameKeySameType	16	164.5	105.4	124.6	−39.9 (−24%)	+19.2 (+18%)
sameKeySameType	128	1320.7	659.1	862.5	−458.2 (−35%)	+203.4 (+31%)
sameKeyDifferentType	4	32.1	42.9	42.9	+10.8 (+34%)	0.0 (0%)
sameKeyDifferentType	16	181.6	146.5	156.6	−25.0 (−14%)	+10.1 (+7%)
sameKeyDifferentType	128	1407.2	1073.0	1183.7	−223.5 (−16%)	+110.7 (+10%)
mixedUniqueAndOverwrite	4	32.1	43.2	44.4	+12.3 (+38%)	+1.2 (+3%)
mixedUniqueAndOverwrite	16	134.0	155.7	154.5	+20.5 (+15%)	−1.2 (−1%)
mixedUniqueAndOverwrite	128	1400.7	1531.9	1721.5	+320.8 (+23%)	+189.6 (+12%)

AttributesMapBenchmark — read (avg ns/op, lower is better)

Benchmark	n	baseline	LHM fix	PA	PA vs base	PA vs LHM
getHit	4	10.9	15.8	18.1	+7.2 (+66%)	+2.3 (+15%)
getHit	16	41.7	58.3	73.7	+32.0 (+77%)	+15.4 (+26%)
getHit	128	350.6	476.6	916.9	+566.3 (+161%)	+440.3 (+92%)
getTypeMiss	4	8.7	23.9	24.7	+16.0 (+184%)	+0.8 (+3%)
getTypeMiss	16	35.0	96.9	97.3	+62.3 (+178%)	+0.4 (+0%)
getTypeMiss	128	283.3	856.9	1136.7	+853.4 (+301%)	+279.8 (+33%)
forEachAll	4	11.1	5.8	2.9	−8.2 (−74%)	−2.9 (−50%)
forEachAll	16	26.0	25.3	8.1	−17.9 (−69%)	−17.2 (−68%)
forEachAll	128	185.5	229.9	46.8	−138.7 (−75%)	−183.1 (−80%)

FillSpanBenchmark (ops/ms, higher is better)

Benchmark	baseline	LHM fix	PA	PA vs base	PA vs LHM
setFourAttributes	6160.0	5589.7	5197.7	−962.3 (−16%)	−392.0 (−7%)

SpanRecordBenchmark (ops/s, higher is better)

threads	SpanSize	baseline	LHM fix	PA	PA vs base	PA vs LHM
1	SMALL	6 043 137	6 079 646	5 334 831	−708 306 (−12%)	−744 815 (−12%)
1	MEDIUM	658 387	650 851	574 200	−84 187 (−13%)	−76 651 (−12%)
1	LARGE	65 240	62 313	64 207	−1 033 (−2%)	+1 894 (+3%)
4	SMALL	8 785 156	9 009 770	8 454 487	−330 669 (−4%)	−555 283 (−6%)
4	MEDIUM	1 535 077	1 697 175	1 463 172	−71 905 (−5%)	−234 003 (−14%)
4	LARGE	165 137	175 858	165 824	+687 (+0.4%)	−10 034 (−6%)

…exOutOfBounds` if someone put hundreds of millions elements into AttributesMap

jack-berg · 2026-07-01T19:36:52Z

This violated the OpenTelemetry specification, which requires that
attribute name alone determines identity — last write wins regardless of type.

Can you provide a link the portion of the spec you're referring to? Thanks.

EvgeniiR · 2026-07-01T20:18:33Z

@jack-berg Hi! I'm referring to this part of spec:

https://opentelemetry.io/docs/specs/otel/trace/api/#set-attributes

“Setting an attribute with the same key as an existing attribute SHOULD overwrite the existing attribute’s value.”

I'm not sure if you're asking because the PR description doesn't provide enough context, or if you think I might be misinterpreting the spec and solving a problem that doesn't actually exist. Happy to clarify either way.

I originally found this through the linked issue. After seeing unfinished implementation attempt and corresponding discussion, I decided to try and finish it.

Happy to hear your thoughts, thanks!

jack-berg · 2026-07-01T20:27:06Z

I'm not sure if you're asking because the PR description doesn't provide enough context, or if you think I might be misinterpreting the spec and solving a problem that doesn't actually exist. Happy to clarify either way.

The spec is big and occasionally contradictory. Always good to have a reference!

I originally found this through the linked issue.

Missing the linked issue

EvgeniiR · 2026-07-01T20:30:16Z

Missing the linked issue

That one (from the PR description):

Fixes #7897

Also previous implementation attempt from another developer — #8420

jack-berg · 2026-07-02T21:38:54Z

Hey so I've been thinking about this. AttributesMap exists because the default implementation of Attributes / AttributesBuilder is not limits aware. I think that the fact that its backed by HashMap is a function of convenience: we need a simple implementation that we can apply limits too as its being built up. I think performance is important, and a Map based implementation also has better put performance than the default array based implementations ImmutableKeyValuePairs / ArrayBackedAttributesBuilder. But I think this is coincidence, since I've never heard us telling people "prefer using Span.setAttribute because it uses a more performant map based implementation".

I've never loved the fact that AttributesMap exists. I'd rather have one implementation. And I think your changes to AttributesMap reinforces this because while it still tries to do some map things for performance, other parts of it are starting to look more like ImmutableKeyValuePairs / ArrayBackedAttributesBuilder. And so its got me thinking about whether we can evolve ImmutableKeyValuePairs / ArrayBackedAttributesBuilder to meet the limits requirements and rip out AttributesMap altogether.

I've got two prototypes on how this could work, besides yours which I'll call "Approach 1":

Approach 2: Make the internal classes ImmutableKeyValuePairs / ArrayBackedAttributesBuilder limits aware, and use gradle build tooling to make a copy of them into sdk/common. Public API surface area stays the same, and we have one source of truth for an Attributes implementation, but we publish two copies of it.
Approach 3: Extend opentelemetry-api with capabilities to enforce limits directly in AttributesBuilder: Attributes.builder(AttributeLimits). Delete AttributesMap and leverage new api enforced attribute limits in its place.

Relative strengths / weaknesses of the different approaches:

Dimension	PR #8548	Approach 2 (source copy)	Approach 3 (public API)
Public API added	None (patch)	None (patch)	`AttributeLimits` + `AttributeLimitsBuilder` + one method on `Attributes`
Code shape	Bespoke ~200-line hash table in `sdk/common/internal`. Self-contained.	One canonical class in `api.common` implementing both `AttributesBuilder` and `Attributes`, plus SDK-only helper methods. Duplicated onto classpath via Gradle Copy task.	One class in `api.common` implementing only `AttributesBuilder`. `AttributesMap` deleted. No build magic.
Typical spans (SMALL/MEDIUM, 0-10 attrs)	As submitted: −12 to −13% on `SpanRecord` (likely fixable)	Within noise (~0 to −6%)	Within noise (~0 to −6%)
Large spans (100 attrs)	−2% (O(1) put wins)	−36% (O(n^2) linear-scan dedup, or more memory for additional tracking that avoids the O(n^2))	−37% (O(n^2) linear-scan dedup, or more memory for additional tracking that avoids the O(n^2))
`forEach` throughput	Big win: −50% to −80% latency (tight array loop)	~baseline	~baseline
User benefit / extensibility	None (SDK-internal)	None (SDK-internal). Future limit fields require touching the copy source + filter awareness.	Users can create bounded builders themselves. Future limit fields added non-breakingly via AutoValue pattern.

Currently I lean towards Approach 3. I recognize that it probably seems somewhat orthogonal to the task you set out to solve. Like, isn't extending Attributes / AttributesBuilder a different scope / goal than making the existing SdkSpan, SdkLogRecordBuilder follow the spec W.R.T. "last write wins"? And maybe we do end up treating them separately. But the appeal of maintaining a separate Attributes implementation in AttributesMap goes away when its no longer a thin wrapper over HashMap, but rather a dedicated data structure with various low level nuances.

jack-berg · 2026-07-02T21:39:15Z

@open-telemetry/java-approvers PTAL at my message above and let me know if you have thoughts.

EvgeniiR added 2 commits June 26, 2026 23:41

perf: replace LinkedHashMap with parallel-array open-addressing map i…

c63f8d8

…n AttributesMap

EvgeniiR requested a review from a team as a code owner June 26, 2026 21:59

opentelemetry-pr-dashboard Bot mentioned this pull request Jun 26, 2026

Pull Request Dashboard #8439

Open

Add test for equals()

6442ca1

EvgeniiR changed the title ~~Enforce last value wins semantics clean in AttributesMap~~ Enforce last-value-wins semantics in AttributesMap without performance regression Jun 26, 2026

EvgeniiR added 3 commits June 27, 2026 01:12

Fix codestyle in test

7373807

Remove extra guard for int overflow - it is ok to just fail with `Ind…

1f94821

…exOutOfBounds` if someone put hundreds of millions elements into AttributesMap

A few small optimizations for AttributesMap

5615eac

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Enforce last-value-wins semantics in AttributesMap without performance regression#8548

Enforce last-value-wins semantics in AttributesMap without performance regression#8548
EvgeniiR wants to merge 6 commits into
open-telemetry:mainfrom
EvgeniiR:respect-last-value-wins-semantics-clean

EvgeniiR commented Jun 26, 2026 •

edited

Loading

Uh oh!

codecov Bot commented Jun 26, 2026 •

edited

Loading

Uh oh!

EvgeniiR commented Jun 26, 2026

AttributesMapBenchmark — write (avg ns/op, lower is better)

AttributesMapBenchmark — read (avg ns/op, lower is better)

FillSpanBenchmark (ops/ms, higher is better)

SpanRecordBenchmark (ops/s, higher is better)

Uh oh!

jack-berg commented Jul 1, 2026

Uh oh!

EvgeniiR commented Jul 1, 2026

Uh oh!

jack-berg commented Jul 1, 2026

Uh oh!

EvgeniiR commented Jul 1, 2026

Uh oh!

jack-berg commented Jul 2, 2026

Uh oh!

jack-berg commented Jul 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

EvgeniiR commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Solution

Correctness fix

Performance optimization

Benchmark results (avgt ns/op, lower is better)

putThenForEach — N unique puts + 1 forEach (dominant production path)

AttributesMapBenchmark — putThenForEach memory allocation (gc.alloc.rate.norm, B/op, lower is better)

Uh oh!

codecov Bot commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

EvgeniiR commented Jun 26, 2026

AttributesMapBenchmark — write (avg ns/op, lower is better)

AttributesMapBenchmark — read (avg ns/op, lower is better)

FillSpanBenchmark (ops/ms, higher is better)

SpanRecordBenchmark (ops/s, higher is better)

Uh oh!

jack-berg commented Jul 1, 2026

Uh oh!

EvgeniiR commented Jul 1, 2026

Uh oh!

jack-berg commented Jul 1, 2026

Uh oh!

EvgeniiR commented Jul 1, 2026

Uh oh!

jack-berg commented Jul 2, 2026

Uh oh!

jack-berg commented Jul 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

EvgeniiR commented Jun 26, 2026 •

edited

Loading

codecov Bot commented Jun 26, 2026 •

edited

Loading