fix(sdk): consistent labels for otel.bsp.dropped_spans metric#2108
fix(sdk): consistent labels for otel.bsp.dropped_spans metric#2108robbkidd wants to merge 2 commits intoopen-telemetry:mainfrom
Conversation
report_dropped_spans is called from multiple code paths, some of which pass a function name and some of which do not. The .compact on the labels hash silently drops the code.function key when nil, producing an inconsistent label signature that breaks metrics reporters like prometheus which require a stable key set. Demonstrates the problem reported in open-telemetry#2040
report_dropped_spans() is used in several of places and not all of them include a function name in the call. The .compact here was removing nils from the labels hash. This produces inconsistent label sets for the 'otel.bsp.dropped_spans' metric. And that makes metrics reporters like prometheus-client unhappy because they require a stable set of label names. We'll leave how to handle nil values to concrete implementations of MetricsReporter instead.
|
What is nil? Code function? Should the default be a blank string instead of nil in the method signature? Should the method be defensive and ensure all labels are valid? Otherwise we are relying on the metrics reporter to have to be defensive as well. |
Yes. There are some call-sites that do not pass in
I don't know. Our metrics reporter is off-spec, so the handling of nil vs blank string has historically been left to the implementer of a concrete reporter.
Valid according to whose rules? Because our current metrics reporter interface is off-spec, someone wanting to use it decides what to do about labels and validity depend on where those labels are getting sent.
Yes, this change places the onus of defense on the concrete implementation of a metrics reporter that the SDK does not (to my knowledge) yet supply. I'm currently thinking that these questions could be addressed in followup issues and PRs so that the poor experience for concrete implementations choosing to send to prometheus today is improved. |
Fixes #2040
report_dropped_spans()is used in several of places and not all of them include afunctionname in the call. The.compacton the labels hash withinreport_dropped_spans()was removingnils. This produces inconsistent label sets for the'otel.bsp.dropped_spans'metric. And that makes metrics reporters like prometheus-client unhappy because they require a stable set of label names.This change leaves how to handle
nil-value labels to concrete implementations ofMetricsReporterinstead.