feat(observability): introduce minimal tracing implementation#4105
feat(observability): introduce minimal tracing implementation#4105diegomarquezp wants to merge 16 commits intomainfrom
Conversation
This reverts commit b7b9e31.
Summary of ChangesHello @diegomarquezp, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the observability capabilities of GAX by integrating a robust tracing system based on OpenTelemetry. The design emphasizes extensibility and clear separation of concerns, allowing GAX to record detailed RPC lifecycle events as spans and attributes. It introduces a mechanism to automatically enrich these traces with contextual information, providing deeper insights into client-server interactions. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a new OpenTelemetry-based tracing system into GAX. Key changes include adding ApiTracerContext to provide endpoint information to tracers, enhancing EndpointContext to extract and store server addresses, and implementing TracingRecorder, TracingTracer, and TracingTracerFactory to integrate OpenTelemetry for operation and attempt spans. Review comments highlight the need to address thread-safety in TracingTracer by using ConcurrentHashMap for attributes, improve the robustness of EndpointContext's server address parsing using java.net.URL, remove a potentially problematic default implementation in TracingRecorder to enforce explicit parent span handling, standardize span naming conventions in TracingTracerFactory to align with OpenTelemetry, and correct the copyright year across several new files.
gax-java/gax/src/main/java/com/google/api/gax/tracing/TracingTracer.java
Outdated
Show resolved
Hide resolved
gax-java/gax/src/main/java/com/google/api/gax/tracing/TracingRecorder.java
Show resolved
Hide resolved
gax-java/gax/src/main/java/com/google/api/gax/tracing/OpenTelemetryTracingRecorder.java
Show resolved
Hide resolved
gax-java/gax/src/main/java/com/google/api/gax/tracing/TracingRecorder.java
Outdated
Show resolved
Hide resolved
gax-java/gax/src/main/java/com/google/api/gax/tracing/TracingTracerFactory.java
Show resolved
Hide resolved
gax-java/gax/src/test/java/com/google/api/gax/tracing/OpenTelemetryTracingRecorderTest.java
Show resolved
Hide resolved
gax-java/gax/src/test/java/com/google/api/gax/tracing/TracingTracerFactoryTest.java
Show resolved
Hide resolved
gax-java/gax/src/test/java/com/google/api/gax/tracing/TracingTracerTest.java
Show resolved
Hide resolved
java-showcase/gapic-showcase/src/test/java/com/google/showcase/v1beta1/it/ITOtelTracing.java
Show resolved
Hide resolved
|
|
| private final TracingRecorder tracingRecorder; | ||
|
|
||
| /** Mapping of client attributes that are set for every TracingTracer at operation level */ | ||
| private final Map<String, String> operationAttributes; |
There was a problem hiding this comment.
We exposed a map of attributes in MetricsTracerFactory because partner teams want to pass some client level attributes. In this case though, the TracingTracerFactory is supposed to be created by customers, which I don't think we want to allow them to pass additional attributes?
| public TracingTracer(TracingRecorder recorder, String operationSpanName, String attemptSpanName) { | ||
| this.recorder = recorder; | ||
| this.attemptSpanName = attemptSpanName; | ||
| this.operationAttributes = new ConcurrentHashMap<>(); |
There was a problem hiding this comment.
I don't think we have to create a ConcurrentHashMap here, each ApiTracer is supposed to be manipulated only by one request.
| return () -> {}; | ||
| } | ||
|
|
||
| interface SpanHandle { |
There was a problem hiding this comment.
Do we have to create a separate interface? Can this be flattened?
There was a problem hiding this comment.
Discussed offline. We still need a reference to the Span created earlier in the request. Rename this to something like GaxSpan that wraps Span from Opentelemetry.
| } | ||
|
|
||
| @Override | ||
| public SpanHandle startSpan(String name, Map<String, String> attributes) { |
There was a problem hiding this comment.
Are there any attributes specific to the start of Span? If not, I think we may want to make them a field of the tracer, instead of always passing in as a parameter.
|
|
||
| void recordError(Throwable error); | ||
|
|
||
| void setAttribute(String key, String value); |
There was a problem hiding this comment.
We probably don't need it at this moment. Use a field for client level attributes. Do not need to expose a setter if not needed.
| interface SpanHandle { | ||
| void end(); | ||
|
|
||
| void recordError(Throwable error); |
There was a problem hiding this comment.
Move it to a separate PR. Errors are not implemented yet.
| if (watchdogProvider != null && watchdogProvider.shouldAutoClose()) { | ||
| backgroundResources.add(watchdog); | ||
| } | ||
| ApiTracerContext apiTracerContext = |
There was a problem hiding this comment.
This can be a client level tracing context. We can initialize all client level attributes here and pass them to the tracer factory.
| public ApiTracer.Scope inScope(SpanHandle handle) { | ||
| if (handle instanceof OtelSpanHandle) { | ||
| Scope scope = ((OtelSpanHandle) handle).span.makeCurrent(); |
There was a problem hiding this comment.
Investigate more to understand what exactly does Scope do.





Summary
This PR introduces a new tracing mechanism in GAX that allows recording traces using OpenTelemetry. It provides a way of recording spans and attributes, following the existing
ApiTracerclass pattern with a few tracing-specific additions. The implementation is meant to be extensible to support other implementations.New Classes
TracingRecorder: An interface for recording spans and attributes; can be implemented by observability frameworks.OpenTelemetryTracingRecorder: An implementation ofTracingRecorderthat uses the OpenTelemetry API.TracingTracer: AnApiTracerimplementation that delegates span management to aTracingRecorder.TracingTracerFactory: A factory for creatingTracingTracerinstances.ApiTracerContext: A context object that carries information (likeEndpointContext) used to infer common attributes for all tracers.SpanHandle: A handle returned byTracingRecorderto manage the lifecycle of a specific span (ending it, recording errors, or setting attributes).Approach
Connecting Tracer with Recorder
The implementation aims to decouple
TracingTracerfromTracingRecorder. When a tracer starts an operation or an attempt, it requests aSpanHandlefrom the recorder. This handle allows the tracer to update the span (e.g., adding attributes or recording errors) to keepTracingTracerseparated from specific recorder implementations (like OpenTelemetry'sSpanobject).Attribute Inference via
ApiTracerContextTo provide a source of Span Attributes that are common to all operations, we introduced
ApiTracerContext. This context is passed to theApiTracerFactoryand contains information like theEndpointContext. It is operated byClientContext.Initially, only
EndpointContextis contained in this class and it's meant to obtain theserver.addressattribute.Integration Tests
A new integration test,
ITOtelTracing, was added to thejava-showcasemodule:server.addressandgcp.client.language).currentin the thread context of Gax.Note on coverage
Coverage of new code by java-showcase is below 80 due to:
EndpointContextnon-public code (parseServerAddress). This is tested in unit tests.TracingTracerfailed attempt/operation methods. These will be tested when introducing error attributes.Note on java-bigtable downstream check
Since SkipTrailersTest mocks the tracer factory, the
EndpointContextcall toapiTracerFactory.withContext()returns a null factory, causing a null pointer exception when building the client context.We expect the test to be adjusted with this change.
Confirmation in Cloud Trace