Tracing PoC #49706

Draft: jpalvarezl wants to merge 3 commits into main from jpalvarezl/genai-tracing-poc

@jpalvarezl

Member

No description provided.

jpalvarezl and others added 3 commits July 2, 2026 14:19
Proof of concept re-implementing agent tracing the azure-core way, modeled on
azure-ai-inference's ChatCompletionClientTracer:

- New implementation.telemetry.AgentsClientTracer emits OpenTelemetry GenAI
  semantic-convention spans for createAgentVersion (sync + async).
- Per-client Tracer built by AgentsClientBuilder from
  ClientOptions.getTracingOptions() via TracerProvider (no global static state,
  no public opt-in API).
- Content gated by AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED (off by default).
- Library version sourced from azure-ai-agents.properties (not hardcoded).
- Endpoint injected via constructor (immutable), consistent across clients.
- Unit tests use an in-memory OpenTelemetry SDK; assert span attributes, content
  gating, and error propagation without swallowing exceptions.
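
The content gate described above could be sketched roughly as follows. This is a minimal illustration, not the PR's code: `ContentRecordingGate`, `isEnabled`, and `gate` are hypothetical names, and treating only an explicit `true` as "on" is an assumption about how the flag is parsed. The lookup is passed in as a function so a test can supply values without touching the real environment.

```java
import java.util.function.Function;

/** Hypothetical helper for illustration; not a class from this PR. */
final class ContentRecordingGate {
    static final String ENV_NAME = "AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED";

    private ContentRecordingGate() {
    }

    /**
     * Returns true only when the variable is explicitly "true" (case-insensitive).
     * An unset variable or any other value keeps content recording off.
     * Assumption: the real parsing rules may differ.
     */
    static boolean isEnabled(Function<String, String> envLookup) {
        String raw = envLookup.apply(ENV_NAME);
        return raw != null && "true".equalsIgnoreCase(raw.trim());
    }

    /** Returns the content when recording is enabled, otherwise null so the caller skips the attribute. */
    static String gate(boolean enabled, String content) {
        return enabled ? content : null;
    }

    public static void main(String[] args) {
        boolean enabled = isEnabled(System::getenv);
        System.out.println("content recording enabled: " + enabled);
        System.out.println("instructions attribute: " + gate(enabled, "You are a helpful agent."));
    }
}
```

Resolving the flag once at construction time and passing the boolean down keeps the decision per-client, in line with the "no global static state" goal stated in the commit message.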

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Port the full tracing feature set from PR #49434 into the idiomatic per-client
architecture (implementation.telemetry), keeping features close to the original
while fixing the repo best-practice issues:

Telemetry engine (per-client, no global static):
- GenAiInstrumentation holds the azure-core Tracer + Meter (built by the builder
  from ClientOptions tracing/metrics options), content flag, host/port, and the
  GenAI duration/token histograms.
- GenAiTracingScope is a per-op span with the close()/nanoTime metric bugs fixed.
- GenAiMessageFormatter gates content via a parameter (formatToolCallOutput fixed).
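
The commit message does not say which `close()`/`nanoTime` bugs were fixed, so the following is only a sketch of the general shape a per-operation scope might take: it measures elapsed time with a monotonic nanosecond clock and records the duration exactly once, even if `close()` is called twice. `OperationScope` and its constructor parameters are hypothetical names, and the sink stands in for whatever histogram the real `GenAiInstrumentation` holds. Seconds are used because `gen_ai.client.operation.duration` is defined in seconds in the OpenTelemetry GenAI conventions.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.DoubleConsumer;
import java.util.function.LongSupplier;

/** Hypothetical per-operation scope; illustrative only. */
final class OperationScope implements AutoCloseable {
    private final LongSupplier nanoClock;
    private final DoubleConsumer durationSecondsSink;
    private final long startNanos;
    private final AtomicBoolean closed = new AtomicBoolean();

    OperationScope(LongSupplier nanoClock, DoubleConsumer durationSecondsSink) {
        this.nanoClock = nanoClock;
        this.durationSecondsSink = durationSecondsSink;
        // Capture the start from the same monotonic clock used at close time.
        this.startNanos = nanoClock.getAsLong();
    }

    @Override
    public void close() {
        // Only the first close() records a measurement; later calls are no-ops.
        if (closed.compareAndSet(false, true)) {
            long elapsedNanos = nanoClock.getAsLong() - startNanos;
            durationSecondsSink.accept(elapsedNanos / 1_000_000_000.0);
        }
    }

    public static void main(String[] args) {
        try (OperationScope scope
            = new OperationScope(System::nanoTime, s -> System.out.println("duration (s): " + s))) {
            // The traced operation would run here.
        }
    }
}
```

Injecting the clock is a testing convenience in this sketch; production code would typically pass `System::nanoTime`, which is monotonic and therefore safer for durations than wall-clock time.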

Coverage:
- create_agent (prompt/hosted/workflow attributes, gen_ai.agent.workflow event,
  agent id/version enrichment) - sync + async.
- chat / invoke_agent response tracing - sync + streaming (TracedStreamIterable +
  ResponseAccumulator), with model/token/message/finish-reason/conversation attrs
  and gen_ai.workflow.action events.
- create_conversation span.
- Metrics: gen_ai.client.operation.duration and gen_ai.client.token.usage.
- Content gated by AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED (off by default).
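
For the streaming path, one plausible shape for the `TracedStreamIterable` + `ResponseAccumulator` pairing is an `Iterable` wrapper that passes events through unchanged, collects them as they are consumed, and runs a completion callback once the stream is exhausted. The sketch below assumes that design; `TracedIterable` and `onFinished` are hypothetical names, the list stands in for the accumulator, and error handling (ending the span when the source throws) is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stream wrapper; illustrative only. */
final class TracedIterable<T> implements Iterable<T> {
    private final Iterable<T> source;
    private final Consumer<List<T>> onFinished;

    TracedIterable(Iterable<T> source, Consumer<List<T>> onFinished) {
        this.source = source;
        this.onFinished = onFinished;
    }

    @Override
    public Iterator<T> iterator() {
        final Iterator<T> inner = source.iterator();
        final List<T> seen = new ArrayList<>();
        return new Iterator<T>() {
            private boolean finished;

            @Override
            public boolean hasNext() {
                boolean more = inner.hasNext();
                if (!more && !finished) {
                    // Stream exhausted: hand the accumulated events to the callback once.
                    finished = true;
                    onFinished.accept(seen);
                }
                return more;
            }

            @Override
            public T next() {
                T item = inner.next();
                seen.add(item);
                return item;
            }
        };
    }

    public static void main(String[] args) {
        Iterable<String> events = Arrays.asList("Hel", "lo");
        TracedIterable<String> traced = new TracedIterable<>(events,
            seen -> System.out.println("stream finished after " + seen.size() + " events"));
        for (String chunk : traced) {
            System.out.println("chunk: " + chunk);
        }
    }
}
```

In a real tracer the callback would be the place to set response attributes (model, token usage, finish reason) and end the span, since those values are only known after the final event arrives.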

Also: GenAiAgentTracingTest + GenAiMessageFormatterTest (14 tests, in-memory OTel,
no exception swallowing); TracingSample + CI-verified README snippet; CHANGELOG.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- ResponsesAsyncClient: trace createAzureResponse (Mono) and
  createStreamingAzureResponse (Flux) via new GenAiResponseTracing async methods
  (Mono/Flux.usingWhen); add createConversation/deleteConversation.
- GenAiResponseTracing: add traceResponseAsync + traceStreamingResponseAsync,
  share request-attribute extraction (ResponseSpanParams/startResponseScope), and
  reuse recordResponseAttributes for both sync-streaming and async-streaming
  finalization (TracedStreamIterable simplified).
- Samples: port TracingConsoleSample and TracingAzureMonitorSample to the
  idiomatic API - no enableGenAiTracing()/disableGenAiTracing() opt-in; tracing
  activates from the configured OpenTelemetry. Added test-scope deps
  (opentelemetry-exporter-logging, opentelemetry-sdk-extension-autoconfigure,
  azure-monitor-opentelemetry-autoconfigure).
- TRACING_NOTES.md: document design decisions and open questions - toggle vs
  config-driven activation, customized-generated-methods vs traced* variants vs
  codegen-customization weaving, async HTTP-span parenting, gen_ai.system vs
  provider.name, metrics/response-path test gaps, and more.

Validated: compile (Java 8 + 21), 14 telemetry tests, 0 checkstyle violations,
samples test-compile.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
