feat(chalice): Add span streaming support to Chalice integration#6503
Conversation
Add span streaming support behind the trace_lifecycle=stream experiment flag. When enabled, start a segment span with Lambda/FaaS attributes for both HTTP view functions and event source handlers instead of setting a transaction name on the scope. Also adds aws_lambda to CLOUD_PLATFORM constants. Fixes PY-2312 Fixes #6010
|
bugbot run |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 208cf35. Configure here.
| invoked_arn = aws_context.invoked_function_arn | ||
| split_invoked_arn = invoked_arn.split(":") | ||
| aws_region = split_invoked_arn[3] if len(split_invoked_arn) > 3 else "unknown" |
There was a problem hiding this comment.
Although AWS sets this value and it's unlikely to be malformed, these extra guards were added to ensure that if it were, we don't crash the user's application
Codecov Results 📊✅ 90452 passed | ⏭️ 6031 skipped | Total: 96483 | Pass Rate: 93.75% | Execution Time: 317m 35s 📊 Comparison with Base Branch
All tests are passing successfully. ✅ Patch coverage is 92.86%. Project has 2418 uncovered lines. Files with missing lines (1)
Coverage diff@@ Coverage Diff @@
## main #PR +/-##
==========================================
+ Coverage 89.75% 89.75% —%
==========================================
Files 192 192 —
Lines 23583 23580 -3
Branches 8116 8108 -8
==========================================
+ Hits 21165 21162 -3
- Misses 2418 2418 —
- Partials 1334 1335 +1Generated by Codecov Action |
| try: | ||
| return ChaliceEventSourceHandler.__call__(self, event, context) | ||
| except Exception: | ||
| exc_info = sys.exc_info() | ||
| event, hint = event_from_exception( | ||
| exc_info, | ||
| client_options=client.options, | ||
| mechanism={"type": "chalice", "handled": False}, | ||
|
|
||
| if has_span_streaming_enabled(client.options): | ||
| span = sentry_sdk.traces.start_span( | ||
| name=context.function_name, | ||
| parent_span=None, | ||
| attributes=_get_lambda_span_attributes(context), | ||
| ) | ||
| sentry_sdk.capture_event(event, hint=hint) | ||
| client.flush() | ||
| reraise(*exc_info) |
There was a problem hiding this comment.
I don't think we need to do anything here, just leave as is -- this code doesn't do any tracing so we don't need to change it
The set of spans we're creating should be the same as in the legacy path, we don't want any new spans when streaming
| span = sentry_sdk.traces.start_span( | ||
| name=aws_context.function_name, | ||
| parent_span=None, | ||
| attributes={ | ||
| **_get_lambda_span_attributes(aws_context), | ||
| **header_attrs, | ||
| **additional_attrs, | ||
| }, | ||
| ) |
There was a problem hiding this comment.
This is also a new span compared to the legacy path -- please remove
As I understand, this integration is built on top of the AWS Lambda integration similar to e.g. Flask and WSGI? In that case, the pattern is usually that the parent integration creates the transaction/segment encompassing the whole request-response cycle, and the child integration then augments it with info specific to the child integration (e.g. better transaction name and source). The child integration may also create its own spans (e.g. for middlewares, DB queries etc.), but it shouldn't double-instrument the same thing as the top level segment
All of this is to say -- if we don't explicitly create a span in the legacy path, we shouldn't do it in span streaming either
There was a problem hiding this comment.
In this integration specifically, we seem to just want to augment whatever transaction is currently running with the same data as in the AWS Lambda integration.
The question for me is, does this integration even make sense standalone? Because if not, then we don't need to do any sort of post-enrichment here, as it'll already have happened in the AWS Lambda integration. Otherwise, we need to duplicate the enriching logic, but apply it to the current segment instead of creating a new span.
…led and where the Chalice integration is running on its own
Removes the creation of new spans but instead updates the segment of the current span should one be present (this happens if the AWS Lambda integration is also enabled and would be what creates the initial span).
… AWS lambda integration is not
| return { | ||
| "sentry.op": OP.FUNCTION_AWS, | ||
| "sentry.origin": ChaliceIntegration.origin, | ||
| "sentry.span.source": SegmentSource.COMPONENT, | ||
| "cloud.platform": CLOUD_PLATFORM.AWS_LAMBDA, | ||
| "cloud.provider": CLOUD_PROVIDER.AWS, | ||
| "faas.name": aws_context.function_name, | ||
| "cloud.region": aws_region, | ||
| "cloud.resource_id": invoked_arn, | ||
| "aws.lambda.invoked_arn": invoked_arn, | ||
| "faas.invocation_id": aws_context.aws_request_id, | ||
| "faas.version": aws_context.function_version, | ||
| "aws.log.group.names": [aws_context.log_group_name], | ||
| "aws.log.stream.names": [aws_context.log_stream_name], | ||
| } |
There was a problem hiding this comment.
If the AWS Lambda integration is active, it's already attached all (?) of these to the segment, right? So this is for the case when the segment is not coming from the AWS Lambda integration, e.g. custom instrumentation?
I'm fine adding this but unless I'm misunderstanding something it's not necessary for feature parity with the non-streaming code path
sentrivana
left a comment
There was a problem hiding this comment.
See comment but lgtm overall!
| headers = request_dict.get("headers", {}) | ||
|
|
||
| header_attrs: "Dict[str, Any]" = {} | ||
| for header, value in _filter_headers( | ||
| headers, use_annotated_value=False | ||
| ).items(): | ||
| header_attrs[f"http.request.header.{header.lower()}"] = value |
There was a problem hiding this comment.
Bug: The Chalice integration can crash with an AttributeError if the incoming request has a "headers" key with a None value, as the code attempts to call .items() on it.
Severity: HIGH
Suggested Fix
Add a check to ensure headers is a dictionary before it's used. A pattern like if not isinstance(headers, dict): headers = {} should be added after retrieving the headers from request_dict, mirroring the safeguard present in the AWS Lambda integration.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: sentry_sdk/integrations/chalice.py#L91-L97
Potential issue: In the Chalice integration, headers are retrieved from the request
dictionary using `request_dict.get("headers", {})`. This does not guard against the case
where the `"headers"` key exists but its value is `None`. If `headers` is `None`, the
subsequent call to `_filter_headers(headers, ...).items()` will raise an
`AttributeError: 'NoneType' object has no attribute 'items'`, causing the application to
crash. This is a realistic scenario as Chalice uses AWS Lambda events, which can have
`None` as the value for headers, a possibility explicitly handled in the AWS Lambda
integration but missed here.

Add span streaming support behind the trace_lifecycle=stream experiment flag.
When enabled, start a segment span with Lambda/FaaS attributes for both
HTTP view functions and event source handlers instead of setting a
transaction name on the scope.
Also adds aws_lambda to CLOUD_PLATFORM constants.
Contains some duplication with #6498 since it's an AWS lambda framework.
Fixes PY-2312
Fixes #6010