Validate Transport Support For Bidirectional Streaming#717
Conversation
Most Python HTTP clients cannot read response data while the request body is still being written, so bidirectional event stream operations fail mid-stream with an opaque connection error. Transports must now opt in by setting SUPPORTS_DUPLEX_STREAMING to True, and RequestPipeline.duplex_stream fails fast with UnsupportedTransportError when the configured transport does not declare support. AWSCRTHTTPClient and MockHTTPClient declare support; AIOHTTPClient explicitly does not.
There was a problem hiding this comment.
Non-blocking: We should definitely look at making a shared testing harness to wire up these mocks rather than needing to create them all inline. I can see cases where we need to test pipeline behavior in the future and will need to recreate this all over again. Doesn't block this PR but something we should consider soon.
| """ | ||
| # The transport is assumed not to support duplex streaming unless it | ||
| # explicitly declares otherwise. | ||
| if not getattr(self.transport, "SUPPORTS_DUPLEX_STREAMING", False): |
There was a problem hiding this comment.
This is interesting - it's kind of a blind "set this attribute". Is there a reason we aren't just adding this as a default on the HTTPClient class? Or ifwe don't want to tie it to HTTP, the ClientTransport protocol?
Adding a default makes more sense to me than doing a getattr call. What do you think?
See the python docs for an example of how to do this (look at the value key in the second example)
There was a problem hiding this comment.
This PR does add the default on ClientTransport. I kept the getattr(..., False) intentionally because ClientTransport is a structural Protocol, so Python does not enforce that custom transports actually inherit the protocol or have the new attribute at runtime. Existing transports may satisfy the old transport shape but not define SUPPORTS_DUPLEX_STREAMING. In that case, missing should mean unsupported, and we still want to raise UnsupportedTransportError rather than AttributeError.
Overview
Problem
Only the AWS CRT HTTP client can perform true bidirectional (duplex) streaming in the Python ecosystem, since interleaving request and response data in practice requires HTTP/2. Code generation already defaults services with event streams to
AWSCRTHTTPClient, but nothing stops a user from overridingconfig.transportwithAIOHTTPClientor a custom transport that can't do this. When they then invoke a bidirectional operation, the request is sent and the stream appears to hang until it eventually dies with an error such as:smithy_core.exceptions.SmithyError: Server disconnectedNothing in that error points at the actual problem (the configured transport), so this is painful to debug.
Solution
This PR makes duplex support an explicit, opt-in transport capability. A new
SUPPORTS_DUPLEX_STREAMINGclass attribute onClientTransportdefaults toFalse, andRequestPipeline.duplex_streamnow fails fast with an actionableUnsupportedTransportErrorbefore any request work happens.AWSCRTHTTPClientdeclares support,AIOHTTPClientexplicitly does not (it buffers the full response before returning and aiohttp has no HTTP/2 support).Example
Invoking a bidirectional operation with a non-duplex transport, with this PR:
Without this PR: the request is sent, the stream stalls, and eventually fails mid-stream with
SmithyError: Server disconnected.With this PR: the call fails immediately, before anything is sent:
Why opt-in rather than opt-out? A survey of the Python HTTP client landscape shows most clients cannot interleave request and response data, so absence of a declaration is treated as "not supported". The pipeline reads the attribute with
getattr(transport, "SUPPORTS_DUPLEX_STREAMING", False), so existing custom transports that predate the attribute keep working for everything except duplex operations, which were already broken for them.Why fail at call time instead of client construction? A non-duplex transport is perfectly valid for every other operation on the same client. The capability is only required when a duplex operation is actually invoked, so that is where validation happens.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.