Skip to content

fix: add default HTTP request timeout to prevent indefinite hangs#938

Open
Alfredo Garcia (agarctfi) wants to merge 3 commits intomainfrom
devin/1772830465-add-default-http-timeout
Open

fix: add default HTTP request timeout to prevent indefinite hangs#938
Alfredo Garcia (agarctfi) wants to merge 3 commits intomainfrom
devin/1772830465-add-default-http-timeout

Conversation

@agarctfi
Copy link
Contributor

@agarctfi Alfredo Garcia (agarctfi) commented Mar 6, 2026

Summary

Adds a default timeout of (30s connect, 300s read) to all HTTP requests made via HttpClient.send_request(). Currently, no timeout is set on outbound HTTP requests, which means if a server accepts a TCP connection but never responds (or stalls mid-transfer), the connector process hangs indefinitely — no error, no log output, no retry.

Root cause: Observed in production on source-intercom where a sync hung permanently after an Intercom API 500 error. The CDK's retry logic succeeded, but a subsequent request to the Intercom API hung with zero output for 80+ minutes. The requests.Session.send() call blocked forever because no timeout parameter was passed.

The timeout is only applied when no explicit timeout is already present in request_kwargs, so callers can still override it. ConnectTimeout and ReadTimeout are already handled as TRANSIENT_EXCEPTIONS in rate_limiting.py and will be retried automatically with exponential backoff.

Review & Testing Checklist for Human

  • Validate timeout values for slow APIs: The 300s read timeout may be too short for connectors with endpoints that legitimately take >5 minutes to respond (e.g., bulk export/report generation endpoints). Review whether any known connectors would be broken by this.
  • Consider adding a unit test: No new test was added to verify the default timeout is applied or that explicit timeout overrides are respected. This is a gap.
  • Consider making timeout configurable: The timeout is currently hardcoded as class constants. Should this be exposed as a configurable option in the declarative manifest schema (HttpRequester) so connectors can tune it?
  • Verify send_request is the right interception point: The timeout is added in send_request() only. If any code path calls _send_with_retry() or _send() directly, those requests won't get a default timeout.

Notes

Summary by CodeRabbit

  • Bug Fixes
    • HTTP requests now use sensible default connect/read timeouts to prevent operations from hanging indefinitely.
  • Tests
    • Added unit tests to validate default and explicit timeout behavior for HTTP requests.

When an HTTP server accepts a connection but never responds (or stalls
mid-transfer), the requests library will block indefinitely if no timeout
is set. This causes connector syncs to hang permanently with no log
output, no error, and no retry.

This adds a default timeout of (30s connect, 300s read) to all HTTP
requests made via HttpClient.send_request(). The timeout is only applied
when no explicit timeout is provided in request_kwargs, so callers can
still override it.

ConnectTimeout and ReadTimeout exceptions are already handled as
TRANSIENT_EXCEPTIONS in rate_limiting.py and will be retried
automatically with exponential backoff.

Root cause: observed in source-intercom where a sync hung permanently
after an Intercom API 500 error. The retry succeeded but a subsequent
request hung indefinitely because no timeout was configured.

Co-Authored-By: alfredo.garcia@airbyte.io <freddy.garcia7.fg@gmail.com>
Copilot AI review requested due to automatic review settings March 6, 2026 20:56
@devin-ai-integration
Copy link
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions
Copy link

github-actions bot commented Mar 6, 2026

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

💡 Show Tips and Tricks

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@devin/1772830465-add-default-http-timeout#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch devin/1772830465-add-default-http-timeout

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test - Runs connector tests with the updated CDK
  • /prerelease - Triggers a prerelease publish with default arguments
  • /poe build - Regenerate git-committed build artifacts, such as the pydantic models which are generated from the manifest JSON schema in YAML.
  • /poe <command> - Runs any poe command in the CDK environment
📚 Show Repo Guidance

Helpful Resources

📝 Edit this welcome message.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a default requests timeout to outbound HTTP calls made through airbyte_cdk’s HttpClient.send_request() to prevent indefinite hangs when servers stall mid-request/response.

Changes:

  • Introduce default connect/read timeout constants on HttpClient.
  • Apply a default (connect, read) timeout tuple to request_kwargs when no explicit timeout is provided.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment on lines 591 to 598
request_kwargs = {**request_kwargs, **env_settings}

if "timeout" not in request_kwargs:
request_kwargs["timeout"] = (self._DEFAULT_CONNECT_TIMEOUT, self._DEFAULT_READ_TIMEOUT)

response: requests.Response = self._send_with_retry(
request=request,
request_kwargs=request_kwargs,
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

request_kwargs is typed as Mapping[str, Any], but this code mutates it via request_kwargs["timeout"] = .... With the repo's strict mypy settings, this will raise a type error because Mapping is read-only. Consider copying into a Dict[str, Any]/MutableMapping[str, Any] local (or changing the parameter type) before setting defaults so mypy passes.

Suggested change
request_kwargs = {**request_kwargs, **env_settings}
if "timeout" not in request_kwargs:
request_kwargs["timeout"] = (self._DEFAULT_CONNECT_TIMEOUT, self._DEFAULT_READ_TIMEOUT)
response: requests.Response = self._send_with_retry(
request=request,
request_kwargs=request_kwargs,
mutable_request_kwargs: Dict[str, Any] = {**request_kwargs, **env_settings}
if "timeout" not in mutable_request_kwargs:
mutable_request_kwargs["timeout"] = (self._DEFAULT_CONNECT_TIMEOUT, self._DEFAULT_READ_TIMEOUT)
response: requests.Response = self._send_with_retry(
request=request,
request_kwargs=mutable_request_kwargs,

Copilot uses AI. Check for mistakes.
Comment on lines +593 to +595
if "timeout" not in request_kwargs:
request_kwargs["timeout"] = (self._DEFAULT_CONNECT_TIMEOUT, self._DEFAULT_READ_TIMEOUT)

Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change introduces new default-timeout behavior but there isn't a unit test asserting (a) the default timeout is applied when request_kwargs has no timeout, and (b) an explicit timeout is respected. Adding a focused test would help prevent regressions (especially since this behavior is safety-critical for avoiding indefinite hangs).

Copilot uses AI. Check for mistakes.
- Use Dict[str, Any] (mutable) instead of mutating Mapping[str, Any]
  for request_kwargs to satisfy mypy type checking
- Add test_send_request_applies_default_timeout_when_not_provided
- Add test_send_request_respects_explicit_timeout

Co-Authored-By: alfredo.garcia@airbyte.io <freddy.garcia7.fg@gmail.com>
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 6, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: b743f2c5-682f-4fb0-9a3f-9249ee091cd2

📥 Commits

Reviewing files that changed from the base of the PR and between 0ddff77 and d2233c5.

📒 Files selected for processing (2)
  • airbyte_cdk/sources/streams/http/http_client.py
  • unit_tests/sources/streams/http/test_http_client.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • airbyte_cdk/sources/streams/http/http_client.py

📝 Walkthrough

Walkthrough

Added two class-level defaults to HttpClient (_DEFAULT_CONNECT_TIMEOUT = 30, _DEFAULT_READ_TIMEOUT = 300). send_request now injects a timeout tuple using these defaults when request_kwargs does not include an explicit timeout.

Changes

Cohort / File(s) Summary
HttpClient default timeouts
airbyte_cdk/sources/streams/http/http_client.py
Added _DEFAULT_CONNECT_TIMEOUT and _DEFAULT_READ_TIMEOUT constants and updated send_request to supply a default (connect, read) timeout tuple when timeout is absent from request_kwargs.
Timeout handling tests
unit_tests/sources/streams/http/test_http_client.py
Added tests verifying that send_request applies the default timeout tuple when none is provided and preserves an explicitly supplied timeout value.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main change: adding default HTTP request timeouts to prevent indefinite hangs in HttpClient.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch devin/1772830465-add-default-http-timeout

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Co-Authored-By: alfredo.garcia@airbyte.io <freddy.garcia7.fg@gmail.com>
@github-actions
Copy link

github-actions bot commented Mar 6, 2026

PyTest Results (Fast)

3 892 tests  +2   3 880 ✅ +2   6m 42s ⏱️ -9s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit d2233c5. ± Comparison against base commit 6136336.

@github-actions
Copy link

github-actions bot commented Mar 6, 2026

PyTest Results (Full)

3 895 tests  +2   3 883 ✅ +2   11m 10s ⏱️ +38s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit d2233c5. ± Comparison against base commit 6136336.

@devin-ai-integration
Copy link
Contributor

Closing this PR — the connector is pinned to a dev image from PR #870, so this fix needs to be applied on top of that branch instead. A new PR will be created targeting the tolik0/concurrent-source/add-block_simultaneous_read branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants