feat(ollama): add max_retries to chat generator #2899

Keyur-S-Patel wants to merge 2 commits into deepset-ai:main from feat/ollama-chat-max-retries
Conversation
Force-pushed from ae3c423 to 83278a1
Keyur-S-Patel
left a comment
Similar to this one #2875
bogdankostic
left a comment
Thanks for your PR @Keyur-S-Patel! I left two comments regarding the retry logic. Also, it would be great if we could add tests for run_async.
```python
@retry(
    reraise=True,
    stop=stop_after_attempt(self.max_retries + 1),
    retry=retry_if_exception_type(Exception),
```
Retrying on all kinds of exceptions is too broad.
Retry on 429 and 5xx?
Lemme know if you want to retry on something else as well.
```python
@retry(
    reraise=True,
    stop=stop_after_attempt(self.max_retries + 1),
    retry=retry_if_exception_type(Exception),
    wait=wait_exponential(),
)
def chat_with_retry() -> ChatResponse | Iterator[ChatResponse]:
    return self._client.chat(
        model=self.model,
        messages=ollama_messages,
        tools=ollama_tools,
        stream=is_stream,  # type: ignore[call-overload] # Ollama expects Literal[True] or Literal[False], not bool
        keep_alive=self.keep_alive,
        options=generation_kwargs,
        format=self.response_format,
        think=self.think,
    )


response = chat_with_retry()
```
Can we have this not nested inside of run?
The @retry decorator requires a function, so we can either:
- use a wrapper function, or
- build a new method and put the retry decorator there
Will separate out the function in the next commit.
Keyur-S-Patel
left a comment
Updated in a new commit with the following:
- retry logic now retries only on specific status codes
- pulled the retry logic out of run
- added unit tests for async
bogdankostic
left a comment
We're almost there, I just have a minor comment regarding the type of retry_state and a question on tenacity version.
```python
HTTP_STATUS_SERVER_ERROR_MAX_EXCLUSIVE = 600


def _stop_after_instance_max_retries(retry_state: Any) -> bool:
```
Let's not use type Any here
```diff
-def _stop_after_instance_max_retries(retry_state: Any) -> bool:
+def _stop_after_instance_max_retries(retry_state: RetryCallState) -> bool:
```
```python
:param max_retries:
    Maximum number of retries to attempt for failed requests.
```
Let's make this parameter more descriptive.
```diff
 :param max_retries:
-    Maximum number of retries to attempt for failed requests.
+    Maximum number of retries to attempt for failed requests (HTTP 429, 5xx, connection/timeout errors).
+    Uses exponential backoff between attempts. Set to 0 (default) to disable retries.
```
```diff
     "Programming Language :: Python :: Implementation :: PyPy",
 ]
-dependencies = ["haystack-ai>=2.22.0", "ollama>=0.5.0", "pydantic"]
+dependencies = ["haystack-ai>=2.22.0", "ollama>=0.5.0", "pydantic", "tenacity>=8.2.3"]
```
Is there a specific reason for >=8.2.3 for tenacity?
PR branch: feat/ollama-chat-max-retries

Related Issues

Proposed Changes:
- Add `max_retries` support to `OllamaChatGenerator`.
- `tenacity`-based retries for both sync and async chat calls.
- Exponential backoff via `wait_exponential()`.
- Retry logic in `run()`/`run_async()` using `@retry`-decorated callables instead of separate helper methods.
- Add `tenacity` to the Ollama integration package dependencies.

How did you test it?
- `hatch run fmt-check`
- `hatch run test:types`
- `hatch run test:unit`

Result: 34 passed

Notes for the reviewer
- `tenacity.stop_after_attempt(...)` uses total attempts, so the implementation uses `max_retries + 1` to preserve the expected semantics of "initial call + N retries".
- Retries are triggered on `Exception` from Ollama chat calls, matching the previous broad retry behavior while switching to a standard retry library.
- `test:all` still depends on a running local Ollama instance for integration tests; only unit tests were used for branch validation.

Checklist
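The attempt-counting note about `stop_after_attempt(max_retries + 1)` can be illustrated without tenacity. This plain-loop sketch (names are ours, not from the PR) makes the "initial call + N retries" arithmetic explicit:

```python
# Sketch mirroring stop_after_attempt(max_retries + 1): the function is
# called once, then retried up to max_retries times before the last
# exception is re-raised. Returns the result and the attempt count.
def call_with_retries(fn, max_retries: int):
    attempts = 0
    while True:
        attempts += 1
        try:
            return fn(), attempts
        except Exception:
            # Total attempts allowed = 1 initial call + max_retries retries.
            if attempts >= max_retries + 1:
                raise
```

With `max_retries=0` the function runs exactly once and any failure propagates immediately, which matches the documented "set to 0 to disable retries" behavior.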
- PR title follows one of the conventional prefixes: `fix:`, `feat:`, `build:`, `chore:`, `ci:`, `docs:`, `style:`, `refactor:`, `perf:`, `test:`.