
Conversation

@marc-lehner
Contributor

No description provided.

@marc-lehner marc-lehner requested a review from a team as a code owner February 9, 2026 11:00
@marc-lehner marc-lehner requested review from Copilot and knime-ghub-bot and removed request for a team February 9, 2026 11:00
Contributor

Copilot AI left a comment


Pull request overview

Updates the bundled environment and LangChain integration to support Python 3.13, including migration to newer/alternate LangChain packages and fixes for evaluation tooling compatibility.

Changes:

  • Migrates several LangChain imports/usages to langchain_classic, langchain_core, and langgraph APIs.
  • Updates Google model integrations to use langchain_google_genai and adjusts API base URL handling.
  • Modernizes the bundled environment (Pixi) for Python 3.13 and adds a SciPy monkey-patch to unblock Giskard imports.

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
src/tools/vectorstore.py Switches retrieval chain imports to langchain_classic.
src/models/google/_port_types.py Updates Google chat/embedding models to langchain_google_genai and adjusts base URL scheme.
src/models/_adapter.py Updates LLM invocation to use invoke().
src/indexes/chroma.py Changes Chroma persistence handling and clarifies hack comment.
src/eval/giskard/giskard_patch.py Adds SciPy monkey-patch to fix Giskard import compatibility.
src/eval/giskard/__init__.py Ensures the monkey-patch is imported/applied early for Giskard.
src/agents/openai.py Switches agent factory import to langchain_classic.
src/agents/base_deprecated.py Updates langgraph react agent wiring and recursion limit handling logic.
src/agents/base.py Migrates execution path to langgraph react agent and tool node usage.
src/agents/_tool.py Switches StructuredTool import to langchain_core.tools.
pixi.toml Reworks dependencies and bumps runtime to Python 3.13.


Comment on lines 543 to 547
from ._agent import RECURSION_CONTINUE_PROMPT
if self._recursion_limit_handling == RecursionLimitModeForView.CONFIRM.name:
    message = {
        "type": "ai",
        "content": ITERATION_CONTINUE_PROMPT,

Copilot AI Feb 9, 2026


RecursionLimitModeForView and ITERATION_CONTINUE_PROMPT are referenced but (based on this diff) are not imported/defined here, while RECURSION_CONTINUE_PROMPT is imported but not used. This will raise NameError at runtime. Make the identifiers consistent (use the newly imported RECURSION_CONTINUE_PROMPT for the message content and append-to-memory call), and either import the correct RecursionLimitModeForView enum or keep using the existing IterationLimitModeForView enum already imported in this module.
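A minimal sketch of one consistent wiring, assuming the module keeps using the already-imported IterationLimitModeForView enum and that RECURSION_CONTINUE_PROMPT is the prompt intended for the message content (identifiers outside the diff are assumptions, not the actual file contents):

    # Sketch only: keep the existing enum and use the newly imported prompt constant.
    from ._agent import RECURSION_CONTINUE_PROMPT

    if self._recursion_limit_handling == IterationLimitModeForView.CONFIRM.name:
        message = {
            "type": "ai",
            "content": RECURSION_CONTINUE_PROMPT,
        }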

Comment on lines 738 to 739
if self.user_message:
    messages.append({"role": "user", "content": self.user_message})

Copilot AI Feb 9, 2026


This appends a raw dict into messages, but earlier the code builds messages from to_langchain_message(...) (LangChain BaseMessage instances). Mixing dicts and BaseMessage objects is likely to break downstream LangGraph/LangChain processing (e.g., attribute access on message objects). Append a proper LangChain message object instead (e.g., langchain_core.messages.HumanMessage(content=...)) to keep types consistent.
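A small self-contained sketch of the suggested fix using langchain_core.messages.HumanMessage; the literal user_message value here is only an illustration:

    from langchain_core.messages import HumanMessage

    messages = []  # elsewhere built from to_langchain_message(...), i.e. BaseMessage objects
    user_message = "Example user input"  # stands in for self.user_message
    if user_message:
        # Append a LangChain message object so the list stays homogeneous.
        messages.append(HumanMessage(content=user_message))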

pass


apply_patch()

Copilot AI Feb 9, 2026


Calling apply_patch() at import time creates unavoidable side effects, and the patch is already triggered by importing this module in src/eval/giskard/__init__.py. Prefer making apply_patch() idempotent and calling it explicitly from __init__.py (or gating the call behind a deliberate entrypoint) so imports remain predictable and testing/mocking is easier.

Suggested change
apply_patch()
if __name__ == "__main__":
    apply_patch()
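For the idempotency half of the suggestion, a rough sketch (the _PATCHED guard and the patch body are placeholders, not the actual giskard_patch.py code):

    _PATCHED = False

    def apply_patch() -> None:
        """Apply the SciPy compatibility patch at most once."""
        global _PATCHED
        if _PATCHED:
            return
        # ... perform the actual SciPy monkey-patch here ...
        _PATCHED = True

    # Then call apply_patch() explicitly from src/eval/giskard/__init__.py.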

@@ -202,43 +202,39 @@ def save_vectorstore(self, vectorstore_folder: str, vectorstore):
from langchain_chroma import Chroma


Copilot AI Feb 9, 2026


The previous guard that avoided the hack when the persist directory already matched was removed. If the subsequent hack logic is expensive (e.g., rebuilding a client or rewriting persistence), this can become an always-on cost. If the new langchain_chroma.Chroma no longer exposes _persist_directory, consider reintroducing a lightweight guard using a supported/public way to read the underlying persist directory (e.g., via the Chroma client/settings) and only run the hack when it’s actually needed.

Suggested change
# If the vectorstore already persists to the desired directory, skip the expensive rebuild.
existing_persist_dir = getattr(vectorstore, "persist_directory", None)
if existing_persist_dir is not None and existing_persist_dir == vectorstore_folder:
    return

@property
def base_api_url(self) -> str:
return f"{self.location}-aiplatform.googleapis.com"
return f"https://{self.location}-aiplatform.googleapis.com"

Copilot AI Feb 9, 2026


base_api_url now always includes the scheme (https://...), but custom_base_api_url may still be provided without a scheme (depending on how it’s configured/serialized elsewhere). In that case, base_url could become an invalid URL. Consider normalizing custom_base_api_url to ensure it includes a scheme (or documenting/enforcing that it must be a full URL) so base_url is consistently well-formed.
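One way to normalize, sketched under the assumption that custom_base_api_url is a plain string attribute (the helper name is hypothetical):

    def _ensure_scheme(url: str) -> str:
        """Prefix https:// when the configured URL has no scheme."""
        if url.startswith(("http://", "https://")):
            return url
        return f"https://{url}"

    # e.g. base_url=_ensure_scheme(self.custom_base_api_url) if self.custom_base_api_url
    #      else self.base_api_url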

max_tokens=max_tokens,
temperature=temperature,
max_retries=2, # default is 6, instead we just try twice before failing
base_url=self.custom_base_api_url or self.base_api_url,

Copilot AI Feb 9, 2026


base_api_url now always includes the scheme (https://...), but custom_base_api_url may still be provided without a scheme (depending on how it’s configured/serialized elsewhere). In that case, base_url could become an invalid URL. Consider normalizing custom_base_api_url to ensure it includes a scheme (or documenting/enforcing that it must be a full URL) so base_url is consistently well-formed.

@marc-lehner marc-lehner requested a review from AtR1an February 9, 2026 13:57
Copilot AI review requested due to automatic review settings February 11, 2026 10:47
@AtR1an AtR1an force-pushed the enh/AP-25561-update-bundled-env-to-py-3-13 branch from 192d13c to 7bf0029 Compare February 11, 2026 10:47
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 11 out of 12 changed files in this pull request and generated 3 comments.



max_retries=2, # default is 6, instead we just try twice before failing
base_url=self.custom_base_api_url or self.base_api_url,
credentials=google_credentials,
task_type="RETRIEVAL_QUERY", # for backwards compatibility with old VertexAIEmbeddings

Copilot AI Feb 11, 2026


Missing space after '#' in inline comment. Should be '# for backwards compatibility with old VertexAIEmbeddings'


# --- Vector Stores & Indexes ---
faiss-cpu = "*" # Direct: used in indexes/faiss.py
langchain-chroma = "==1.1.0" # Direct: used in indexes/chroma.py installed via pypi because conda packages don't support python 3.13

Copilot AI Feb 11, 2026


The comment is overly long and should be split. Consider moving 'installed via pypi because conda packages don't support python 3.13' to a separate comment line above for better readability.

Suggested change
langchain-chroma = "==1.1.0" # Direct: used in indexes/chroma.py installed via pypi because conda packages don't support python 3.13
# Installed via PyPI because conda packages don't support Python 3.13
langchain-chroma = "==1.1.0" # Direct: used in indexes/chroma.py

Comment on lines +334 to +336
tool_node = ToolNode(tools, handle_tool_errors=True)
agent = create_react_agent(
chat_model, tools=tools, prompt=self.developer_message, checkpointer=memory
chat_model, tools=tool_node, prompt=self.developer_message, checkpointer=memory

Copilot AI Feb 11, 2026


The API change passes a ToolNode instance to the 'tools' parameter instead of a list of tools. This breaks the parameter naming convention and could confuse API consumers. Consider renaming the parameter or documenting this breaking change clearly.

Copilot AI review requested due to automatic review settings February 12, 2026 13:31
@tonqui tonqui force-pushed the enh/AP-25561-update-bundled-env-to-py-3-13 branch from c5451ae to 0e38d29 Compare February 12, 2026 13:31
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 14 out of 15 changed files in this pull request and generated 7 comments.



else False,
warnings=warnings if warnings is not None else True,
mode=mode if mode is not None else "python",
by_alias=by_alias if by_alias is not None else True,

Copilot AI Feb 12, 2026


by_alias is forced to True when None, which changes pydantic’s default behavior (pydantic’s model_dump(by_alias=...) default is typically False). This patch may fix the TypeError but can subtly change serialized payloads sent to the Anthropic API. Prefer converting None to the original default (likely False) or dynamically deriving the default from the wrapped function/signature so the patch doesn’t alter output semantics.

Suggested change
by_alias=by_alias if by_alias is not None else True,
by_alias=by_alias if by_alias is not None else False,
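A sketch of the "derive the default from the wrapped function" alternative, using inspect; wrapped_func stands in for whatever model_dump wrapper the patch replaces:

    import inspect

    def original_default(wrapped_func, param_name: str, fallback=False):
        """Read a parameter's default from the wrapped function's signature."""
        param = inspect.signature(wrapped_func).parameters.get(param_name)
        if param is not None and param.default is not inspect.Parameter.empty:
            return param.default
        return fallback

    # e.g. by_alias=by_alias if by_alias is not None else original_default(wrapped_func, "by_alias")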

Comment on lines +158 to 159
content = self.llm.invoke(text, stop=_stop, **kwargs)
return AIMessage(content=content)

Copilot AI Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.invoke() can return different types depending on whether self.llm is an LLM or a chat model (often str vs BaseMessage). Wrapping the return value directly into AIMessage(content=...) will produce incorrect content if the result is already a message object (you’d end up with content=<AIMessage ...>). Normalize the result (e.g., use result.content when available, otherwise use result) before constructing AIMessage.
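A small normalization helper along those lines (a sketch, not the adapter's actual code):

    from langchain_core.messages import AIMessage, BaseMessage

    def to_ai_message(result) -> AIMessage:
        """Wrap an .invoke() result, which may be a str (LLM) or a BaseMessage (chat model)."""
        content = result.content if isinstance(result, BaseMessage) else result
        return AIMessage(content=content)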

import langchain.agents

langchain.debug = self.enable_debug_output
from langchain_classic.agents import AgentExecutor

Copilot AI Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.enable_debug_output no longer influences LangChain debug output after removing langchain.debug = .... If this flag is still intended to work, reintroduce the debug toggle using the current recommended API (e.g., LangChain globals debug setter) or an equivalent mechanism compatible with the new package split.
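If the flag should keep working, the globals setter is the usual replacement for the removed module attribute; a sketch (the flag value here is only illustrative):

    from langchain_core.globals import set_debug  # also re-exported by langchain.globals

    # Replaces the removed `langchain.debug = self.enable_debug_output` assignment.
    set_debug(True)  # pass self.enable_debug_output in the actual module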

agent = agent_obj.create_agent(ctx, tools)

agent_exec = langchain.agents.AgentExecutor(
agent_exec = AgentExecutor(

Copilot AI Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.enable_debug_output no longer influences LangChain debug output after removing langchain.debug = .... If this flag is still intended to work, reintroduce the debug toggle using the current recommended API (e.g., LangChain globals debug setter) or an equivalent mechanism compatible with the new package split.

Comment on lines 433 to 434
max_retries=2, # default is 6, instead we just try twice before failing
base_url=self.custom_base_api_url or self.base_api_url,

Copilot AI Feb 12, 2026


base_api_url now includes the scheme (https://), but custom_base_api_url (by its name) could plausibly be provided as either a host-only value or a full URL. With base_url=self.custom_base_api_url or self.base_api_url, a host-only custom value would become an invalid URL, while a full URL remains fine. Consider normalizing custom_base_api_url (e.g., add https:// if missing) or documenting/enforcing that custom_base_api_url must include the scheme to avoid ambiguous/fragile configuration.

Comment on lines +420 to 426
class ChatVertexAI(ChatGoogleGenerativeAI):
    def with_structured_output(self, *args, method="function_calling", **kwargs):
        return super().with_structured_output(*args, method=method, **kwargs)

google_credentials = self._construct_google_credentials()

return ChatVertexAI(

Copilot AI Feb 12, 2026


The local subclass is named ChatVertexAI but now inherits from ChatGoogleGenerativeAI, which is confusing given Vertex vs GenAI naming. Consider renaming to something like ChatVertexAICompat/VertexAIChatCompat to better communicate that this is a compatibility shim rather than the original VertexAI class.

Suggested change
class ChatVertexAI(ChatGoogleGenerativeAI):
    def with_structured_output(self, *args, method="function_calling", **kwargs):
        return super().with_structured_output(*args, method=method, **kwargs)
google_credentials = self._construct_google_credentials()
return ChatVertexAI(
class ChatVertexAICompat(ChatGoogleGenerativeAI):
    def with_structured_output(self, *args, method="function_calling", **kwargs):
        return super().with_structured_output(*args, method=method, **kwargs)
google_credentials = self._construct_google_credentials()
return ChatVertexAICompat(

@@ -202,43 +202,39 @@ def save_vectorstore(self, vectorstore_folder: str, vectorstore):
from langchain_chroma import Chroma

vectorstore: Chroma = vectorstore

Copilot AI Feb 12, 2026


The previous guard that avoided running the persist-directory workaround unless necessary was removed. If the subsequent hack reinitializes clients or reconfigures persistence, this will now happen on every save, potentially adding overhead. Consider restoring a cheap check (where possible) to skip the workaround when the vectorstore is already configured for the target folder.

Suggested change
vectorstore: Chroma = vectorstore
vectorstore: Chroma = vectorstore
# If the vectorstore is already configured to persist to the target folder,
# skip the workaround to avoid unnecessary reinitialization.
current_persist_dir = getattr(vectorstore, "persist_directory", None)
if current_persist_dir and current_persist_dir == vectorstore_folder:
    return

AtR1an and others added 18 commits February 12, 2026 14:34
Which now provides a unified API for Google AI Studio and Vertex AI
Because chroma no longer provides the information where the vectorstore is persisted.
AP-25545 (Update to latest langchain 1.x version)
@tonqui tonqui force-pushed the enh/AP-25561-update-bundled-env-to-py-3-13 branch from 0e38d29 to 7e71aa0 Compare February 12, 2026 13:34