Bug: UnboundLocalError crash in RAG fallback path + streaming endpoint yields `None` to clients

While reviewing the backend source code, I found 4 bugs, 2 of which can cause runtime crashes or corrupt client output. These affect both the API server and the CLI chatbot.

Bug 1:  UnboundLocalError crash in retriever_rag.py
File: [backend/src/agents/retriever_rag.py](https://github.com/The-OpenROAD-Project/ORAssistant/blob/main/backend/src/agents/retriever_rag.py)

When inbuilt_tool_calling is False and the LLM response does not contain the key "tool_names", the variable tool_calls is never assigned, but is still referenced on line 177:

```python
# Line 162-177
if "tool_names" in str(response):
    tool_calls = response.get("tool_names", [])  # ← only assigned inside this if-block
    for tool in tool_calls:
        if tool not in self.tool_names:
            logging.warning(f"Tool {tool} not found in tool list.")
            tool_calls.remove(tool)
else:
    logging.warning(str(response))
    logging.warning("Tool selection failed. Returning empty tool list.")
    # ← tool_calls is NOT assigned here!
if "rephrased_question" in str(response):
    state["messages"][-1].content = response.get("rephrased_question")
else:
    logging.warning("Rephrased question not found in response.")
return {"tools": tool_calls}  # ← 💥 UnboundLocalError if "tool_names" was not in response
``
Impact: Any user query that triggers the non-inbuilt-tool-calling path and the LLM returns a response without "tool_names" will crash the entire request with UnboundLocalError: local variable 'tool_calls' referenced before assignment.

Fix:

```python
# Initialize before the if/else block:
tool_calls: list = []
if "tool_names" in str(response):
    tool_calls = response.get("tool_names", [])
    ...
```
Additionally, the tool_calls.remove(tool) on line 167 mutates the list while iterating over it, which can skip elements. This should use a list comprehension filter instead.

Bug 2: Streaming endpoint yields None to clients
File: [backend/src/api/routers/conversations.py](https://github.com/The-OpenROAD-Project/ORAssistant/blob/main/backend/src/api/routers/conversations.py)
```python
if chunk == "on_chat_model_stream" and current_llm_call_count == 2:
    ...
    if isinstance(message_content, AIMessageChunk):
        msg = message_content.content
    else:
        msg = None  # ← msg is set to None
    if msg:
        chunks.append(str(msg))
    yield str(msg) + "\n\n"  # ← yields "None\n\n" to the client!
```
The yield on line 408 is outside the if msg: guard, so when msg is None (non-AIMessageChunk events), the string "None\n\n" is streamed to the frontend.

Impact: Users see the literal text None randomly interspersed in streamed responses.

Fix:

```python
if msg:
    chunks.append(str(msg))
    yield str(msg) + "\n\n"  # ← move yield inside the guard
```
Bug 3: str.replace() result discarded, tool descriptions never cleaned
File: [backend/src/agents/retriever_rag.py](https://github.com/The-OpenROAD-Project/ORAssistant/blob/main/backend/src/agents/retriever_rag.py)
```python
text_desc = render_text_description([tool])
text_desc.replace("(query: str) -> Tuple[str, list[str], list[str]]", " ")  # ← result not saved!
self.tool_descriptions += text_desc + "\n\n"
```
Python strings are immutable, str.replace() returns a new string, but the result is never assigned back. The original text_desc (with the noisy type signature) is used as-is, making tool descriptions sent to the LLM unnecessarily verbose.

Fix:

```python
text_desc = text_desc.replace("(query: str) -> Tuple[str, list[str], list[str]]", " ")
```
Bug 4: In-memory chat_history dict grows unboundedly (memory leak)
File: [backend/src/api/routers/conversations.py](https://github.com/The-OpenROAD-Project/ORAssistant/blob/main/backend/src/api/routers/conversations.py)

```python
chat_history: dict[UUID, list[dict[str, str]]] = {}
```
When the database is disabled (USE_DB=false), every new conversation UUID is added to this global dict but never evicted. Over time, this causes unbounded memory growth in long-running deployments.

Suggestion: Use collections.OrderedDict with a maximum size, or an LRU cache (e.g., functools.lru_cache or cachetools.LRUCache), to cap the number of in-memory conversations.

I am happy to open a PR fixing all 4 bugs if this issue is confirmed. The fixes are straightforward and can be done in a single PR with corresponding test updates. please assign this issue to me @luarss 

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bug: UnboundLocalError crash in RAG fallback path + streaming endpoint yields `None` to clients #243

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Bug: UnboundLocalError crash in RAG fallback path + streaming endpoint yields None to clients #243

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

Bug: UnboundLocalError crash in RAG fallback path + streaming endpoint yields `None` to clients #243