Skip to content

[fix] Use cursor.description to detect result-returning queries#75

Open
jonasbrami wants to merge 1 commit into
apache:masterfrom
jonasbrami:fix-cte-cursor-description
Open

[fix] Use cursor.description to detect result-returning queries#75
jonasbrami wants to merge 1 commit into
apache:masterfrom
jonasbrami:fix-cte-cursor-description

Conversation

@jonasbrami

Copy link
Copy Markdown
Contributor

Replace brittle sql_upper.startswith() keyword check with cursor.description, which reliably detects any statement that returns a result set (SELECT, SHOW, DESCRIBE, EXPLAIN, CTEs, etc.) without maintaining a hardcoded list of SQL keywords.

Replace brittle sql_upper.startswith() keyword check with
cursor.description, which reliably detects any statement that
returns a result set (SELECT, SHOW, DESCRIBE, EXPLAIN, CTEs, etc.)
without maintaining a hardcoded list of SQL keywords.
@catpineapple

Copy link
Copy Markdown
Member

Blocking: missing regression tests.

This classification branch has already caused Issue #62 Bug 5 once, and the cases motivating this PR (leading comments, parenthesized SELECT) aren't covered by tests either. Without tests, the same bug class will regress again.

Please add regression tests covering at least:

  • /* c */ SELECT 1 — leading SQL comment
  • WITH t AS (SELECT 1) SELECT * FROM t — CTE, guards Issue [some bugs] #62 Bug 5
  • SHOW TABLES / DESC <tbl> / EXPLAIN SELECT 1
  • INSERT / UPDATE / CREATE TABLE — locks in the else branch behavior

@catpineapple catpineapple left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution, but some additional test cases are needed to ensure the robustness of doris MCP.

FreeOnePlus pushed a commit that referenced this pull request May 21, 2026
* [fix] Use cursor.description to detect result-returning queries

Replace brittle sql_upper.startswith() keyword check with
cursor.description, which reliably detects any statement that
returns a result set (SELECT, SHOW, DESCRIBE, EXPLAIN, CTEs, etc.)
without maintaining a hardcoded list of SQL keywords.

* Fix max_rows bypass for comment-prefixed SELECT in execute_sql_for_mcp

Same root cause as #75: sql.upper().startswith("SELECT") at
query_executor.py:689 returns False when SQL begins with a leading
'--' or '/* */' comment, so the auto-injected LIMIT {max_rows} is
silently skipped and large queries can be dispatched unbounded.

This callsite runs before cursor.execute, so cursor.description is not
available. Use a small helper that strips leading SQL comments before
extracting the first keyword.

* Add regression tests for result-set detection contract

These tests pin the user-facing contract of DorisConnection.execute():
any SQL the driver reports as producing a result set must return its
rows, regardless of how the statement is phrased.

Guards against regression of:
- Issue #62 Bug 5 (CTE / WITH returning empty data)
- The leading-comment bug fixed in #75 (data: [] with row_count > 0
  for SELECT prefixed by '--' or '/* */')
- The same-root-cause LIMIT bypass fixed in the previous commit

Also adds unit tests for the new get_first_sql_keyword helper.

---------

Co-authored-by: jonas.brami <jonasbrami@gmail.com>
@mortalBibo

Copy link
Copy Markdown
Contributor

Hi @jonasbrami, heads up — #87 has been merged. It preserves your commit as-is at the base of the branch (so you remain the author of the cursor.description fix) and adds the regression tests requested by @catpineapple, plus a sister fix for the same root cause in query_executor.py:689.

The underlying bug is now fixed on master, so this PR can be closed at your convenience. Thanks for the original fix!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants