Skip to content

Add canonical semantic and grep skill flows#2

Open
rodion-m wants to merge 2 commits intomainfrom
COD-XXX-search-surface-split
Open

Add canonical semantic and grep skill flows#2
rodion-m wants to merge 2 commits intomainfrom
COD-XXX-search-surface-split

Conversation

@rodion-m
Copy link
Copy Markdown
Member

@rodion-m rodion-m commented Apr 8, 2026

Summary

  • add canonical semantic and grep client methods and a new grep script
  • update skill guidance and CLI smoke coverage for the new search split
  • keep fetch and relationship workflows aligned with canonical MCP/API behavior

Testing

  • pytest -q

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new grep.py script for exact text and regex searches and refactors the existing search functionality into a dedicated semantic_search method. It also adds support for filtering search results by path, extension, and a maximum result count across the CLI tools and the underlying API client. Feedback focuses on improving the robustness of CLI argument parsing, specifically by adding error handling for non-integer inputs to the --max-results flag and ensuring consistent file path extraction logic in the new grep tool.

Comment on lines +26 to +34
location = result.get("location", {})
file_path = location.get("path") or result.get("path")
matches = result.get("matches", [])

output.append(f"\n--- Result #{idx} [{result.get('kind', 'Artifact')}] ---")
if file_path:
output.append(f" File: {file_path}")
if result.get("identifier"):
output.append(f" Identifier: {result['identifier']}")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The result formatting in grep.py is missing logic to check for filePath and to extract the file path from the identifier if explicit path fields are missing. This logic is present in search.py and should be included here for consistency and to ensure the file path is displayed whenever possible.

        location = result.get("location", {})
        file_path = location.get("path") or result.get("filePath") or result.get("path")
        identifier = result.get("identifier", "")
        matches = result.get("matches", [])

        if not file_path and identifier and "::" in identifier:
            parts = identifier.split("::")
            if len(parts) >= 2:
                file_path = parts[1]

        output.append(f"\n--- Result #{idx} [{result.get('kind', 'Artifact')}] ---")
        if file_path:
            output.append(f"  File: {file_path}")
        if identifier:
            output.append(f"  Identifier: {identifier}")

Comment on lines +75 to +77
elif arg == "--max-results" and i + 1 < len(sys.argv):
max_results = int(sys.argv[i + 1])
i += 2
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The int() conversion for --max-results will raise a ValueError and cause the script to crash with a stack trace if a non-integer value is provided. It is better to handle this gracefully with a user-friendly error message.

Suggested change
elif arg == "--max-results" and i + 1 < len(sys.argv):
max_results = int(sys.argv[i + 1])
i += 2
elif arg == "--max-results" and i + 1 < len(sys.argv):
try:
max_results = int(sys.argv[i + 1])
except ValueError:
print(f"Error: --max-results must be an integer, got '{sys.argv[i + 1]}'", file=sys.stderr)
sys.exit(1)
i += 2

Comment on lines +87 to +89
else:
data_sources.append(arg)
i += 1
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current argument parsing logic treats any unknown argument as a data source. This can lead to confusing behavior if a user makes a typo in a flag (e.g., --max-result instead of --max-results), as the typo will be added to the list of data sources and likely cause an API error later. It is safer to validate that unknown arguments do not start with --.

        elif arg.startswith("--"):
            print(f"Error: Unknown option '{arg}'", file=sys.stderr)
            sys.exit(1)
        else:
            data_sources.append(arg)
            i += 1

Comment on lines +114 to 116
if arg == "--max-results" and i + 1 < len(sys.argv):
max_results = int(sys.argv[i + 1])
i += 2
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The int() conversion for --max-results will raise a ValueError if a non-integer value is provided. Handling this gracefully provides a better user experience.

Suggested change
if arg == "--max-results" and i + 1 < len(sys.argv):
max_results = int(sys.argv[i + 1])
i += 2
if arg == "--max-results" and i + 1 < len(sys.argv):
try:
max_results = int(sys.argv[i + 1])
except ValueError:
print(f"Error: --max-results must be an integer, got '{sys.argv[i + 1]}'", file=sys.stderr)
sys.exit(1)
i += 2

Comment on lines +504 to +506
elif arg == "--max-results" and i + 1 < len(sys.argv):
max_results = int(sys.argv[i + 1])
i += 2
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The int() conversion for --max-results should be wrapped in a try-except block to avoid a stack trace on invalid user input.

Suggested change
elif arg == "--max-results" and i + 1 < len(sys.argv):
max_results = int(sys.argv[i + 1])
i += 2
elif arg == "--max-results" and i + 1 < len(sys.argv):
try:
max_results = int(sys.argv[i + 1])
except ValueError:
print(f"Error: --max-results must be an integer, got '{sys.argv[i + 1]}'", file=sys.stderr)
sys.exit(1)
i += 2

Comment on lines +544 to +546
elif arg == "--max-results" and i + 1 < len(sys.argv):
max_results = int(sys.argv[i + 1])
i += 2
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The int() conversion for --max-results should be wrapped in a try-except block to avoid a stack trace on invalid user input.

Suggested change
elif arg == "--max-results" and i + 1 < len(sys.argv):
max_results = int(sys.argv[i + 1])
i += 2
elif arg == "--max-results" and i + 1 < len(sys.argv):
try:
max_results = int(sys.argv[i + 1])
except ValueError:
print(f"Error: --max-results must be an integer, got '{sys.argv[i + 1]}'", file=sys.stderr)
sys.exit(1)
i += 2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant