Skip to content

fix(evals): Use cast instead of parse, fix batch spans#4890

Open
ssbushi wants to merge 1 commit intomainfrom
sb/castNotParse
Open

fix(evals): Use cast instead of parse, fix batch spans#4890
ssbushi wants to merge 1 commit intomainfrom
sb/castNotParse

Conversation

@ssbushi
Copy link
Contributor

@ssbushi ssbushi commented Mar 5, 2026

image image

Checklist (if applicable):

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request streamlines the evaluation process by replacing explicit Zod parsing with direct type casting for certain data structures, simplifying data validation. It also significantly enhances observability for batch evaluations by introducing more granular tracing, ensuring each test case within a batch is individually tracked.

Highlights

  • Replaced Zod parsing with direct type casting: Updated data handling in evaluate.ts to use direct type assertions (as Type) instead of ZodSchema.parse() for Dataset, GenerateActionOptions, and GenerateResponseData, simplifying data validation and reducing runtime overhead.
  • Improved batch evaluation tracing: Refactored the evaluation logic in evaluator.ts to create individual tracing spans for each test case within a batch, providing more granular visibility and debugging capabilities for evaluation runs.
  • Introduced runBatch helper function: Extracted the core logic for processing a batch of evaluations into a new runBatch async function, enhancing code modularity and readability within evaluator.ts.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • genkit-tools/common/src/eval/evaluate.ts
    • Removed DatasetSchema, GenerateActionOptionsSchema, and GenerateResponseSchema imports.
    • Replaced DatasetSchema.parse() with direct type casting (as Dataset).
    • Replaced GenerateActionOptionsSchema.parse() with direct type casting (as GenerateActionOptions).
    • Replaced GenerateResponseSchema.parse() with direct type casting (as GenerateResponseData).
  • js/ai/src/evaluator.ts
    • Removed SPAN_TYPE_ATTR import.
    • Refactored the evaluator function to conditionally create batch spans only for batches with more than one test case.
    • Extracted the logic for running a batch of evaluations into a new runBatch async function.
    • Modified runBatch to create a new tracing span for each individual test case within the batch.
Activity
  • The pull request was opened by ssbushi.
  • The author included two images in the description, likely demonstrating the fix or new behavior.
  • A checklist was provided in the PR description, indicating items like conventional commit title, testing, and documentation updates.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request replaces parse with type assertions in several places within the genkit-tools/common/src/eval/evaluate.ts and js/ai/src/evaluator.ts files. It also removes an unused import and refactors the batch processing logic in js/ai/src/evaluator.ts to improve span creation and error handling.

throw new Error(`Dataset ${datasetId} is empty`);
}
inferenceDataset = DatasetSchema.parse(dataset);
inferenceDataset = dataset as Dataset;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

high: Using a type assertion here bypasses the validation provided by DatasetSchema.parse. While this might improve performance, it removes a layer of safety and could lead to runtime errors if the data doesn't conform to the expected schema. Consider adding a comment explaining why the validation is skipped and ensuring that the data source is trusted or validated elsewhere.

Suggested change
inferenceDataset = dataset as Dataset;
inferenceDataset = dataset as Dataset; // Skipping schema validation for performance, ensure data is validated elsewhere

testCaseId: sample.testCaseId ?? generateTestCaseId(),
}));
inferenceDataset = DatasetSchema.parse(rawData);
inferenceDataset = rawData as Dataset;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

high: Similar to the previous comment, this type assertion bypasses schema validation. Ensure that rawData conforms to the Dataset type or consider validating it before the assertion.

Suggested change
inferenceDataset = rawData as Dataset;
inferenceDataset = rawData as Dataset; // Skipping schema validation for performance, ensure data is validated elsewhere

renderedPrompt = {
traceId: runActionResponse.telemetry?.traceId!,
result: GenerateActionOptionsSchema.parse(runActionResponse.result),
result: runActionResponse.result as GenerateActionOptions,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

high: This type assertion bypasses the validation provided by GenerateActionOptionsSchema.parse. Ensure that runActionResponse.result conforms to the GenerateActionOptions type or consider validating it before the assertion.

      result: runActionResponse.result as GenerateActionOptions; // Skipping schema validation for performance, ensure data is validated elsewhere


function getErrorFromModelResponse(obj: any): string | undefined {
const response = GenerateResponseSchema.parse(obj);
const response = obj as GenerateResponseData;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

high: This type assertion bypasses the validation provided by GenerateResponseSchema.parse. Ensure that obj conforms to the GenerateResponseData type or consider validating it before the assertion.

  const response = obj as GenerateResponseData; // Skipping schema validation for performance, ensure data is validated elsewhere

Comment on lines +373 to +374
batchMetadata?: any
): Promise<EvalResponses> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

medium: Consider adding a type definition for batchMetadata to improve code clarity and maintainability. This will also help prevent unexpected type-related errors in the future.

  options: any,
  batchMetadata?: Record<string, any>
): Promise<EvalResponses> {

Comment on lines +375 to +377
if (batchMetadata) {
batchMetadata.input = batch;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

medium: It would be beneficial to add a check to ensure batchMetadata exists before attempting to access its input property. This can prevent potential runtime errors if batchMetadata is unexpectedly undefined.

  if (batchMetadata) {
    batchMetadata.input = batch;
  }

@ssbushi ssbushi changed the title fix: Use cast instead of parse, fix batch spans fix(genkit-tools/evals): Use cast instead of parse, fix batch spans Mar 5, 2026
@ssbushi ssbushi changed the title fix(genkit-tools/evals): Use cast instead of parse, fix batch spans fix(evals): Use cast instead of parse, fix batch spans Mar 5, 2026
Copy link
Member

@MichaelDoyle MichaelDoyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. What was the reason for the casting change, btw?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

2 participants