Add Local Scripts to Reproduce Full CI and Perform Auto-Fixes #19227

@2010YOUY01

Is your feature request related to a problem or challenge?

Motivation

AI coding agents are now capable of handling many simple, mechanical tasks in DataFusion. When assigning such tasks, it would be ideal if these agents could verify locally that their changes pass the full CI suite before opening a PR. Currently, the only way to check CI results is to submit a PR and wait for all remote CI jobs to complete, which can take around an hour and slows down iteration.

To improve this workflow, we should provide a simple script that reproduces the entire CI pipeline locally:

./dev/ci.sh # Run full CI locally

Without such a script, AI agents must infer CI behavior from configuration files and may spend unnecessary time/tokens running CI jobs one by one.

Note that we already have ./dev/rust_lint.sh for all lint-related checks, but it is not complete yet and does not include the various tests. A full CI reproduction script should wrap all existing CI test steps into scripts, and use them in both the local CI runner script and the GitHub workflow configuration .yml files.
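
As a rough illustration of the shape such a runner could take, here is a minimal sketch. Only ./dev/rust_lint.sh comes from the existing repo; the helper names (run_step, summarize) and the cargo steps listed in main are assumptions for illustration, and the real step list would need to mirror the GitHub workflow files. It keeps going after a failure so one run reports every failing step.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of dev/ci.sh: run each CI step locally and report all failures.
# Helper names and the step list are assumptions, not the project's actual CI definition.
set -o pipefail

FAILED_STEPS=""

# Run one named step. On failure, record the name instead of aborting,
# so a single run reports every failing step.
run_step() {
  local name="$1"
  shift
  echo "==> ${name}: $*"
  if "$@"; then
    echo "==> ${name}: ok"
  else
    echo "==> ${name}: FAILED"
    FAILED_STEPS="${FAILED_STEPS} ${name}"
  fi
  return 0
}

# Print the outcome; return non-zero if any step failed.
summarize() {
  if [ -z "${FAILED_STEPS}" ]; then
    echo "All CI steps passed"
    return 0
  fi
  echo "Failed steps:${FAILED_STEPS}"
  return 1
}

main() {
  # Assumed step list; the real one should come from the workflow .yml files.
  run_step lint ./dev/rust_lint.sh
  run_step check cargo check --workspace --all-targets
  run_step test cargo test --workspace
  summarize
}

# Run the steps only when executed as ci.sh; sourcing this file from
# another script just defines the helpers.
if [ "${0##*/}" = "ci.sh" ]; then
  main "$@"
  exit $?
fi
```

If each step were its own small script, the workflow .yml jobs could call the same scripts, which would keep the local and remote runs from drifting apart.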

Auto-Fix Script

We should also provide a companion script that performs best-effort automatic fixes:

./auto-fix.sh

This script would handle routine cleanups such as:

  • running cargo fmt
  • generating docs
  • adding Apache headers to newly added files
  • applying safe Clippy auto-fixes
  • any other common mechanical steps
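
A minimal sketch of what this could look like, under stated assumptions: the helper names (license_header, add_license_header) are hypothetical, "newly added files" is approximated as untracked .rs files, and doc generation is left out because the exact commands are project-specific. The script deliberately does not use `set -e`, so each fix is attempted even if an earlier one fails.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of an auto-fix script; helper names and step list are assumptions.
set -o pipefail

# Standard ASF source header, formatted as Rust line comments.
license_header() {
  cat <<'EOF'
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements.  See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership.  The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License.  You may obtain a copy of the License at
//
//   http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied.  See the License for the
// specific language governing permissions and limitations
// under the License.
EOF
}

# Prepend the header to a file unless it already carries the ASF notice.
add_license_header() {
  local file="$1" tmp
  if grep -q "Licensed to the Apache Software Foundation" "$file"; then
    return 0
  fi
  tmp="$(mktemp)"
  { license_header; echo; cat "$file"; } > "$tmp"
  # Write back in place so the original file permissions are kept.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

main() {
  # Format all crates in the workspace.
  cargo fmt --all
  # Apply machine-applicable Clippy suggestions; the --allow-* flags let
  # this run on a working tree with uncommitted or staged changes.
  cargo clippy --fix --allow-dirty --allow-staged --workspace --all-targets
  # Add headers to untracked Rust files (files git does not know yet).
  git ls-files --others --exclude-standard -- '*.rs' | while IFS= read -r f; do
    add_license_header "$f"
  done
}

# Run the fixes only when executed as auto-fix.sh; sourcing just defines the helpers.
if [ "${0##*/}" = "auto-fix.sh" ]; then
  main "$@"
fi
```

The header helper is idempotent, so running the script repeatedly should not stack headers on the same file.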

This improves the developer experience and also lets AI coding agents iterate faster and spend fewer tokens.

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

No response

Labels: enhancement
