docs: Improve getting started and testing guides for humans and agents #20970

Open
alamb wants to merge 6 commits into apache:main from alamb:alamb/update_dev_docs_for_agents

Contributor

@alamb commented Mar 16, 2026

Which issue does this PR close?

  • Closes #.

Rationale for this change

This PR is a follow-up to #20939 from @Dandandan.

The goal is to make it easier for both humans and agents to get started making changes in this repository and to create pull requests efficiently.

The repository already had the necessary contributor information, but it was spread across multiple documents and not easy to discover quickly.

What changes are included in this PR?

This PR makes the most important setup, testing, and pre-PR checks easier to find from the contributor guide and from AGENTS.md:

  • Add a quick-start setup section to the contributor guide with the shortest path to a working local environment.
  • Add a testing quick-start section summarizing the most important tests to run before submitting a PR.
  • Add a “Before Submitting a PR” section to centralize formatting and lint guidance.
  • Update AGENTS.md to point to the canonical contributor guide sections instead of duplicating setup and testing instructions.

Are these changes tested?

This PR updates documentation only.

Are there any user-facing changes?

This improves contributor-facing documentation and makes setup / testing guidance easier to discover, but it does not change DataFusion runtime behavior or public APIs.

@alamb added the documentation label Mar 16, 2026
## Windows setup
## Quick Start

For the fastest path to a working local environment, follow these steps
Contributor Author

I pulled most of this out of agents.md and left a link there instead

Alternatively a binary release can be downloaded from the [Release Page](https://github.com/protocolbuffers/protobuf/releases) or [built from source](https://github.com/protocolbuffers/protobuf/blob/main/src/README.md).

-## Bootstrap environment
+## Bootstrap Environment
Contributor Author

Made heading consistent.

DataFusion has support for [dev containers](https://containers.dev/) which may be used for
developing DataFusion in an isolated environment either locally or remote if desired. Using dev containers for developing
-DataFusion is not a requirement by any means but is available for those where doing local development could be tricky
+DataFusion is not a requirement but is available where doing local development could be tricky
Contributor Author

drive by cleanup to make this more concise

is not accidentally broken during refactorings. All new features
should have test coverage and the entire test suite is run as part of CI.

## Testing Quick Start
Contributor Author

This is based on what started in AGENTS.md, but I made it slightly easier to understand.

AGENTS.md Outdated
- `cargo clippy` catches common mistakes and enforces idiomatic Rust patterns. All warnings must be resolved (treated as errors via `-D warnings`).

Do not commit code that fails either of these checks.
See [Before Submitting a PR](docs/source/contributor-guide/index.md#before-submitting-a-pr)
Contributor Author

moved and expanded content

Contributor

In my experience, agents read the .md file on startup but do not always go on to read all the linked materials. The links still help, because tools such as grep can find the linked files.

So to make sure they see instructions like `cargo fmt`, you'll have to put them in AGENTS.md itself.

Contributor

Perhaps keep the duplication for now?

Contributor

Or perhaps we can try to verify that this works, or be more explicit that the agent must read docs/source/contributor-guide/index.md#before-submitting-a-pr before opening a PR.

Contributor Author

Added in 350b910

@alamb alamb marked this pull request as ready for review March 16, 2026 19:34
AGENTS.md Outdated

Do not commit code that fails either of these checks.
See [Before Submitting a PR](docs/source/contributor-guide/index.md#before-submitting-a-pr)
for the required formatting and lint checks.
Contributor

Perhaps also link the PR template before creating a PR?

https://github.com/apache/datafusion/blob/main/.github/pull_request_template.md

Contributor Author

Added in 350b910

@Dandandan changed the title from "docs: Improve getting started and testing guides for humans and hgents" to "docs: Improve getting started and testing guides for humans and agents" Mar 17, 2026
Contributor

@adriangb left a comment

Overall looks great! It's always a bit hard to see how agents will react to these instructions. I see Daniel left some feedback, we can address before merging but can also merge and tweak based on reports of real world results / things that agents seem to get wrong repeatedly.

Contributor

@2010YOUY01 left a comment

Thank you, good to see an AGENTS.md! Also left some small suggestions.

Comment on lines +38 to +39
Then, run the `sqllogictest` suite, which is the main regression suite for SQL
behavior and covers most DataFusion features.
Contributor

Suggested change
-Then, run the `sqllogictest` suite, which is the main regression suite for SQL
-behavior and covers most DataFusion features.
+Then, run the `sqllogictest` suite, which provides a strong speed–coverage tradeoff for development: it runs quickly while offering broad regression coverage across most SQL behavior in DataFusion.

Comment on lines +110 to +111
./ci/scripts/rust_fmt.sh
./ci/scripts/rust_clippy.sh
Contributor

Suggested change
-./ci/scripts/rust_fmt.sh
-./ci/scripts/rust_clippy.sh
+./dev/rust_lint.sh
+# use the `--write` flag to automatically fix some formatting and lint errors
+# ./dev/rust_lint.sh --write --allow-dirty

This script is the entry point for all non-functional tests. It includes the previous two scripts as well as several others.
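
For illustration only, the "run every check before opening a PR" flow discussed in this thread could be sketched as a small shell wrapper. `run_checks` is a hypothetical helper, not something in the DataFusion repository; it only assumes that each check is a command or script path that exits non-zero on failure.

```shell
# Hypothetical helper (not part of the DataFusion repository):
# run each check in order and stop at the first one that fails.
run_checks() {
  for check in "$@"; do
    echo "running: ${check}"
    # Each argument is treated as a single command or script path.
    if ! "${check}"; then
      echo "FAILED: ${check}" >&2
      return 1
    fi
  done
  echo "all checks passed"
}
```

From the repository root, it might be called as `run_checks ./ci/scripts/rust_fmt.sh ./ci/scripts/rust_clippy.sh`, or with `./dev/rust_lint.sh` alone if that script already covers the other two, as the comment above says.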
