docs: Improve getting started and testing guides for humans and agents #20970
base: main
Changes from all commits
e9fceb7
2ed084f
20db17d
83e2793
350b910
014b72e
37d8daa
d55f664
6411347
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -21,7 +21,38 @@ | |
|
|
||
| This section describes how you can get started at developing DataFusion. | ||
|
|
||
| ## Windows setup | ||
| ## Quick Start | ||
|
|
||
| For the fastest path to a working local environment, follow these steps | ||
| from the repository root: | ||
|
|
||
| ```shell | ||
| # 1. Install Rust (https://rust-lang.org/tools/install/) and verify the active toolchain with | ||
| rustup show | ||
|
|
||
| # 2. Install protoc 3.15+ (see details below) | ||
| protoc --version | ||
|
|
||
| # 3. Download test data used by examples and many tests | ||
| git submodule update --init --recursive | ||
|
|
||
| # 4. Build the workspace | ||
| cargo build | ||
|
|
||
| # 5. Verify that Rust integration tests can be run | ||
| cargo test -p datafusion --test parquet_integration | ||
|
|
||
| # 6. Verify that sqllogictests can run | ||
| cargo test --profile=ci --test sqllogictests | ||
| ``` | ||
|
|
||
| Notes: | ||
|
|
||
| - The pinned Rust version is defined in `rust-toolchain.toml`. | ||
| - `protoc` is required to compile DataFusion from source. | ||
| - Some tests and examples rely on git submodule data being present locally. | ||
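As an illustration of the third note (not part of the PR), `git submodule status` prefixes each uninitialized submodule with `-`, so filtering for those lines shows whether the test data still needs to be fetched. The helper name `uninitialized_submodules` and the sample hashes below are hypothetical.

```shell
# Hypothetical helper: reads `git submodule status` output on stdin and
# prints the paths of submodules that are not yet initialized
# (git marks those lines with a leading "-").
uninitialized_submodules() {
  awk '/^-/ { print $2 }'
}

# Example with sample status output (placeholder hashes, illustrative paths).
printf '%s\n' \
  '-aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa parquet-testing' \
  ' bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb testing (heads/main)' \
  | uninitialized_submodules
```

From the repository root, `git submodule status --recursive | uninitialized_submodules` would list anything still missing; empty output suggests step 3 of the quick start has already been done.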
|
|
||
| ## Windows Setup | ||
|
|
||
| ```shell | ||
| wget https://az792536.vo.msecnd.net/vms/VMBuild_20190311/VirtualBox/MSEdge/MSEdge.Win10.VirtualBox.zip | ||
|
|
@@ -34,19 +65,19 @@ cargo build | |
|
|
||
| DataFusion has support for [dev containers](https://containers.dev/) which may be used for | ||
| developing DataFusion in an isolated environment either locally or remotely if desired. Using dev containers for developing | ||
| DataFusion is not a requirement by any means but is available for those where doing local development could be tricky | ||
| DataFusion is not a requirement but is available where doing local development could be tricky | ||
|
Contributor
Author
Drive-by cleanup to make this more concise. |
||
| such as with Windows and WSL2, those with older hardware, etc. | ||
|
|
||
| For specific details on IDE support for dev containers see the documentation for [Visual Studio Code](https://code.visualstudio.com/docs/devcontainers/containers), | ||
| [IntelliJ IDEA](https://www.jetbrains.com/help/idea/connect-to-devcontainer.html), | ||
| [Rust Rover](https://www.jetbrains.com/help/rust/connect-to-devcontainer.html), and | ||
| [GitHub Codespaces](https://docs.github.com/en/codespaces/setting-up-your-project-for-codespaces/adding-a-dev-container-configuration/introduction-to-dev-containers). | ||
|
|
||
| ## Protoc Installation | ||
| ## `protoc` Installation | ||
|
|
||
| Compiling DataFusion from sources requires an installed version of the protobuf compiler, `protoc`. | ||
|
|
||
| On most platforms this can be installed from your system's package manager | ||
| On most platforms this can be installed from your system's package manager. For example: | ||
|
|
||
| ``` | ||
| # Ubuntu | ||
|
|
@@ -71,7 +102,7 @@ libprotoc 3.15.0 | |
|
|
||
| Alternatively a binary release can be downloaded from the [Release Page](https://github.com/protocolbuffers/protobuf/releases) or [built from source](https://github.com/protocolbuffers/protobuf/blob/main/src/README.md). | ||
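Whichever installation route is used, it can help to confirm that the installed version meets the 3.15+ requirement from the quick start. The following is a minimal sketch and not part of the DataFusion tooling: `version_ge` and `check_protoc` are hypothetical helper names, and the comparison assumes a `sort` that supports `-V` (GNU coreutils does).

```shell
# Hypothetical helper: succeeds when version $1 is at least version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: checks the installed protoc against a minimum version.
# `protoc --version` prints a line such as "libprotoc 3.15.0".
check_protoc() {
  min="${1:-3.15}"
  if ! command -v protoc >/dev/null 2>&1; then
    echo "protoc not found on PATH"
    return 1
  fi
  found="$(protoc --version | awk '{ print $2 }')"
  if version_ge "$found" "$min"; then
    echo "protoc $found satisfies >= $min"
  else
    echo "protoc $found is older than $min"
    return 1
  fi
}

# Demonstrate the comparison on literal version strings.
version_ge "3.15.0" "3.15" && echo "3.15.0 is new enough"
version_ge "3.14.0" "3.15" || echo "3.14.0 is too old"
```

Calling `check_protoc` with no arguments checks the `protoc` on `PATH` against 3.15; a different minimum can be passed as the first argument.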
|
|
||
| ## Bootstrap environment | ||
| ## Bootstrap Environment | ||
|
Contributor
Author
Made heading consistent. |
||
|
|
||
| DataFusion is written in Rust and it uses a standard Rust toolkit: | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -23,6 +23,37 @@ Tests are critical to ensure that DataFusion is working properly and | |
| is not accidentally broken during refactorings. All new features | ||
| should have test coverage and the entire test suite is run as part of CI. | ||
|
|
||
| ## Testing Quick Start | ||
|
Contributor
Author
This is based on what started in AGENTS.md, but I made it slightly easier to understand.
||
|
|
||
| While developing a feature or bug fix, best practice is to run the smallest set | ||
| of tests that gives confidence for your change, then expand as needed. | ||
|
|
||
| Initially, run the tests in the crates you changed. For example, if you made changes | ||
| to files in `datafusion-optimizer/src`, run the corresponding crate tests: | ||
|
|
||
| ```shell | ||
| cargo test -p datafusion-optimizer | ||
| ``` | ||
|
|
||
| Then, run the `sqllogictest` suite, which provides a strong speed–coverage tradeoff for development: it runs quickly while offering broad regression coverage across most SQL behavior in DataFusion. | ||
|
|
||
| ```shell | ||
| cargo test --profile=ci --test sqllogictests | ||
| ``` | ||
|
|
||
| Finally, before submitting a PR, run the tests for the core `datafusion` and | ||
| `datafusion-cli` crates: | ||
|
|
||
| ```shell | ||
| cargo test -p datafusion | ||
| cargo test -p datafusion-cli | ||
| ``` | ||
|
|
||
| Some integration tests require optional external services such as Docker-backed | ||
| containers and may skip when unavailable. | ||
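The staged order described in this section (changed crates first, then `sqllogictests`, then the core crates) can be sketched as a small wrapper. This is illustrative only and not part of the PR: `run_stage`, `staged_tests`, and the `DRY_RUN` variable are hypothetical names. By default the sketch only prints the commands it would run.

```shell
# Hypothetical helper: echo a command, and execute it unless DRY_RUN=1
# (the default here, so nothing is built or run by accident).
run_stage() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    return 0
  fi
  "$@"
}

# Hypothetical helper: run tests from the narrowest scope outward,
# stopping at the first failing stage.
# Arguments: the crates you changed, e.g. datafusion-optimizer.
staged_tests() {
  for crate in "$@"; do
    run_stage cargo test -p "$crate" || return 1
  done
  run_stage cargo test --profile=ci --test sqllogictests || return 1
  run_stage cargo test -p datafusion || return 1
  run_stage cargo test -p datafusion-cli || return 1
}

# Dry run: list the stages for a change to the optimizer crate.
staged_tests datafusion-optimizer
```

Setting `DRY_RUN=0` before calling `staged_tests` would execute each stage for real; because every stage is chained with `|| return 1`, a failure in the narrow crate tests stops the run before the slower stages start.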
|
|
||
| ## Testing Overview | ||
|
|
||
| DataFusion has several levels of tests in its [Test Pyramid] and tries to follow | ||
| the Rust standard [Testing Organization] described in [The Book]. | ||
|
|
||
|
|
||
I pulled most of this out of agents.md and left a link there instead