🐝 CommitBee

The commit message generator that actually understands your code.

Most tools in this space pipe raw git diff to an LLM and hope for the best. CommitBee parses your code with tree-sitter, maps diff hunks to symbol spans, and gives the LLM structured semantic context — producing fundamentally better commit messages, especially for complex multi-file changes.

✨ What Sets CommitBee Apart

🌳 It reads your code, not just your diffs

CommitBee uses tree-sitter to parse both the staged and HEAD versions of every changed file, in parallel across CPU cores. It extracts 10 symbol types (functions, methods, structs, enums, traits, impls, classes, interfaces, constants, type aliases) with full signatures: the LLM sees pub fn connect(host: &str, timeout: Duration) -> Result<Connection>, not just "Function connect." Methods include their parent context (impl Server > connect), so the LLM knows where a symbol lives, not just its name.

Modified symbols show old → new signature diffs, and structural AST diffs (fully implemented for structs and enums) break down exactly what changed per symbol: parameters added, return type altered, visibility widened, or body-only edits. Symbols are tracked in three states: added, removed, and modified-signature.

Cross-file relationships are detected automatically: if validator.rs calls parse() and both changed, the prompt says so. When source and test files are both staged, the prompt links them as related files.
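
To make the idea of mapping diff hunks to symbol spans concrete, here is a minimal sketch using only the standard library. The `SymbolSpan` and `Hunk` types and the `touched_symbols` function are hypothetical names chosen for illustration; they are not taken from CommitBee's source, and the real analyzer works on tree-sitter nodes rather than hand-built structs.

```rust
/// A symbol span as an analyzer might report it (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct SymbolSpan {
    pub name: String,
    pub start_line: usize, // 1-based, inclusive
    pub end_line: usize,   // 1-based, inclusive
}

/// A changed line range taken from a diff hunk header (hypothetical type).
#[derive(Debug, Clone, Copy)]
pub struct Hunk {
    pub start_line: usize, // 1-based, inclusive
    pub line_count: usize,
}

/// Returns the names of all symbols whose span overlaps at least one hunk.
pub fn touched_symbols(symbols: &[SymbolSpan], hunks: &[Hunk]) -> Vec<String> {
    symbols
        .iter()
        .filter(|sym| {
            hunks.iter().any(|h| {
                // A hunk with zero lines on this side cannot overlap anything.
                if h.line_count == 0 {
                    return false;
                }
                let hunk_end = h.start_line + h.line_count - 1;
                // Two inclusive ranges overlap when each starts before the other ends.
                h.start_line <= sym.end_line && sym.start_line <= hunk_end
            })
        })
        .map(|sym| sym.name.clone())
        .collect()
}
```

With a span for `impl Server > connect` covering lines 10 to 25 and a hunk covering lines 20 to 22, the function reports that one symbol as touched; a hunk that falls between two spans reports nothing.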

Supported languages: Rust, TypeScript, JavaScript, Python, Go, Java, C, C++, Ruby, C# — all enabled by default, individually toggleable via Cargo feature flags. Files in other languages still get full diff context — just without symbol extraction.

🧠 It reasons about what changed

Before the LLM generates anything, CommitBee computes deterministic evidence from your code and encodes it as hard constraints in the prompt:

Bug-fix evidence in the diff → fix. No bug evidence → the LLM can't call it a fix.
Formatting-only changes (whitespace, import reordering) → style. Detected both heuristically and via per-symbol whitespace classification.
Doc-vs-code distinction — changes that only touch doc comments are classified separately from code changes, preventing refactor when docs is correct.
Import change detection — added/removed imports are surfaced explicitly so the LLM can distinguish dependency wiring from logic changes.
Test-to-code ratio — when >80% of additions are test code, the type is inferred as test.
Dependency-only changes → chore. Always.
Public API removed → breaking change flagged automatically.
MSRV bumps, engines.node, requires-python changes → metadata-aware breaking detection.

Commit types are driven by code analysis, not LLM guesswork. The prompt includes computed EVIDENCE flags, CONSTRAINTS the model must follow, the primary change for subject anchoring, a character budget for the subject line, and anti-hallucination rules with negative examples.
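
The evidence-to-constraint step described in this section can be pictured roughly as follows. This is a sketch under stated assumptions: the `Evidence` struct, its field names, and the priority order of the checks are invented for illustration, and only the individual rules (dependency-only means chore, formatting-only means style, more than 80% test additions means test, no fix without bug evidence) come from the list above.

```rust
/// Deterministic evidence flags (hypothetical; field names are illustrative).
#[derive(Debug, Default, Clone, Copy)]
pub struct Evidence {
    pub has_bug_evidence: bool,
    pub formatting_only: bool,
    pub docs_only: bool,
    pub dependency_only: bool,
    pub added_lines: usize,
    pub added_test_lines: usize,
}

/// Returns the commit type the evidence forces, or `None` when the model may choose.
/// The order of the checks is an assumption, not CommitBee's documented priority.
pub fn constrained_type(ev: &Evidence) -> Option<&'static str> {
    if ev.dependency_only {
        return Some("chore");
    }
    if ev.formatting_only {
        return Some("style");
    }
    if ev.docs_only {
        return Some("docs");
    }
    // "More than 80% of additions are test code", in integer arithmetic.
    if ev.added_lines > 0 && ev.added_test_lines * 5 > ev.added_lines * 4 {
        return Some("test");
    }
    if ev.has_bug_evidence {
        return Some("fix");
    }
    None
}

/// Checks a proposed type against the evidence: a forced type must match,
/// and `fix` is rejected whenever there is no bug-fix evidence.
pub fn type_allowed(ev: &Evidence, proposed: &str) -> bool {
    match constrained_type(ev) {
        Some(required) => proposed == required,
        None => proposed != "fix",
    }
}
```

Keeping the decision in plain code like this is what makes the constraint deterministic: the same staged changes always yield the same allowed types, regardless of what the model would have guessed.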

✅ It validates and corrects its own output

Every generated message passes through a 7-rule validation pipeline:

Fix requires evidence — no bug comments, no fix type
Breaking change detection — removed public APIs must be flagged
Anti-hallucination — breaking change text can't copy internal field names
Mechanical changes must use style
Dependency-only changes must use chore
Subject specificity — rejects generic messages like "update code" or "improve things"
Subject length — enforces the 72-character first line limit

If any rule fails, CommitBee appends targeted correction instructions and re-prompts the LLM — up to 3 attempts, re-validating after each. The final output goes through a sanitizer that strips thinking blocks, extracts JSON from code fences, removes conversational preambles, and wraps the body at 72 characters. You get a clean, spec-compliant conventional commit or a clear error — never a silently mangled message.
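
A validate-then-re-prompt loop of this kind might look like the sketch below. It covers only two of the seven rules (subject length and subject specificity), and the function names, the correction wording, and the use of a closure in place of a real LLM call are all assumptions made for illustration.

```rust
/// Checks two of the rules listed above: subject length and subject specificity.
/// Returns one correction instruction per failed rule.
pub fn validate_subject(subject: &str) -> Vec<String> {
    let mut corrections = Vec::new();
    if subject.chars().count() > 72 {
        corrections.push("Shorten the first line to at most 72 characters.".to_string());
    }
    // The description is whatever follows "type(scope): ", if that prefix exists.
    let description = subject.split_once(": ").map(|(_, d)| d).unwrap_or(subject);
    let normalized = description.trim().to_lowercase();
    let generic = ["update code", "improve things"];
    if generic.iter().any(|g| normalized == *g) {
        corrections.push("Name the specific change instead of a generic subject.".to_string());
    }
    corrections
}

/// Calls `generate` with the accumulated corrections until the result validates,
/// giving up after `max_attempts` attempts.
pub fn generate_validated<F>(mut generate: F, max_attempts: usize) -> Result<String, String>
where
    F: FnMut(&[String]) -> String,
{
    let mut corrections: Vec<String> = Vec::new();
    for _ in 0..max_attempts {
        let candidate = generate(&corrections);
        let problems = validate_subject(&candidate);
        if problems.is_empty() {
            return Ok(candidate);
        }
        // Feed the failed rules back into the next attempt.
        corrections.extend(problems);
    }
    Err(format!("no valid message after {max_attempts} attempts"))
}
```

Returning `Result` mirrors the behaviour described above: the caller gets either a message that passed every check or an explicit error, never an unvalidated candidate.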

🔀 It splits multi-concern commits

When your staged changes mix independent work (a bugfix in one module + a refactor in another), CommitBee detects it and offers to split them into separate, well-typed commits. The splitter uses diff-shape fingerprinting combined with Jaccard similarity on content vocabulary — files are grouped by the actual shape and language of their changes, not just by directory. Symbol dependency merging keeps related files together even when their diff shapes differ: if foo() is removed from one file and added in another, they stay in the same commit.

⚡ Commit split suggested — 2 logical change groups detected:

  Group 1: feat(llm)  [2 files]
    [M] src/services/llm/anthropic.rs (+20 -5)
    [M] src/services/llm/openai.rs (+8 -3)

  Group 2: fix(sanitizer)  [1 file]
    [M] src/services/sanitizer.rs (+3 -1)

? Split into separate commits? (Y/n)
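
For readers unfamiliar with Jaccard similarity, the sketch below shows the measure applied to two small vocabularies. The tokenizer is a deliberately simple stand-in (the `vocabulary` function is a hypothetical helper), and nothing here reflects how CommitBee combines the score with diff-shape fingerprints or what threshold it uses.

```rust
use std::collections::HashSet;

/// Splits text into a set of lowercase identifier-like tokens
/// (a simple stand-in for real vocabulary extraction).
pub fn vocabulary(text: &str) -> HashSet<String> {
    text.split(|c: char| !c.is_alphanumeric() && c != '_')
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Jaccard similarity: |A ∩ B| / |A ∪ B|, defined here as 0.0 for two empty sets.
pub fn jaccard(a: &HashSet<String>, b: &HashSet<String>) -> f64 {
    let union = a.union(b).count();
    if union == 0 {
        return 0.0;
    }
    a.intersection(b).count() as f64 / union as f64
}
```

Two signatures that share four of eight distinct tokens score 0.5; identical vocabularies score 1.0. Files whose changes score high against each other are candidates for the same group.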

The pipeline

┌─────────┐    ┌──────────┐    ┌────────────┐    ┌──────────┐    ┌───────────┐    ┌─────────┐
│  Stage  │ →  │   Git    │ →  │ Tree-sitter│ →  │  Split   │ →  │  Context  │ →  │   LLM   │
│ Changes │    │  Service │    │  Analyzer  │    │ Detector │    │  Builder  │    │Provider │
└─────────┘    └──────────┘    └────────────┘    └──────────┘    └───────────┘    └─────────┘
                    │                │                 │                │               │
               Staged diff      Symbol spans     Group files      Budget-aware     Commit message
               + file list      (functions,      by module,       prompt with      (conventional
                                classes, etc.)   suggest split    semantic context    format)

And there's more

🏠 Local-first — Ollama by default. Your code never leaves your machine. No API keys needed.
🔒 Secret scanning — 24 built-in patterns across 13 categories. Full diff scanning before truncation with accurate hunk-based source line numbering. Add custom patterns or disable built-ins via config.
⚡ Streaming — Real-time token display from all 3 providers (Ollama, OpenAI, Anthropic) with Ctrl+C cancellation.
📊 Token budget — Smart truncation that prioritizes the most important files within ~6K tokens. Automatically rebalances when structural diffs are available.
🎯 Multi-candidate & editing — Generate up to 5 messages, pick the best one, and refine it in your system editor before committing.
🪝 Git hooks — prepare-commit-msg hook with TTY detection for safe non-interactive fallback.
🔍 Prompt debug — --show-prompt shows exactly what the LLM sees. Full transparency.
🩺 Doctor — commitbee doctor checks config, connectivity, and model availability.
🐚 Shell completions — bash, zsh, fish, powershell via commitbee completions.
⚙️ 5-level config — Defaults → project .commitbee.toml → user config → env vars → CLI flags.
🦀 Single binary — ~18K lines of Rust. Compiles to one static binary with LTO. No runtime dependencies.
🧪 440 tests — Unit, snapshot, property (proptest for never-panic guarantees), and integration (wiremock).
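
As an illustration of the token budget item in the list above, budget-aware selection can be approximated with a greedy pass over files sorted by priority. This is a sketch, not CommitBee's algorithm: the `FileDiff` type, the priority values, and the four-characters-per-token estimate are all assumptions.

```rust
/// A changed file with a rough size and a priority (hypothetical type).
#[derive(Debug, Clone)]
pub struct FileDiff {
    pub path: String,
    pub priority: u32, // higher means more important
    pub est_tokens: usize,
}

/// Rough token estimate: about four characters per token, rounded up.
/// This is a common rule of thumb, not an exact tokenizer.
pub fn estimate_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Picks files in descending priority order, skipping any that would overflow the budget.
pub fn select_within_budget(files: &[FileDiff], budget: usize) -> Vec<String> {
    let mut ordered: Vec<&FileDiff> = files.iter().collect();
    // Stable sort keeps the original order among files with equal priority.
    ordered.sort_by(|a, b| b.priority.cmp(&a.priority));
    let mut used = 0;
    let mut selected = Vec::new();
    for file in ordered {
        if used + file.est_tokens <= budget {
            used += file.est_tokens;
            selected.push(file.path.clone());
        }
    }
    selected
}
```

With a budget of 6000 tokens, a high-priority source file of 4000 tokens and a medium-priority file of 1500 tokens both fit, while a low-priority 5000-token lockfile is left out.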

📦 Installation

From source

cargo install commitbee

Build from repository

git clone https://github.com/sephyi/commitbee.git
cd commitbee
cargo build --release

The binary will be at ./target/release/commitbee.

Requirements

Rust 1.94+ (edition 2024)
Ollama running locally (default provider) — Install Ollama
A model pulled in Ollama (recommended: qwen3.5:4b)

ollama pull qwen3.5:4b

🚀 Quick Start

# Stage your changes
git add src/feature.rs

# Generate and commit interactively
commitbee

# Preview without committing
commitbee --dry-run

# Auto-confirm and commit
commitbee --yes

# See what the LLM sees
commitbee --show-prompt

That's it. CommitBee works with zero configuration if Ollama is running locally.

If CommitBee saves you time, consider sponsoring the project 💛

📖 Documentation

Full Guide — configuration, providers, splitting, validation, troubleshooting
PRD & Roadmap — product requirements and future plans

🔧 Configuration

Run commitbee init to create a config file. Works out of the box with zero config if Ollama is running locally.

See Configuration for the full config reference, environment variables, and layering priority.
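
The layering order (defaults, then project .commitbee.toml, then user config, then env vars, then CLI flags) amounts to a fold in which each later layer overrides whatever earlier layers set. The sketch below illustrates that idea only; the `Layer` struct and its `provider` and `model` fields are hypothetical and do not describe CommitBee's actual config schema.

```rust
/// One configuration layer; `None` means "not set at this level" (hypothetical fields).
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Layer {
    pub provider: Option<String>,
    pub model: Option<String>,
}

/// Merges layers given from lowest to highest priority.
/// A value set in a later layer overrides the same value from earlier layers.
pub fn resolve(layers: &[Layer]) -> Layer {
    layers.iter().fold(Layer::default(), |acc, next| Layer {
        provider: next.provider.clone().or(acc.provider),
        model: next.model.clone().or(acc.model),
    })
}
```

Passing the layers in the documented order means a CLI flag wins over an environment variable, which in turn wins over anything read from a file, while unset values fall through to the defaults.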

💻 Usage

commitbee [OPTIONS] [COMMAND]

Options

| Flag | Description |
| --- | --- |
| `--dry-run` | Print message only, don't commit |
| `--yes` | Auto-confirm and commit |
| `-n, --generate` | Generate N candidates (1-5, default 1) |
| `--no-split` | Disable commit split suggestions |
| `--no-scope` | Disable scope in commit messages |
| `--clipboard` | Copy message to clipboard instead of committing |
| `--exclude <GLOB>` | Exclude files matching glob pattern (repeatable) |
| `--allow-secrets` | Allow committing with detected secrets |
| `--verbose` | Show symbol extraction details |
| `--show-prompt` | Debug: display the full LLM prompt |

Commands

| Command | Description |
| --- | --- |
| `init` | Create a config file |
| `config` | Show current configuration |
| `doctor` | Check configuration and connectivity |
| `completions <shell>` | Generate shell completions |
| `hook install` | Install prepare-commit-msg hook |
| `hook uninstall` | Remove prepare-commit-msg hook |
| `hook status` | Check if hook is installed |

🔒 Security

Before any content is sent to an LLM provider, CommitBee scans it with 24 built-in patterns across 13 categories, including:

☁️ Cloud providers — AWS access/secret keys, GCP service accounts & API keys, Azure storage keys
🤖 AI/ML — OpenAI, Anthropic, HuggingFace tokens
🔧 Source control — GitHub (PAT, fine-grained, OAuth), GitLab tokens
💬 Communication — Slack tokens & webhooks, Discord webhooks
💳 Payment & SaaS — Stripe, Twilio, SendGrid, Mailgun keys
🗄️ Database — MongoDB, PostgreSQL, MySQL, Redis, AMQP connection strings
🔐 Cryptographic — PEM private keys, JWT tokens
🔑 Generic — API key assignments, quoted/unquoted secrets
⚠️ Merge conflict detection — Prevents committing unresolved conflicts

Add custom patterns or disable built-ins in your config:

custom_secret_patterns = ["CUSTOM_KEY_[a-zA-Z0-9]{32}"]
disabled_secret_patterns = ["Generic Secret (unquoted)"]

The default provider (Ollama) runs entirely on your machine. No data leaves your network unless you explicitly configure a cloud provider.
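
Hunk-based line numbering, mentioned earlier for the secret scanner, can be sketched as follows: walk a unified diff, take the new-side start line from each `@@` header, and advance the counter only for lines that exist in the new file. The function names are hypothetical, and the matcher is a hand-rolled stand-in for one pattern (the string "AKIA" followed by 16 uppercase letters or digits); CommitBee's own patterns and matching engine are not shown here.

```rust
/// A finding in the new version of a file: (line number, line text).
pub type Finding = (usize, String);

/// Parses the new-side start line from a hunk header such as "@@ -10,4 +12,6 @@".
fn new_start(header: &str) -> Option<usize> {
    let plus = header.split_whitespace().find(|p| p.starts_with('+'))?;
    plus.trim_start_matches('+').split(',').next()?.parse().ok()
}

/// Scans added lines of a unified diff and reports those for which `is_secret`
/// returns true, numbered as they appear in the new file.
pub fn scan_added_lines(diff: &str, is_secret: impl Fn(&str) -> bool) -> Vec<Finding> {
    let mut findings = Vec::new();
    let mut line_no = 0usize;
    let mut in_hunk = false;
    for line in diff.lines() {
        if line.starts_with("diff --git") {
            in_hunk = false; // a new file section begins; skip its headers
        } else if line.starts_with("@@") {
            if let Some(start) = new_start(line) {
                line_no = start;
                in_hunk = true;
            }
        } else if !in_hunk {
            continue; // file headers such as "--- a/path" and "+++ b/path"
        } else if let Some(added) = line.strip_prefix('+') {
            if is_secret(added) {
                findings.push((line_no, added.to_string()));
            }
            line_no += 1;
        } else if line.starts_with(' ') {
            line_no += 1; // context lines also exist in the new file
        }
        // Lines starting with '-' exist only in the old file and do not advance the counter.
    }
    findings
}

/// Illustrative matcher: "AKIA" followed by 16 uppercase letters or digits.
pub fn looks_like_aws_key_id(line: &str) -> bool {
    line.match_indices("AKIA").any(|(i, _)| {
        line[i + 4..]
            .chars()
            .take(16)
            .filter(|c| c.is_ascii_uppercase() || c.is_ascii_digit())
            .count()
            == 16
    })
}
```

Because removed lines never advance the counter, a reported line number points at the right place in the staged file even when the hunk mixes additions and deletions.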

🧪 Testing

cargo test   # 440 tests — unit, snapshot (insta), property (proptest), integration (wiremock)

See Testing Strategy for the full breakdown.

🗺️ Changelog

See CHANGELOG.md for the full version history.

Current: v0.6.0 Semantic Intelligence — Interactive refine/edit UX, native clipboard support, full AST diffs (structs, enums, classes, interfaces, traits), parent scope extraction, import change detection, doc-vs-code distinction, test file correlation, test-to-code ratio inference, change intent detection with confidence scoring, semantic markers (unsafe/derive/decorators/exports), token-budget rebalance, and 442 total tests.

🤝 Contributing

Contributions are welcome! By contributing, you agree to the Contributor License Agreement — you'll be asked to sign it when you open your first pull request.

💛 Sponsor

If you find CommitBee useful, consider sponsoring my work — it helps keep the project going.

📄 License

This project is dual-licensed under AGPL-3.0-only and a commercial license. See LICENSE for details.

REUSE compliant — every file carries SPDX headers.
