PLTF-2046: add test-improvement-workflow skills by aivong-openhands · Pull Request #225 · OpenHands/extensions

aivong-openhands · 2026-05-01T03:43:43Z

Summary

- Adds skills/test-improvement-workflow/SKILL.md as a concise top-level orchestrator for improving test suites with Dave Farley's 8 properties
- Splits the detailed guidance into three focused reference skills:
  - skills/test-prioritization-framework/SKILL.md
  - skills/test-validation-checklist/SKILL.md
  - skills/test-improvement-patterns/SKILL.md
- Registers all four skills in marketplaces/openhands-extensions.json under the testing category
- Runs sync_extensions.py to update the README catalog (51 total extensions, 42 skills)

Skill Overview

The test-improvement-workflow skill now stays small and progressive-disclosure-friendly by delegating deeper guidance to three focused skills:

1. Audit - Run test-design-reviewer to get a Farley Score
2. Prioritize - Open test-prioritization-framework to classify issues into CRITICAL / HIGH / MEDIUM tiers
3. Present - Always show the prioritized improvements table, even for exemplary suites
4. Validate - Open test-validation-checklist to verify each improvement addresses a real code issue
5. Plan - Assign the right companion skill to each approved task
6. Execute - Open test-improvement-patterns for safe execution loops and common fix patterns
7. Verify - Re-audit in a new conversation for unbiased evaluation
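
As a rough illustration, the seven steps and the skill each one opens can be written down as plain data. The `WorkflowStep` type and the `skills_to_open` helper are hypothetical names used only for this sketch and are not part of the repository. Steps for which the overview names no skill are left as `None`.

```python
from typing import NamedTuple, Optional


class WorkflowStep(NamedTuple):
    name: str
    skill: Optional[str]  # skill named in the overview, or None if none is named


# Step order and skill names are copied from the Skill Overview above.
WORKFLOW = [
    WorkflowStep("Audit", "test-design-reviewer"),
    WorkflowStep("Prioritize", "test-prioritization-framework"),
    WorkflowStep("Present", None),
    WorkflowStep("Validate", "test-validation-checklist"),
    WorkflowStep("Plan", None),
    WorkflowStep("Execute", "test-improvement-patterns"),
    WorkflowStep("Verify", None),
]


def skills_to_open(steps):
    """Return the distinct skills referenced by the steps, in workflow order."""
    seen = []
    for step in steps:
        if step.skill is not None and step.skill not in seen:
            seen.append(step.skill)
    return seen
```

Calling `skills_to_open(WORKFLOW)` lists the audit skill followed by the three reference skills this PR adds, which is the delegation the overview describes.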

Triggers on: test improvement workflow, improve test quality, refactor tests, audit tests and fix them

Test plan

uv sync --group test
uv run python scripts/sync_extensions.py --check
uv run pytest -q tests/test_sync_extensions.py tests/test_skill_plugin_loading.py tests/test_skills_have_readme.py tests/test_workflow_sync.py
Each new skill has README.md, .plugin/plugin.json, and vendor symlinks
The main orchestrator skill is reduced from 603 lines to 103 lines
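
The file-presence item in the test plan can be approximated with a small script. This is a minimal sketch that assumes each skill lives at `skills/<name>/` and must contain `SKILL.md`, `README.md`, and `.plugin/plugin.json`. The function name `missing_skill_files` is hypothetical, and the vendor symlinks are not checked because their names are not given in this thread. The repository's own tests remain the authoritative check.

```python
from pathlib import Path

# Files each skill directory is expected to contain, per the test plan above.
REQUIRED_FILES = ("SKILL.md", "README.md", ".plugin/plugin.json")


def missing_skill_files(skills_root: Path, skill_names: list[str]) -> dict[str, list[str]]:
    """Return {skill_name: [missing relative paths]} for incomplete skills."""
    missing = {}
    for name in skill_names:
        skill_dir = skills_root / name
        absent = [rel for rel in REQUIRED_FILES if not (skill_dir / rel).is_file()]
        if absent:
            missing[name] = absent
    return missing
```

An empty result means every listed skill has all three files. A missing README.md or .plugin/plugin.json, as in the failing checks discussed later in the thread, would show up as an entry for that skill.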

This PR description was updated by an AI agent (OpenHands) on behalf of aivong-openhands.

Adds a systematic 7-step workflow for improving test suite quality using Dave Farley's 8 properties of good tests. Orchestrates the test-design-reviewer, tdd, testing, and refactoring skills into a complete audit-prioritize-validate-implement-verify cycle.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

aivong-openhands · 2026-05-01T03:49:28Z

@OpenHands follow up on failing PR checks

openhands-ai · 2026-05-01T03:49:41Z

I'm on it! aivong-openhands can track my progress at all-hands.dev

aivong-openhands · 2026-05-01T03:53:41Z

Followed up on the failing checks and pushed fix commit 8ac0f2d to add-test-improvement-workflow.

What I fixed:

- added the missing skills/test-improvement-workflow/README.md
- added skills/test-improvement-workflow/.plugin/plugin.json
- ran scripts/sync_extensions.py to refresh the README catalog and generate the expected plugin symlinks

Verification:

- uv run --group test python scripts/sync_extensions.py --check
- uv run --group test pytest tests/ -q
- GitHub PR checks are now all passing

This comment was created by an AI agent (OpenHands) on behalf of the user.

openhands-ai · 2026-05-01T03:54:22Z

Final summary of new work since my last update:

- Completed the PR follow-up fully.
- Confirmed the fix commit 8ac0f2d was pushed to add-test-improvement-workflow.
- Waited for GitHub Actions and verified all PR checks passed.
- Corrected the malformed PR comment and updated it with a clean summary of the fixes.
- Confirmed the working tree is clean afterward.

Conciseness check:

- No additional code changes were made beyond what was needed.
- The only post-fix action was repairing the PR comment text; no extraneous repository changes were introduced.

all-hands-bot

🔴 Needs improvement - Violates fundamental principles

This skill has solid content and addresses a real need (test quality improvement), but violates two core repository principles: it depends on skills that don't exist in this repo, and it violates the progressive disclosure principle at 590 lines.

See inline comments for details.

all-hands-bot · 2026-05-01T04:03:36Z

Review Summary

[CRITICAL ISSUES] (Must fix - these break fundamental principles)

- [skills/test-improvement-workflow/SKILL.md, Line 28] Broken Dependencies: References four skills from an external repository (Paul Hammond's dotfiles) that don't exist in OpenHands/extensions. The skill cannot function without these prerequisites.
- [skills/test-improvement-workflow/SKILL.md, Lines 18-23] Invalid Dependencies: The frontmatter declares dependencies on test-design-reviewer, tdd, refactoring, and testing, none of which exist in this repository.

[IMPROVEMENT OPPORTUNITIES] (Should fix - violates good taste)

[skills/test-improvement-workflow/SKILL.md, Overall] Progressive Disclosure Violation: At 590 lines, this violates the AgentSkills principle of progressive disclosure. The skill should be split into:
- Main orchestrator (~50-100 lines)
- Separate reference skills for prioritization framework, validation checklist, and improvement patterns
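
A size check along these lines could surface oversized skills before review. This is only a sketch: the function name `skills_over_budget` is hypothetical, the `skills/<name>/SKILL.md` layout is assumed from the paths in this thread, and the line budget is left as a caller-supplied parameter because the "~50-100 lines" figure above is the reviewer's guidance rather than an enforced limit.

```python
from pathlib import Path


def skills_over_budget(skills_root: Path, max_lines: int) -> dict[str, int]:
    """Map skill name -> SKILL.md line count for every skill exceeding max_lines."""
    over = {}
    for skill_md in sorted(skills_root.glob("*/SKILL.md")):
        with skill_md.open(encoding="utf-8") as handle:
            line_count = sum(1 for _ in handle)
        if line_count > max_lines:
            over[skill_md.parent.name] = line_count
    return over
```

With a hypothetical budget of 150 lines, a 603-line SKILL.md would be reported while a 103-line orchestrator would pass.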

[RISK ASSESSMENT]

⚠️ Risk Assessment: 🟡 MEDIUM

This PR adds a new skill without modifying existing functionality, so blast radius is limited. However, it introduces broken dependencies that will prevent the skill from working, and violates architectural principles around skill design (progressive disclosure, composability).

Risk factors:

- Pattern conformance: Violates the progressive disclosure pattern documented in AGENTS.md
- Dependency integrity: Creates broken references to external skills not in this repository
- Blast radius: Low - isolated to new skill addition, doesn't affect existing extensions

Recommendation: Address the critical dependency issues before merging. Either import the dependent skills or make this skill self-contained/properly documented as requiring external dependencies.

VERDICT:

❌ Needs rework: Fundamental design issues must be addressed first

KEY INSIGHT:

This skill attempts to orchestrate external dependencies that don't exist in the repository, creating a broken user experience where the skill appears available but cannot function.

…with installation instructions

Removes broken `dependencies` from frontmatter (the four prerequisite skills live in citypaul/.dotfiles, not this repo) and replaces the bare prerequisites list with a dedicated Installation section that links to the upstream source and provides a copy/paste shell snippet.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

all-hands-bot

🟢 Good taste - Elegant solution to the progressive disclosure problem.

Successfully transformed an oversized 603-line skill into a clean orchestrator pattern (103 lines) that delegates to three focused reference skills. External dependencies (Paul Hammond's skills) are properly documented with source URLs rather than bundled.

[RISK ASSESSMENT]

[Overall PR] ⚠️ Risk Assessment: 🟢 LOW
- Documentation/content change adding new skills
- No breaking changes to existing functionality
- External dependencies clearly documented as prerequisites
- All required files present (README.md, .plugin/plugin.json, symlinks)
- Test plan shows successful validation

VERDICT:
✅ Worth merging - Addresses previous critical feedback and follows all repository conventions

KEY INSIGHT:
This is exactly what progressive disclosure should look like - a concise entry point that delegates to focused reference materials when deeper guidance is needed.

xingyaoww

@aivong-openhands I wonder if we can just create ONE plugin that bundles all these skills in it? 👀

aivong-openhands · 2026-05-01T15:22:30Z

> @aivong-openhands I wonder if we can just create ONE plugin that bundles all these skills in it? 👀

What is the difference between a plugin and a skill in this repo?

aivong-openhands · 2026-05-09T02:20:15Z

@OpenHands resolve merge conflicts

openhands-ai · 2026-05-09T02:20:27Z

I'm on it! aivong-openhands can track my progress at all-hands.dev

openhands-ai · 2026-05-09T02:28:50Z

aivong-openhands · 2026-05-09T02:34:40Z

@OpenHands What is the difference between a plugin and a skill in this repo?

openhands-ai · 2026-05-09T02:34:52Z

I'm on it! aivong-openhands can track my progress at all-hands.dev

aivong-openhands · 2026-05-09T02:35:50Z

@aivong-openhands In this repo, the practical distinction is:

| Type | What it is | Typical contents | When to use it |
| --- | --- | --- | --- |
| Skill | Reusable agent guidance / knowledge | `SKILL.md`, optional `README.md`, references | Use when the extension is mostly instructions, heuristics, checklists, or workflow guidance |
| Plugin | A packaged extension that can include executable behavior | `SKILL.md` plus `scripts/`, `hooks/`, `commands/`, and sometimes bundled `skills/` | Use when you need code, lifecycle hooks, slash commands, API calls, or a higher-level package that orchestrates behavior |

A couple of repo-specific nuances:

- Many entries under skills/ also ship a .plugin/plugin.json manifest plus vendor symlinks so tools like Codex / Claude Code can load them. That does not make them a repo-level plugin; they are still conceptually skills.
- A plugin can bundle one or more supporting skills. plugins/openhands is an example of that pattern.

So for this PR specifically: these new additions are mostly guidance content, and splitting them into separate skills keeps them individually discoverable and progressive-disclosure-friendly.

If we wanted a single top-level package that exposed commands and orchestrated several helper skills behind the scenes, then one plugin bundling multiple skills would be a reasonable design too.
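
The distinction in the table can be turned into a rough directory heuristic. This sketch is an assumption built only from the "Typical contents" column above. It is not how the repository or any loader actually decides, and the name `classify_extension` is hypothetical. It deliberately ignores `.plugin/plugin.json`, since skills ship that manifest too.

```python
from pathlib import Path

# Directory names the table lists as typical plugin contents (assumed heuristic).
PLUGIN_MARKERS = ("scripts", "hooks", "commands", "skills")


def classify_extension(extension_dir: Path) -> str:
    """Guess 'plugin' or 'skill' from directory contents; 'unknown' otherwise."""
    if any((extension_dir / marker).is_dir() for marker in PLUGIN_MARKERS):
        return "plugin"
    if (extension_dir / "SKILL.md").is_file():
        return "skill"
    return "unknown"
```

Under this heuristic, a directory holding only `SKILL.md`, `README.md`, and a `.plugin/` manifest is reported as a skill, while one with a bundled `skills/` directory is reported as a plugin.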

This comment was created by an AI agent (OpenHands) on behalf of aivong-openhands.

openhands-ai · 2026-05-09T02:36:33Z

Final summary of new work since the last summary:

- Fully addressed the new PR comment on PLTF-2046: add test-improvement-workflow skills #225.
- Reviewed the repo context and PR context before responding:
  - confirmed the PR is #225 on branch add-test-improvement-workflow
  - checked the repo docs (AGENTS.md, README.md) for the repo-specific distinction between skills and plugins
  - fetched PR metadata and recent comments from the GitHub API for context
- Posted a reply on the PR explaining the distinction:
  - skill = reusable guidance / instructions / workflow knowledge
  - plugin = packaged extension that can include executable behavior like scripts, hooks, commands, and can bundle skills
  - noted the repo nuance that some skills/ also ship .plugin/plugin.json and vendor symlinks for loader compatibility, but they are still conceptually skills
  - noted that a single plugin bundling multiple skills would be a valid design if executable orchestration were desired

Conciseness check:

- No code changes were made.
- No repository files were modified.
- The work was limited to understanding context and answering the PR question.

Posted reply:
#225 (comment)

aivong-openhands · 2026-05-09T02:37:58Z

@OpenHands create a plugin that bundles all of these skills

openhands-ai · 2026-05-09T02:38:11Z

I'm on it! aivong-openhands can track my progress at all-hands.dev

aivong-openhands · 2026-05-09T02:44:07Z

Done - I added a bundled test-improvement plugin and pushed it to add-test-improvement-workflow in commit b98aec0.

What changed:

- added plugins/test-improvement/ with a plugin manifest and README
- bundled these skills via plugin skills/ symlinks:
  - test-improvement-workflow
  - test-prioritization-framework
  - test-validation-checklist
  - test-improvement-patterns
- registered the new plugin in marketplaces/openhands-extensions.json
- ran scripts/sync_extensions.py so the README catalog now includes the plugin
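
The symlink bundling described above might look roughly like the following. This is a sketch under assumptions: the links are placed at `plugins/<plugin>/skills/<name>` and point at `skills/<name>` through a relative path, and the helper name `bundle_skills` is hypothetical. The thread does not show the exact link layout, and in the repository scripts/sync_extensions.py is what is reported to generate plugin symlinks.

```python
import os
from pathlib import Path


def bundle_skills(repo_root: Path, plugin_name: str, skill_names: list[str]) -> list[Path]:
    """Create plugins/<plugin_name>/skills/<name> symlinks to skills/<name>."""
    plugin_skills = repo_root / "plugins" / plugin_name / "skills"
    plugin_skills.mkdir(parents=True, exist_ok=True)
    links = []
    for name in skill_names:
        source = repo_root / "skills" / name
        if not source.is_dir():
            raise FileNotFoundError(f"skill directory not found: {source}")
        link = plugin_skills / name
        if link.is_symlink():
            link.unlink()  # replace a stale link so reruns are idempotent
        # A relative target keeps the link valid when the repo is cloned elsewhere.
        target = os.path.relpath(source, start=plugin_skills)
        link.symlink_to(target, target_is_directory=True)
        links.append(link)
    return links
```

Using relative targets is a design choice for this sketch. Absolute targets would break as soon as the checkout moves to another path.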

Validation:

uv run python scripts/sync_extensions.py --check
uv run pytest -q tests/test_sync_extensions.py tests/test_skill_plugin_loading.py tests/test_plugin_manifest.py tests/test_skills_have_readme.py tests/test_workflow_sync.py

This comment was created by an AI agent (OpenHands) on behalf of aivong-openhands.

openhands-ai · 2026-05-09T02:45:02Z

aivong-openhands and others added 5 commits April 30, 2026 22:42

Update SKILL.md to include author name

e8b8702

Revise SKILL.md to include Paul Hammond's reference

0f175b0

Updated the description to reference Paul Hammond's skills and added a link to the individual skills.

Fix formatting issue in SKILL.md

5cae181

Correct description in test-improvement-workflow

77a452e

Updated the description to reference Paul Hammond's test-design-reviewer instead of Dave Farley's.

fix: add missing test-improvement-workflow metadata

8ac0f2d

Co-authored-by: openhands <openhands@all-hands.dev>

aivong-openhands marked this pull request as ready for review May 1, 2026 04:00

aivong-openhands requested review from VascoSch92 and xingyaoww May 1, 2026 04:00

all-hands-bot reviewed May 1, 2026

Comment thread skills/test-improvement-workflow/SKILL.md

Comment thread skills/test-improvement-workflow/SKILL.md Outdated

Comment thread skills/test-improvement-workflow/SKILL.md Outdated

aivong-openhands and others added 2 commits April 30, 2026 23:44

refactor: split test improvement workflow skill

04acae5

Co-authored-by: openhands <openhands@all-hands.dev>

aivong-openhands requested a review from all-hands-bot May 1, 2026 05:03

aivong-openhands changed the title ~~feat: add test-improvement-workflow skill~~ feat: add test-improvement-workflow skills May 1, 2026

all-hands-bot approved these changes May 1, 2026

aivong-openhands changed the title ~~feat: add test-improvement-workflow skills~~ PLTF-2046: add test-improvement-workflow skills May 1, 2026

xingyaoww reviewed May 1, 2026

Comment thread marketplaces/openhands-extensions.json Outdated

xingyaoww reviewed May 1, 2026

fix: use hyphens in marketplace descriptions

3ead538

Co-authored-by: openhands <openhands@all-hands.dev>

openhands-agent added 2 commits May 9, 2026 02:25

Merge origin/main into add-test-improvement-workflow

aea5523

- resolve README.md and AGENTS.md merge conflicts
- regenerate README catalog via sync_extensions.py
- fix load_project_skills imports for released SDK compatibility

Co-authored-by: openhands <openhands@all-hands.dev>

fix: align SDK imports with released package

6588073

- regenerate the OpenHands SDK skill from docs
- update sdk loading tests to use current marketplace/skills modules

Co-authored-by: openhands <openhands@all-hands.dev>

aivong-openhands requested a review from xingyaoww May 9, 2026 02:30

Add bundled test-improvement plugin

b98aec0

Co-authored-by: openhands <openhands@all-hands.dev>

4 participants
