[Feature Proposal] Implementation of Keypoint Detection Head (Architecture & Loss) #468

@Kagura-Ahad

Search before asking

  • I have searched the RF-DETR issues and found no similar feature requests.

Description

This follows up on the discussions in #418 and #169, and on recent conversations about the roadmap for keypoint support.
I understand that fully pre-training a pose model is computationally heavy and currently a lower priority than OBB (#56). However, I would like to contribute now by implementing the architectural components, so that the model is "pose-ready" when the team decides to allocate compute resources for training.

Proposed Implementation Plan:

  1. Model Architecture: Extend the Transformer Decoder to support a Keypoint Head (likely an MLP predicting K points per query).
  2. Matcher: Update the Hungarian Matcher to include Keypoint cost (using OKS - Object Keypoint Similarity).
  3. Loss Function: Implement the OKS Loss calculation for the training loop.
  4. Data Pipeline: Ensure the dataloader structure can accept COCO-Keypoint format annotations.
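To make steps 2 to 4 concrete, here is a minimal NumPy sketch of how OKS could be computed from COCO-Keypoint annotations and turned into a matching cost. It is an illustration under assumptions, not RF-DETR code: the function names (`parse_coco_keypoints`, `oks`, `oks_cost_matrix`) are hypothetical, using `1 - OKS` as both the matcher cost and the loss is one possible choice, and the sigmas are the standard COCO person-keypoint constants used by pycocotools.

```python
import numpy as np

# Per-keypoint sigmas for the 17 COCO person keypoints (as used by pycocotools).
COCO_SIGMAS = np.array(
    [.26, .25, .25, .35, .35, .79, .79, .72, .72,
     .62, .62, 1.07, 1.07, .87, .87, .89, .89]
) / 10.0


def parse_coco_keypoints(flat):
    """Reshape a flat COCO list [x1, y1, v1, x2, y2, v2, ...] into (K, 3)."""
    kpts = np.asarray(flat, dtype=np.float64)
    if kpts.size % 3 != 0:
        raise ValueError("COCO keypoints must come in (x, y, visibility) triplets")
    return kpts.reshape(-1, 3)


def oks(pred_xy, gt_kpts, area, sigmas, eps=1e-9):
    """Object Keypoint Similarity between one prediction and one ground truth.

    pred_xy: (K, 2) predicted coordinates
    gt_kpts: (K, 3) ground truth as (x, y, visibility)
    area:    ground-truth object area (the scale term s^2)
    """
    visible = gt_kpts[:, 2] > 0
    if not visible.any():
        return 0.0  # no labeled keypoints, so similarity is undefined; use 0
    d2 = ((pred_xy - gt_kpts[:, :2]) ** 2).sum(axis=1)
    k2 = (2.0 * sigmas) ** 2
    e = d2 / (2.0 * (area + eps) * k2)
    # Average the per-keypoint similarity over labeled keypoints only.
    return float(np.exp(-e)[visible].mean())


def oks_cost_matrix(pred_xy, gt_kpts, areas, sigmas):
    """Pairwise cost (1 - OKS) with shape (num_queries, num_targets).

    Assumption: this matrix would be added, with some weight, to the
    existing class and box costs before Hungarian assignment.
    """
    num_queries, num_targets = pred_xy.shape[0], gt_kpts.shape[0]
    cost = np.zeros((num_queries, num_targets))
    for q in range(num_queries):
        for t in range(num_targets):
            cost[q, t] = 1.0 - oks(pred_xy[q], gt_kpts[t], areas[t], sigmas)
    return cost
```

For matched pairs, the same `1 - OKS` quantity could serve as the training loss in step 3; a real implementation would compute it with differentiable tensor operations rather than NumPy.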

Use case

Keypoint support is on the developers' roadmap.

Additional

Questions for the Team

  • Do you prefer I mirror the implementation style of RT-DETR / YOLOv8-Pose for the head?
  • Should this live in a specific feature branch (e.g., feat/keypoints)?

I am happy to open a PR with the structural changes and unit tests verifying that the forward pass and loss calculation work on dummy data.
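As a rough idea of what the structural change and its dummy-data test might look like, here is a small PyTorch sketch. Everything in it is an assumption for illustration: `KeypointHead` is a hypothetical name, the layer sizes are placeholders, and the input is a random tensor standing in for decoder query embeddings rather than RF-DETR's actual decoder output.

```python
import torch
from torch import nn


class KeypointHead(nn.Module):
    """Hypothetical MLP head: K (x, y) pairs plus K visibility logits per query."""

    def __init__(self, hidden_dim=256, num_keypoints=17, num_layers=3):
        super().__init__()
        layers = []
        for _ in range(num_layers - 1):
            layers += [nn.Linear(hidden_dim, hidden_dim), nn.ReLU()]
        # Final layer emits (x, y, visibility) for every keypoint.
        layers.append(nn.Linear(hidden_dim, num_keypoints * 3))
        self.mlp = nn.Sequential(*layers)
        self.num_keypoints = num_keypoints

    def forward(self, hs):
        # hs: (batch, num_queries, hidden_dim) query embeddings
        out = self.mlp(hs).reshape(*hs.shape[:-1], self.num_keypoints, 3)
        coords = out[..., :2].sigmoid()  # normalized image coordinates in [0, 1]
        vis_logits = out[..., 2]         # raw visibility logits
        return coords, vis_logits


# Dummy-data check: shapes are as expected and gradients reach every parameter.
torch.manual_seed(0)
head = KeypointHead(hidden_dim=32, num_keypoints=17)
hs = torch.randn(2, 5, 32)  # 2 images, 5 queries each
coords, vis_logits = head(hs)
(coords.sum() + vis_logits.sum()).backward()
```

A unit test in the PR could assert that `coords` has shape `(2, 5, 17, 2)`, that `vis_logits` has shape `(2, 5, 17)`, and that no parameter gradient is `None` after the backward pass.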

Are you willing to submit a PR?

  • Yes, I'd like to help by submitting a PR!

Metadata

Labels: enhancement (New feature or request)