Description
We need nexus package/repo to support the following:
- Registering benchmark experiment packages
- Registering benchmarks on a per-model basis
Registering means (a) putting information in the correct location, (b) validating that information, and (c) merging to main.
See the benchmark system requirements for terminology.
Motivation
These requirements are set out in the benchmarking requirements.
Proposed Solution
See [benchmark system design] for more details.
- Implement a mechanism to register benchmark packages with nexus
  a. The nexus.yml has a section listing the benchmark packages it requires (the packages that provide the experiments). This is a list of package names (PyPI), GitHub repo URLs, or relative file paths (Python packages in the repo); OR this is just a file called requirements_text.txt
  b. By some mechanism, check that the listed benchmark packages can be installed together
- Implement a mechanism to register benchmarks with nexus
  a. Extend the nexus model dir spec to include a sub-dir for benchmarks
     - Each benchmark is in its own dir and, at minimum, contains an ado space.yaml
     - Optional: Extend nexus model.yaml with the names/locations of these benchmarks
  b. Extend the nexus CLI so it can validate the package structure
  c. By some mechanism, check that each benchmark's space.yaml is valid
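For illustration, a repo layout satisfying the points above might look like the following sketch (directory names and nesting are assumptions reconciling 2a with the `packages/NAME/` paths used later in this issue, not a finalized spec):

```
nexus-repo/
├── nexus.yml                      # or requirements_text.txt: benchmark packages (1a)
└── packages/
    └── NAME/                      # a nexus model dir
        ├── model.yaml             # optional: names/locations of its benchmarks
        ├── requirements_test.txt  # requirements checked for co-installability (1b)
        └── benchmarks/
            ├── benchmark-a/
            │   └── space.yaml     # minimum: each benchmark dir holds an ado space.yaml
            └── benchmark-b/
                └── space.yaml
```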
The mechanisms for 1b and 2c must be related, as 1b is required for 2c.
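For 1a, the nexus.yml section might look like the following sketch (the key name `benchmark-packages` and all entries are illustrative assumptions; only the three entry forms come from this issue):

```yaml
# nexus.yml (fragment) -- key name and entries are hypothetical
benchmark-packages:
  - some-benchmarks-on-pypi                     # package name (PyPI)
  - https://github.com/example-org/benchmarks   # GitHub repo URL
  - ./packages/local-benchmarks                 # relative path to a package in this repo
```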
Simple mechanism:
- `nexus validate package NAME`
  - `uv pip install -r packages/NAME/requirements_test.txt` -> if this cannot be installed, benchmark package registration fails AND benchmark registration is not possible
  - for each space.yaml under packages/NAME/
    - `ado create -f space space.yaml --dry-run` -> if any validation fails, the benchmark cannot be registered
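The simple mechanism above could be sketched as a helper that enumerates the commands a hypothetical `nexus validate package NAME` implementation would run (the function name and layout are assumptions; the commands mirror the ones in this issue):

```python
import pathlib


def validation_commands(package_dir: pathlib.Path) -> list[list[str]]:
    """Build the commands a hypothetical `nexus validate package` would run.

    First install the package's requirements; if that fails, registration
    fails. Then dry-run-validate every space.yaml under the package dir.
    """
    commands = [
        ["uv", "pip", "install", "-r", str(package_dir / "requirements_test.txt")],
    ]
    for space in sorted(package_dir.rglob("space.yaml")):
        commands.append(["ado", "create", "-f", "space", str(space), "--dry-run"])
    return commands
```

A real implementation would execute these with `subprocess` and stop at the first failure, since an uninstallable package makes the space validation step meaningless.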
Additional Context
A key point is that ado provides a mechanism for reading benchmark packages and finding experiments. However, this relies on Python entry points (how ado discovers the packages), which in turn relies on the package being installed. There is no external list of experiments that can be read, as the list can be dynamically generated from experiment decorators or by an actuator.
The issue is package installation: to list all experiments in all nexus packages, they must all be installed, and hence must not have dependency conflicts.
Similarly, validating a space that uses a particular experiment requires that experiment to be installed.