[ENH] V1 -> V2 Migration : Runs#1616

Open
Omswastik-11 wants to merge 315 commits into openml:main from Omswastik-11:runs-migration-stacked

Conversation

Contributor

@Omswastik-11 Omswastik-11 commented Jan 15, 2026

Metadata

  • Reference Issue:
  • New Tests Added:
  • Documentation Updated:
  • Change Log Entry:

Details

fixes #1624

@geetu040 geetu040 mentioned this pull request Jan 15, 2026
18 tasks

codecov-commenter commented Jan 15, 2026

Codecov Report

❌ Patch coverage is 59.01639% with 50 lines in your changes missing coverage. Please review.
✅ Project coverage is 55.23%. Comparing base (1f6fed4) to head (f2363ed).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| openml/_api/resources/run.py | 77.21% | 18 Missing ⚠️ |
| openml/_api/clients/http.py | 19.04% | 17 Missing ⚠️ |
| openml/runs/run.py | 31.57% | 13 Missing ⚠️ |
| openml/runs/functions.py | 33.33% | 2 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #1616       +/-   ##
===========================================
- Coverage   81.45%   55.23%   -26.23%     
===========================================
  Files          63       63               
  Lines        5124     5169       +45     
===========================================
- Hits         4174     2855     -1319     
- Misses        950     2314     +1364     

Collaborator

@geetu040 geetu040 left a comment

Please sync with the base PR.
The SDK code looks good so far. Please take a look at #1575 (comment) and make changes accordingly where needed.
All tests (existing and new) should pass, so that we know the SDK's original functionality is retained.

Comment thread openml/_api/resources/runs.py Outdated
Comment thread openml/_api/resources/runs.py Outdated
Comment thread openml/_api/resources/runs.py Outdated
Comment thread openml/runs/functions.py Outdated
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
@Omswastik-11 Omswastik-11 requested a review from geetu040 January 30, 2026 09:50
@Omswastik-11 Omswastik-11 marked this pull request as ready for review January 30, 2026 09:50
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
Comment thread openml/runs/functions.py Outdated
Comment on lines +822 to +828
    use_cache = not ignore_cache
    reset_cache = ignore_cache
    return api_context.backend.runs.get(
        run_id,
        use_cache=use_cache,
        reset_cache=reset_cache,
    )
Collaborator

use_cache should always be True, since the method always supports caching.
reset_cache should follow ignore_cache.
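
A minimal sketch of the flag mapping suggested in this comment, assuming `runs.get` accepts `use_cache` and `reset_cache` keyword arguments as in the snippet under review. `FakeRunsBackend` and `get_run` are hypothetical stand-ins used only to make the example runnable; they are not the SDK's actual classes.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRunsBackend:
    """Hypothetical stand-in that only records the flags it receives."""

    calls: list = field(default_factory=list)

    def get(self, run_id: int, *, use_cache: bool, reset_cache: bool) -> dict:
        self.calls.append((run_id, use_cache, reset_cache))
        return {"run_id": run_id}


def get_run(backend: FakeRunsBackend, run_id: int, ignore_cache: bool = False) -> dict:
    # Caching is always supported, so use_cache stays True.
    # ignore_cache only decides whether the cached entry is refreshed.
    return backend.get(run_id, use_cache=True, reset_cache=ignore_cache)
```

With this mapping, `ignore_cache=True` still writes a fresh entry to the cache instead of bypassing the cache entirely.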

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
Contributor Author

Omswastik-11 commented May 2, 2026

Hi @geetu040! Could you check the replies to the reviews and the failing tests, and let me know if any changes are needed?

Comment thread openml/_api/resources/run.py Outdated
Comment on lines +32 to +41
def test_run_v1_get(run_v1, with_test_cache):
    try:
        run = run_v1.get(run_id=1)
    except OpenMLServerException as e:
        if e.code == 236 or "Run not found" in str(e):
            run = run_v1.get(run_id=25)
        else:
            raise
    _assert_run_shape(run)

Collaborator

Why doesn't a plain run_v1.get(run_id=1) work?

Contributor Author

@Omswastik-11 Omswastik-11 May 11, 2026

Because run ID 1 is not present on the test server and run ID 25 is not present on the local server. It is a bit odd: I tried to find a common ID manually but couldn't.

Collaborator

Okay, I remember now. I mentioned this to @PGijsbers on Slack here.

Collaborator

Didn't we have a way to check whether a local or non-local server configured is being used?
Then I would prefer to use that e.g.,

run_id = 25 if openml config is local else 1

That embeds this knowledge into the code so it's clear for future maintainers.
We probably do not have the time to address this on our end for a while longer :(

Collaborator

We could actually use the env var OPENML_USE_LOCAL_SERVICES; that should be fine.
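
A minimal sketch of selecting the run ID from the environment, as proposed in this thread. It assumes that `OPENML_USE_LOCAL_SERVICES` is set to the string `"true"` when the local server is in use; the exact value convention is an assumption. The helper name `default_test_run_id` and the two constants are hypothetical, and the IDs follow the reviewer's suggestion above (25 for local, 1 otherwise).

```python
import os
from typing import Mapping, Optional

# IDs follow the reviewer's suggestion in this thread.
LOCAL_RUN_ID = 25
REMOTE_RUN_ID = 1


def default_test_run_id(environ: Optional[Mapping[str, str]] = None) -> int:
    """Pick a run ID that is expected to exist on the configured server."""
    env = os.environ if environ is None else environ
    # Assumption: the variable holds "true" (any casing) for local services.
    use_local = env.get("OPENML_USE_LOCAL_SERVICES", "").strip().lower() == "true"
    return LOCAL_RUN_ID if use_local else REMOTE_RUN_ID
```

Passing the environment as an argument keeps the helper testable without mutating `os.environ`.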

Copilot AI review requested due to automatic review settings May 11, 2026 12:35
Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 9 changed files in this pull request and generated 5 comments.

Comment thread openml/_api/clients/http.py
Comment thread openml/_api/clients/http.py
Comment thread openml/_api/resources/run.py Outdated
Comment thread openml/_api/resources/base/resources.py
Comment thread tests/test_api/test_run.py
@Omswastik-11 Omswastik-11 requested a review from geetu040 May 11, 2026 12:47
Collaborator

@geetu040 geetu040 left a comment

Nicely done @Omswastik-11.

@PGijsbers, could you please review/merge this PR when you get a chance?

There is currently one issue caused by differences between the test-server and local-server database entities, which is temporarily patched here: #1616 (comment).

I mentioned this earlier on Slack as well (here); we can continue the discussion there.

Collaborator

@PGijsbers PGijsbers left a comment

Small changes or clarifications requested; please see the comments.

path_parts = parsed_url.path.strip("/").split("/")

filtered_params = {k: v for k, v in params.items() if k != "api_key"}
params_part = [urlencode(filtered_params)] if filtered_params else []
Collaborator

This is a good remark, but seeing as this code isn't touched by this PR, I would advocate fixing this in a separate PR.

Comment on lines +102 to +109
if response.content.startswith(b"PK\x03\x04"):
    return "body.zip"

try:
    arff.loads(response.text)
    return "body.arff"
except arff.ArffException:
    pass
Collaborator

Is there no HTTP header data that would allow us to tell what the content (and file name) should be?
Otherwise, at least for ARFF, the spec states that the first non-comment line of the file should be (not case sensitive): @relation <relation name>. So we could look for that instead of parsing the entire file content.

Collaborator

> Is there no HTTP header data that would allow us to tell what the content (and file name) should be?

I tried, but didn't find anything.

> Otherwise, at least for ARFF, the spec states that the first non-comment line of the file should be (not case sensitive): @relation <relation name>.

Sounds good, I can give this a try.
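
The `@relation` check discussed in this thread could be sketched roughly as follows. This is an illustration under stated assumptions, not the code merged in the PR: the function name `looks_like_arff` is hypothetical, and it relies on the ARFF conventions that comment lines start with `%` and that the first non-comment line is a case-insensitive `@relation <name>` declaration.

```python
def looks_like_arff(text: str) -> bool:
    """Cheap ARFF sniffing: inspect only the first meaningful line."""
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and ARFF comments, which start with '%'.
        if not stripped or stripped.startswith("%"):
            continue
        # The first non-comment line must be "@relation <name>".
        parts = stripped.split(None, 1)
        return len(parts) == 2 and parts[0].lower() == "@relation"
    # Empty input, or nothing but comments and blank lines.
    return False
```

Compared with `arff.loads`, this stops at the first meaningful line, so large files are not parsed just to choose a file name. The trade-off is that a malformed ARFF body with a valid header would still be classified as ARFF.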

OpenMLHashException
If checksum verification fails.
"""
url = urljoin(self.server, path)
Collaborator

ignore: If this isn't the case already, the server URL should be normalized once, when openml.config.server is set, rather than at each site that uses it.
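
To illustrate why normalization matters for `urljoin`, here is a small sketch that normalizes the server URL once, in a property setter. The `Config` class is hypothetical and is not the SDK's configuration object, and `https://example.org/...` is a placeholder URL.

```python
from urllib.parse import urljoin


class Config:
    """Hypothetical config holder that normalizes the server URL on assignment."""

    def __init__(self, server: str) -> None:
        self.server = server  # routed through the setter below

    @property
    def server(self) -> str:
        return self._server

    @server.setter
    def server(self, value: str) -> None:
        # Keep exactly one trailing slash, so urljoin appends to the path
        # instead of replacing its last segment.
        self._server = value.rstrip("/") + "/"


cfg = Config("https://example.org/api/v1/xml")
joined = urljoin(cfg.server, "run/1")
# Without the trailing slash, urljoin drops the last path segment ("xml").
unnormalized = urljoin("https://example.org/api/v1/xml", "run/1")
```

Call sites can then use `urljoin(cfg.server, path)` directly, provided `path` has no leading slash; a leading slash would make `urljoin` discard the server's path entirely.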

Comment thread openml/_api/clients/http.py Outdated
Comment on lines +598 to +602
if use_api_key:
    params["api_key"] = self.api_key

if method.upper() in {"POST", "PUT", "PATCH"}:
    data = {**params, **data}
Collaborator

ignore: An exception is raised if api_key is None; that is what the statement preceding this line does.

Comment on lines +102 to +106
self,
limit: int,
offset: int,
*,
ids: builtins.list[int] | None = None,
Collaborator

Please address or explain. I see you have dismissed previous comments about this, so presumably there is a reason?

Comment thread openml/_api/resources/run.py Outdated

# Fall back to generic oml:id (used by other resources)
if "oml:id" in root_value:
    return int(root_value["oml:id"])
Collaborator

If run responses always return oml:run_id, when do we expect this code path to be correct to run?

Collaborator

@Omswastik-11 since this method is overridden for runs, we shouldn't expect to handle other resources here. Logically, this path should therefore be unreachable, as Pieter has said.

Contributor Author

Yeah got it. I removed it.

@geetu040
Collaborator

@Omswastik-11 could you go through the comments above? We need to close these discussions.

Copilot AI review requested due to automatic review settings May 22, 2026 12:58
Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 9 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (1)

openml/runs/run.py:373

  • The ObjectNotPublishedError message here diverges from the established tagging error message used by OpenMLBase.remove_tag (via openml.utils._tag_openml_base), and it also drops the object context. Consider reusing the same wording/format for consistency across entity types.
        if self.run_id is None:
            raise openml.exceptions.ObjectNotPublishedError(
                "Cannot untag a run that has not been published yet."
                " Please publish the run first before being able to untag it.",
            )

Comment thread openml/_api/resources/base/resources.py
Comment thread tests/test_api/test_run.py Outdated
Comment thread openml/runs/run.py
Comment thread openml/_api/clients/http.py
Co-authored-by: Pieter Gijsbers <p.gijsbers@tue.nl>
Copilot AI review requested due to automatic review settings May 22, 2026 13:06
Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 9 changed files in this pull request and generated 5 comments.

Comment thread openml/_api/resources/base/resources.py
Comment thread openml/_api/resources/run.py Outdated
Comment thread openml/_api/clients/http.py
Comment thread openml/_api/clients/http.py
Comment thread tests/test_api/test_run.py
Copilot AI review requested due to automatic review settings May 22, 2026 13:32
@Omswastik-11 Omswastik-11 requested a review from PGijsbers May 22, 2026 13:33
Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 9 changed files in this pull request and generated 2 comments.

Comment on lines +119 to 125
if len(candidates) > 1:
    raise FileNotFoundError(
        f"Multiple body files found in path: {path} ({[p.name for p in candidates]})"
    )

return candidates[0].name

Comment on lines +32 to +38
def test_run_v1_get(run_v1, with_test_cache):
    import os

    # Run 1 exists on the remote test server; the local docker server only seeds run 25.
    run_id = 25 if os.getenv("OPENML_USE_LOCAL_SERVICES") == "true" else 1
    run = run_v1.get(run_id=run_id)
    _assert_run_shape(run)
Development

Successfully merging this pull request may close these issues.

[ENH] V1 → V2 API Migration - runs

9 participants