[ENH] V1 -> V2 Migration : Runs#1616
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1616 +/- ##
===========================================
- Coverage 81.45% 55.23% -26.23%
===========================================
Files 63 63
Lines 5124 5169 +45
===========================================
- Hits 4174 2855 -1319
- Misses 950 2314 +1364 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…into runs-migration-stacked
geetu040
left a comment
There was a problem hiding this comment.
sync with base pr
sdk code look good so far, please take a look at #1575 (comment) and make changes accordingly where needed.
all tests (existing and new) should pass to make sure we are retaining the original functionality of the sdk
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
…into runs-migration-stacked
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
…-11/openml-python into runs-migration-stacked
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
| use_cache = not ignore_cache | ||
| reset_cache = ignore_cache | ||
| return api_context.backend.runs.get( | ||
| run_id, | ||
| use_cache=use_cache, | ||
| reset_cache=reset_cache, | ||
| ) |
There was a problem hiding this comment.
use_cache should be true since the method always supports caching
reset_cache should rely on ignore_cache
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
|
Hi @geetu040 !! Can you check the replies to reviews and the failing tests and let me know if any changes needed ? |
| def test_run_v1_get(run_v1, with_test_cache): | ||
| try: | ||
| run = run_v1.get(run_id=1) | ||
| except OpenMLServerException as e: | ||
| if e.code == 236 or "Run not found" in str(e): | ||
| run = run_v1.get(run_id=25) | ||
| else: | ||
| raise | ||
| _assert_run_shape(run) | ||
|
|
There was a problem hiding this comment.
why simply run_v1.get(run_id=1) doesn't work?
There was a problem hiding this comment.
Because Run Id of 1 is not present in test server and run Id of 25 is not present in local server . It is bit weird I tried Finding common ID Manually but couldn't find it .
There was a problem hiding this comment.
Okay, I remember it now, I mentioned this to @PGijsbers on slack here.
There was a problem hiding this comment.
Didn't we have a way to check whether a local or non-local server configured is being used?
Then I would prefer to use that e.g.,
run_id = 25 if openml config is local else 1
That embeds this knowledge into the code so it's clear for future maintainers.
We probably do not have the time to address this on our end for a while longer :(
There was a problem hiding this comment.
we could actually use the env var OPENML_USE_LOCAL_SERVICES, that should be fine
geetu040
left a comment
There was a problem hiding this comment.
Nicely done @Omswastik-11.
@PGijsbers, could you please review/merge this PR when you get a chance?
There is currently one issue caused by differences between the test-server and local-server database entities, which is temporarily patched here: #1616 (comment).
I had mentioned this earlier on Slack as well here, we can continue discussion there
PGijsbers
left a comment
There was a problem hiding this comment.
small changes or clarifications requested, please see comments.
| path_parts = parsed_url.path.strip("/").split("/") | ||
|
|
||
| filtered_params = {k: v for k, v in params.items() if k != "api_key"} | ||
| params_part = [urlencode(filtered_params)] if filtered_params else [] |
There was a problem hiding this comment.
This is a good remark, but seeing as this code isn't touched by this PR, I would advocate fixing this in a separate PR.
| if response.content.startswith(b"PK\x03\x04"): | ||
| return "body.zip" | ||
|
|
||
| try: | ||
| arff.loads(response.text) | ||
| return "body.arff" | ||
| except arff.ArffException: | ||
| pass |
There was a problem hiding this comment.
Is there no HTTP header data that would allow us to tell what the content (and file name) should be?
Otherwise, at least for ARFF, the spec states that the first non-comment line of the file should be (not case sensitive): @relation <relation name>. So we could look for that instead of parsing the entire file content.
There was a problem hiding this comment.
Is there no HTTP header data that would allow us to tell what the content (and file name) should be?
I tried but didn't find anything
Otherwise, at least for ARFF, the spec states that the first non-comment line of the file should be (not case sensitive):
@relation <relation name>.
sounds good I could give this a try
| OpenMLHashException | ||
| If checksum verification fails. | ||
| """ | ||
| url = urljoin(self.server, path) |
There was a problem hiding this comment.
ignore: If this isn't the case already, this should be normalized when openml.config.server is set, not each site which uses it.
| if use_api_key: | ||
| params["api_key"] = self.api_key | ||
|
|
||
| if method.upper() in {"POST", "PUT", "PATCH"}: | ||
| data = {**params, **data} |
There was a problem hiding this comment.
ignore: It raises an exception if api_key is None, it's the statement preceding this line..
| self, | ||
| limit: int, | ||
| offset: int, | ||
| *, | ||
| ids: builtins.list[int] | None = None, |
There was a problem hiding this comment.
please address or explain; i see you have dismissed previous comments about this so presumably there is a reason?
|
|
||
| # Fall back to generic oml:id (used by other resources) | ||
| if "oml:id" in root_value: | ||
| return int(root_value["oml:id"]) |
There was a problem hiding this comment.
If run responses always return oml:run_id, when do we expect this code path to be correct to run?
There was a problem hiding this comment.
@Omswastik-11 since this method is overriden for runs, we shouldn't expect to handle other resources here, therefore logically this path should be unreachable as Pieter has said
There was a problem hiding this comment.
Yeah got it. I removed it.
| def test_run_v1_get(run_v1, with_test_cache): | ||
| try: | ||
| run = run_v1.get(run_id=1) | ||
| except OpenMLServerException as e: | ||
| if e.code == 236 or "Run not found" in str(e): | ||
| run = run_v1.get(run_id=25) | ||
| else: | ||
| raise | ||
| _assert_run_shape(run) | ||
|
|
There was a problem hiding this comment.
Didn't we have a way to check whether a local or non-local server configured is being used?
Then I would prefer to use that e.g.,
run_id = 25 if openml config is local else 1
That embeds this knowledge into the code so it's clear for future maintainers.
We probably do not have the time to address this on our end for a while longer :(
|
@Omswastik-11 could you go through the above comments, we'd need to close these discussions. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 9 changed files in this pull request and generated 4 comments.
Comments suppressed due to low confidence (1)
openml/runs/run.py:373
- The
ObjectNotPublishedErrormessage here diverges from the established tagging error message used byOpenMLBase.remove_tag(viaopenml.utils._tag_openml_base), and it also drops the object context. Consider reusing the same wording/format for consistency across entity types.
if self.run_id is None:
raise openml.exceptions.ObjectNotPublishedError(
"Cannot untag a run that has not been published yet."
" Please publish the run first before being able to untag it.",
)
Co-authored-by: Pieter Gijsbers <p.gijsbers@tue.nl>
…-11/openml-python into runs-migration-stacked
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
| if len(candidates) > 1: | ||
| raise FileNotFoundError( | ||
| f"Multiple body files found in path: {path} ({[p.name for p in candidates]})" | ||
| ) | ||
|
|
||
| return candidates[0].name | ||
|
|
| def test_run_v1_get(run_v1, with_test_cache): | ||
| import os | ||
|
|
||
| # Run 1 exists on the remote test server; the local docker server only seeds run 25. | ||
| run_id = 25 if os.getenv("OPENML_USE_LOCAL_SERVICES") == "true" else 1 | ||
| run = run_v1.get(run_id=run_id) | ||
| _assert_run_shape(run) |
Metadata
Details
fixes #1624