-
Notifications
You must be signed in to change notification settings - Fork 1k
Preserve partial downloads on network failure #4695
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Preserve partial downloads on network failure #4695
Conversation
|
@FranciscoTGouveia Looks easy enough! I'll give it a test drive when I'm back with my machine :) |
rami3l
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many thanks for investigating into this!
OTOH I do think we need to be more careful about filtering the right kind of error since we do want to redownload if the cache is proven to be corrupted for example 🙏
c01dd03 to
235dabb
Compare
src/download/mod.rs
Outdated
| }; | ||
|
|
||
| let is_network_timeout = match err.downcast_ref::<DEK>() { | ||
| Some(DEK::Reqwest(e)) => e.is_timeout() || e.is_connect(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: How did you perform the test and/or is there a way to stably demonstrate the effectiveness of this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I performed it locally: during an installation, I dropped my internet connection, and after rustup toolchain install failed, I looked in .rustup/downloads for .partial files.
Where, before this patch, there were none, they were now left untouched.
To ensure this only happens when we are certain, I whitelisted the situations in which the removal should not be allowed (e.is_timeout() || e.is_connect()).
Would it be preferable to add a unit test for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@FranciscoTGouveia A test for this would be cool if that's fairly easy to do. That said, I'm not completely sure about the feasibility, so feel free to tell me if it turns out too difficult :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@FranciscoTGouveia I have one remaining concern with this change.
When trying to reproduce your scenario above on my machine:
info: rolling back changes
error: component download failed for cargo-aarch64-apple-darwin: partially downloaded file may have been damaged and was removed, please try again: [..] client error (Connect) [..]... the .partial is actually saved which is okay. The transaction system is rolling back changes which is also okay. However the partially downloaded file may have been damaged and was removed part might be misleading. Have you encountered something similar and, do you have an idea of fixing that, by adjusting the wording or by adding a flag to be passed over? Many thanks again! 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking of extracting the is_network_failure to an external function and using it to determine which error message to throw, implying that I will create a new error message.
However, I can also simply delete the ...and was removed part from the current BrokenPartialFile error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is IncompletePartialFile a good error name?
I used a match statement here because I think it is the most readable approach.
Please let me know if you think I should change it.
235dabb to
1b5abc3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM modulo a minor wording change and some nits, nice work :)
When you are done, just squash the first 3 commits into one (they include some back-and-forth) and we are good to go.
| #[error("partially downloaded file may have been damaged and was removed, please try again")] | ||
| BrokenPartialFile, | ||
| #[error( | ||
| "download was only partially completed; the partial file was kept so it can be resumed, please try again" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can make this even easier to read: "partially downloaded file was kept for resumption, please try again"
| let target_path = tmpdir.path().join("downloaded.partial"); | ||
| write_file(&target_path, "123"); | ||
|
|
||
| let listener = TcpListener::bind("127.0.0.1:1080").unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: What is the point of this listener here? You don't seem to be using it anyway?
If it's just for the address, maybe just construct it directly, and probably use an explicitly invalid address such as 240.0.0.0:1080?
- let listener = TcpListener::bind("127.0.0.1:1080").unwrap();
- let addr = listener.local_addr().unwrap();
- drop(listener);
- let from_url = format!("http://{addr}").parse().unwrap();
-
+ let from_url = format!("http://240.0.0.0:1080").parse().unwrap();
Closes #4448.
If
resume_from_partialis set to true, it short-circuits, ensuring theremove_file(path)is not called.