[executorch][nvidia][tensorrt][25/n] Share CUDA stream across TRT delegates #17936
shoumikhin wants to merge 1 commit into gh/shoumikhin/50/base from …
Dr. CI status as of commit 8938dec with merge base 01d21fa: 51 new failures, 7 unrelated failures. Artifacts and rendered test results: hud.pytorch.org/pr/pytorch/executorch/17936. For failures that were already present on the merge base, rebase onto the `viable/strict` branch.
ghstack-source-id: 348044020
Pull Request resolved: #17936
Share a single CUDA stream across all TensorRT delegate instances instead of creating a per-delegate stream. This improves performance for serialized execution (the common case) by eliminating synchronization overhead between subgraphs.
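The sharing pattern described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: `FakeStream`, `shared_stream()`, and `DelegateSketch` are hypothetical names, and `FakeStream` stands in for a real `cudaStream_t` (which would be created with `cudaStreamCreate`) so that the sketch compiles without the CUDA toolkit. The idea is that work issued to a single stream runs in issue order, so back-to-back subgraphs need no cross-stream synchronization.

```cpp
#include <cassert>

// Stand-in for cudaStream_t (assumption: lets the sketch build without CUDA).
struct FakeStream {
  int id;
};

// Counts how many streams have been created; for illustration only.
inline int& stream_create_count() {
  static int count = 0;
  return count;
}

// Hypothetical accessor: creates one stream on first use and returns the
// same one to every caller. Function-local statics are initialized exactly
// once, and that initialization is thread-safe since C++11.
inline FakeStream* shared_stream() {
  static FakeStream stream{++stream_create_count()};
  return &stream;
}

// Hypothetical delegate: borrows the shared stream instead of owning one.
class DelegateSketch {
 public:
  DelegateSketch() : stream_(shared_stream()) {
    assert(stream_ != nullptr);
  }

  FakeStream* stream() const { return stream_; }

 private:
  FakeStream* stream_;  // Not owned; never destroyed by the delegate.
};

// Returns true when two delegates end up on the same, single stream.
inline bool delegates_share_stream() {
  DelegateSketch first;
  DelegateSketch second;
  return first.stream() == second.stream() && stream_create_count() == 1;
}
```

In this sketch, constructing any number of `DelegateSketch` objects creates the stream once. One consequence of the design is that delegates must not destroy the stream themselves, since other instances may still be using it.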
Internal:
Addresses feedback from D93275039.
Differential Revision: D93778115