Add fallback async IO support.#163
Conversation
There was a problem hiding this comment.
Pull request overview
This PR implements fallback async I/O support for the hipFile library, providing an alternative implementation when hardware-accelerated async I/O is not available. The implementation uses CPU bounce buffers and GPU kernels to transfer data between device memory and files.
Changes:
- Added fallback async I/O implementation using host functions and GPU kernels for data transfer
- Implemented hipFileReadAsync and hipFileWriteAsync API functions
- Added comprehensive system and unit tests for async I/O operations
Reviewed changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 19 comments.
Show a summary per file
| File | Description |
|---|---|
| test/system/async.cpp | New comprehensive system tests for async I/O functionality |
| test/amd_detail/async.cpp | Added unit tests for async I/O operations and fallback implementation |
| test/amd_detail/stream.cpp | Updated tests to mock new hipStreamGetDevice calls |
| test/amd_detail/mstream.h | Added mock methods for device ID and locking |
| test/amd_detail/mhip.h | Added mock methods for kernel launch and device attributes |
| test/amd_detail/masyncmonitor.h | New mock for AsyncMonitor |
| test/CMakeLists.txt | Added async system test to build |
| src/amd_detail/util.h | New utility functions including size literals and variant helpers |
| src/amd_detail/stream.h/cpp | Extended stream to track device ID and provide mutex locking |
| src/amd_detail/io.h/cpp | Added parameter validation helper |
| src/amd_detail/hipfile.cpp | Implemented hipFileReadAsync and hipFileWriteAsync APIs |
| src/amd_detail/hip.h/cpp | Added HIP API wrappers for kernel launch and device queries |
| src/amd_detail/backend/memcpy-kernel.hip | GPU kernel for async memory copy operations |
| src/amd_detail/backend/memcpy-kernel.h | Kernel header |
| src/amd_detail/backend/fallback.h/cpp | Core async I/O fallback implementation |
| src/amd_detail/backend/asyncop-fallback.h/cpp | Added submitted_size tracking |
| src/amd_detail/async.h | Made AsyncMonitor methods virtual for testing |
| src/amd_detail/CMakeLists.txt | Added new source files to build |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
cd47d0d to
915826a
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 27 out of 27 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
915826a to
b568543
Compare
b568543 to
9033c41
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 27 out of 27 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
d2d0279 to
e54bee8
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 27 out of 27 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
e54bee8 to
fa04a29
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 27 out of 27 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Will be used to make sure that the size hasn't increased when async parameters are bound.
Lock will be taken while adding operations to the stream for each IO so that each IO is queued sequentially.
Will be used to verify GPU buffer belongs to same GPU as the stream being used.
Copies data between CPU bounce buffer and GPU buffer.
Each IO will have several functions enqueued into the stream.
1. Host function binding IO parameters (if any parameter not fixed).
If read:
2. Host function reading from file to CPU bounce buffer.
3. GPU function reading from CPU bounce buffer into GPU buffer.
If write:
2. GPU function writing from GPU buffer into CPU bounce buffer.
3. Host function writing from CPU bounce buffer to file.
4. Host function to cleanup completed IO data.
review: Move hipStreamGetDevice call to Stream constructor.
review: Fix error in paramsValid check and use it in fallback and fastpath io functions.
review: Remove blockIdx.x and document that we expect a single block.
review: Add EINTR handling. Remove impossible HIP error path.
review: Remove test for impossible HIP path.
review: Can use same buffer_offset and buffer length for too large offset test.
05fc459 to
b50d9b3
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 28 out of 28 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
This PR adds fallback async IO support to hipFile. IO going through this path uses a pinned CPU bounce buffer to copy data between the disk and GPU memory. Multiple operations will be queued on the stream in order to carry out the IO.
Read:
Write:
AIHIPFILE-67