automate gpu offloading - part 1 #149170
The rustc-dev-guide subtree was changed. If this PR only touches the dev guide, consider submitting a PR directly to rust-lang/rustc-dev-guide; otherwise, thank you for updating the dev guide with your changes.
OS1.flush();
auto MB = llvm::MemoryBuffer::getMemBufferCopy(Storage, "module.bc");

SmallVector<char, 1024> BinaryData;
This buffer could be an argument provided by rustc, and then rustc could do the file writing.
The problem is that the C++ buffer is resizable. If we provide a buffer from Rust, it wouldn't be.
I asked, and there is no reasonable default size. So we'd pass in a (likely) too-small buffer; the C++ side would set the needed length and return false; Rust would see that, allocate a larger buffer of the requested size, call the method again, and hope that it now passes.
It's just 3 extra lines on the Rust side, and I don't expect it to become a compile-time bottleneck, since no one (famous last words) will compile >10k kernels, but it still feels ugly.
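For illustration, the two-call pattern described above could look roughly like the sketch below. It is not code from this PR: `fill_buffer` is a hypothetical pure-Rust stand-in for the C++ wrapper, and `PAYLOAD` and `package_with_retry` are made-up names, so that the retry logic can be run without LLVM.

```rust
// Made-up bytes standing in for the packaged offload image.
const PAYLOAD: &[u8] = b"pretend this is the packaged offload image";

/// Hypothetical stand-in for the C++ wrapper (not a real rustc or LLVM API).
/// Always reports the required length through `needed`, and returns `false`
/// without writing anything if the caller's buffer is too small.
fn fill_buffer(buf: &mut [u8], needed: &mut usize) -> bool {
    *needed = PAYLOAD.len();
    if buf.len() < PAYLOAD.len() {
        return false;
    }
    buf[..PAYLOAD.len()].copy_from_slice(PAYLOAD);
    true
}

/// Call once with a guessed capacity; if that fails, grow the buffer to the
/// reported size and call a second time.
fn package_with_retry(initial_capacity: usize) -> Option<Vec<u8>> {
    let mut buf = vec![0u8; initial_capacity];
    let mut needed = 0usize;
    if !fill_buffer(&mut buf, &mut needed) {
        // These are the "extra lines" on the Rust side: resize and retry.
        buf.resize(needed, 0);
        if !fill_buffer(&mut buf, &mut needed) {
            return None;
        }
    }
    // Drop any unused tail if the first guess was larger than needed.
    buf.truncate(needed);
    Some(buf)
}

fn main() {
    let image = package_with_retry(8).expect("second call should succeed");
    println!("{} bytes", image.len());
}
```

The cost is one wasted call whenever the first guess is too small, which matches the concern that this is ugly rather than slow.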
I've looked deeper into the next steps, and I think it's probably not worth cleaning up this write, since it will be fused with the next step, where we consume the in-memory host.out file.
I'll implement a save-temps equivalent later for debugging, where we'll still write it out, but then we can handle all intermediate writes at once.
This is similar to my offload frontend, which Marcello also just rewrote after we figured out what we actually need.
wip #149202
@bors r+ rollup
…r=oli-obk automate gpu offloading - part 1 Automates step 1 from the rustc-dev-guide offload section: https://rustc-dev-guide.rust-lang.org/offload/usage.html#compile-instructions `"clang-offload-packager" "-o" "host.out" "--image=file=device.bc,triple=amdgcn-amd-amdhsa,arch=gfx90a,kind=openmp"` Verified on an MI 250X cc `@jhuber6,` `@kevinsala,` `@jdoerfert,` `@Sa4dUs` r? oli-obk
Rollup of 8 pull requests
Successful merges:
- #147536 (Add `rust-mingw` component for `*-windows-gnullvm` hosts)
- #148407 (Warn against calls which mutate an interior mutable `const`-item)
- #149168 (Fix ICE when collecting opaques from trait method declarations)
- #149170 (automate gpu offloading - part 1)
- #149180 (Couple of refactors to SharedEmitter)
- #149185 (Handle cycles when checking impl candidates for `doc(hidden)`)
- #149194 (Move safe computation out of unsafe block)
- #149204 (Fix typo in HashMap performance comment)
r? `@ghost`
`@rustbot` modify labels: rollup
Rollup of 7 pull requests
Successful merges:
- #147536 (Add `rust-mingw` component for `*-windows-gnullvm` hosts)
- #148407 (Warn against calls which mutate an interior mutable `const`-item)
- #149168 (Fix ICE when collecting opaques from trait method declarations)
- #149170 (automate gpu offloading - part 1)
- #149185 (Handle cycles when checking impl candidates for `doc(hidden)`)
- #149194 (Move safe computation out of unsafe block)
- #149204 (Fix typo in HashMap performance comment)
r? `@ghost`
`@rustbot` modify labels: rollup
@bors2 try jobs=dist-ohos-aarch64
automate gpu offloading - part 1 try-job: dist-ohos-aarch64
⌛ Trying commit 88ca3bc with merge a278658… To cancel the try build, run the command
Workflow: https://github.com/rust-lang/rust/actions/runs/19599008472
Automates step 1 from the rustc-dev-guide offload section:
https://rustc-dev-guide.rust-lang.org/offload/usage.html#compile-instructions
"clang-offload-packager" "-o" "host.out" "--image=file=device.bc,triple=amdgcn-amd-amdhsa,arch=gfx90a,kind=openmp"
Verified on an MI 250X
cc @jhuber6, @kevinsala, @jdoerfert, @Sa4dUs
r? oli-obk
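
For readers unfamiliar with the manual step being replaced, here is a minimal sketch of how that packager invocation could be assembled from Rust with `std::process::Command`. It only builds the command and does not run it, and it is not how this PR does the packaging (the review thread shows the work happening on the C++ side). The helper names `image_spec` and `packager_command` are hypothetical.

```rust
use std::process::Command;

/// Hypothetical helper: formats the `--image=` argument from the dev-guide command.
fn image_spec(file: &str, triple: &str, arch: &str, kind: &str) -> String {
    format!("--image=file={file},triple={triple},arch={arch},kind={kind}")
}

/// Hypothetical helper: builds (but does not spawn) the packager command.
fn packager_command(output: &str, image: &str) -> Command {
    let mut cmd = Command::new("clang-offload-packager");
    cmd.arg("-o").arg(output).arg(image);
    cmd
}

fn main() {
    // Values taken verbatim from the command quoted in the PR description.
    let image = image_spec("device.bc", "amdgcn-amd-amdhsa", "gfx90a", "openmp");
    let cmd = packager_command("host.out", &image);
    // Print the assembled command instead of executing it, since
    // clang-offload-packager may not be installed on this machine.
    println!("{:?}", cmd);
}
```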