Merge OpenAI Triton commit 99b5e29
#4219
Merged
When `TRITON_PRINT_AUTOTUNING=1`, we expect `self.bench_time` to be populated if we did not use a cached result for the benchmarking results. There was a codepath that used cached results from the disk but did not update the flag saying we used cached results, leading to a crash when `self.bench_time` was unset.

# New contributor declaration
- [x] I am not making a trivial change, such as fixing a typo in a comment.
- [x] I have written a PR description following these [rules](https://cbea.ms/git-commit/#why-not-how).
- [x] I have run `pre-commit run --from-ref origin/main --to-ref HEAD`.
- Select one of the following.
  - [ ] I have added tests.
    - `/test` for `lit` tests
    - `/unittest` for C++ tests
    - `/python/test` for end-to-end tests
  - [x] This PR does not need a test because it should be handled by existing tests.
- Select one of the following.
  - [x] I have not added any `lit` tests.
  - [ ] The `lit` tests I have added follow these [best practices](https://mlir.llvm.org/getting_started/TestingGuide/#filecheck-best-practices), including the "tests should be minimal" section. (Usually running Python code and using the instructions it generates is not minimal.)
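The failure mode described in this commit message can be sketched in isolation. The following is a minimal toy model, not Triton's actual autotuner code: the class `ToyAutotuner`, its `run` method, the `print_autotuning` argument, and the dictionary-backed caches are hypothetical names introduced for illustration. Only `bench_time` and the idea of a "used cached result" flag come from the description above.

```python
import time


class ToyAutotuner:
    """Hypothetical stand-in for an autotuner; not Triton's implementation."""

    def __init__(self, disk_cache=None):
        self.disk_cache = disk_cache if disk_cache is not None else {}
        self.memory_cache = {}
        # Stays None until a real benchmark runs (models the "unset" state).
        self.bench_time = None

    def _benchmark(self, key):
        # Pretend to benchmark and record how long it took.
        start = time.perf_counter()
        best = f"config-for-{key}"
        self.bench_time = time.perf_counter() - start
        return best

    def run(self, key, print_autotuning=False):
        used_cached_result = True
        if key not in self.memory_cache:
            if key in self.disk_cache:
                # Disk hit: no benchmark ran, so the flag must remain True.
                # Marking this path as "not cached" would reproduce the crash,
                # because bench_time would be formatted while still None.
                self.memory_cache[key] = self.disk_cache[key]
            else:
                used_cached_result = False
                self.memory_cache[key] = self._benchmark(key)
                self.disk_cache[key] = self.memory_cache[key]

        message = None
        if print_autotuning and not used_cached_result:
            message = f"autotuning for {key} finished after {self.bench_time:.2f}s"
        return self.memory_cache[key], message
```

With a pre-populated disk cache, `ToyAutotuner(disk_cache={"k": "cfg"}).run("k", print_autotuning=True)` returns the cached configuration and no timing message, since no benchmark was run.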
The newly-added autotune cache needs to bail out if given an InterpretedFunction (it has no cache key, and autotuning the interpreter is a little meaningless)
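As a rough illustration of the bail-out described above, the sketch below skips the cache entirely when the function is interpreted. It assumes a simplified setup: `InterpretedFunction` here is a toy stand-in class defined locally, and `CompiledFunction`, `cache_key`, and `tune_with_disk_cache` are hypothetical names that do not reflect Triton's real interfaces.

```python
class InterpretedFunction:
    """Toy stand-in for an interpreted kernel; it exposes no cache key."""

    def __init__(self, fn):
        self.fn = fn


class CompiledFunction:
    """Hypothetical compiled kernel that carries a cache key."""

    def __init__(self, fn, cache_key):
        self.fn = fn
        self.cache_key = cache_key


def tune_with_disk_cache(fn, disk_cache, bench):
    """Return (result, used_cached_result) for the given function."""
    if isinstance(fn, InterpretedFunction):
        # Bail out: there is no cache key to look up or store under,
        # so run the benchmark directly and leave the cache untouched.
        return bench(), False

    key = fn.cache_key
    if key in disk_cache:
        return disk_cache[key], True

    result = bench()
    disk_cache[key] = result
    return result, False
```

The early return keeps the interpreted path away from any code that reads `cache_key`, which is the attribute this toy `InterpretedFunction` lacks.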
* verify that WarpGroupDotOp's result encoding is always NVMMA Hopper encoding
* clean up some code with this
* teach FenceInsertion to look through WarpSpecializeOp
* deduplicate fences (e.g. two dots in a loop with captured reg->shared operands)
…(#6753)

This implements a pass for converting TMA loads/stores into legacy loads/stores. This is required for supporting tensor descriptors on hardware that doesn't directly support tensor descriptors.

This does not implement:
* Host side tensor descriptors - I'll submit this in a follow-up PR.
* Descriptor reduction operations.
* Interop for unsupported tensor descriptors on devices which support tensor descriptors.

This updates the (old) CUDA and HIP lowering to use this new pass. Lit tests have been added for the pass, and the CUDA tensor descriptor tests that work on hardware have been moved to the language folder since they are now supported on other hardware.

The HIP lowering is untested as I don't have access to an AMD card. I have tested the CUDA lowering on an A100 machine.
anmyachev approved these changes May 15, 2025
dd293da to 445e97a
Signed-off-by: Whitney Tsang <[email protected]>
445e97a to 7fa8493
This PR changes the Triton base from 86e7117 to 99b5e29 (May 13).
Pass rate: 97.77%->97.25% (#4221, #4222)
Please do not squash and merge this PR.