-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[triton][tool] Add support for printing shared memory layouts in the triton-tensor-layout
tool
#4839
Conversation
…layout` CLI tool. Add several tests for shmem printing based on previously committed tests + new ones.
Looks nice! |
@@ -152,6 +152,12 @@ void dumpHWLayout(RankedTensorType tensorType); | |||
// Return a string representation of the layout of the tensor. | |||
std::string getLayoutStr(RankedTensorType tensorType, bool useHWPointOfView); | |||
|
|||
// Helper functions for getLayoutStr | |||
std::string getSharedLayoutStr(RankedTensorType tensorType, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose to expose these two lower level APIs here since the getLayoutStr
would dispatch to one of them. Isn't getLayoutStr
enough to capture the full functionality of them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're right; we probably don't need to export these
fixed
auto sharedLayout = shared(/* vec */ 1, /* perPhase */ 1, /* maxPhase */ 4, | ||
/* hasLeadingOffset */ false, | ||
/* CTALayout, single CTA layout */ | ||
{1}, /* cpg */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: this comment is a little confusing since you mixed two styles: for the first half, you put comment before the argument; and for the second half you put it after the argument. Better to make them consistent.
} | ||
} | ||
} else { | ||
// For the HW view here, print the (block, offset) --> (r,c) mapping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a deep understanding of linear layout, but from my superficial understanding, it can represent registers -> tensor element, or shared memory -> tensor element. This seems only print the layout from a shared memory perspective? How about the register one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is only executed in the getSharedLayoutStr function, so it can only represent shared memory in this case; the registers perspective is at line 3353
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, just see your reply. Got it, thanks! The register view is for the distributed encoding thing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a nice utility that extends the capability of the layout printing tool! Although I don't have a deep knowledge of linear layouts, the heavy-lifting thing seems all happened inside ll->apply()
. This implementation looks pretty nice to me, thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
The
triton-tensor-layout
CLI tool currently only supports printing distributed layouts, although in the past there was support for printing shared memory layouts as well previous commit here. This PR adds support for printing shared memory layouts again (using linear layouts).I've added several unit tests based on the old tests to ensure correctness.
Example usage:
================================================================
The core Triton is a small number of people, and we receive many PRs (thank
you!). To help us review your code more quickly, if you are a new
contributor (less than 3 PRs merged) we ask that you complete the following
tasks and include the filled-out checklist in your PR description.
Complete the following tasks before sending your PR, and replace
[ ]
with[x]
to indicate you have done them.I am not making a trivial change, such as fixing a typo in a comment.
I have written a PR description following these
rules.
I have run
pre-commit run --from-ref origin/main --to-ref HEAD
.Select one of the following.
/test
forlit
tests/unittest
for C++ tests/python/test
for end-to-end testsFILL THIS IN
.Select one of the following.
lit
tests.lit
tests I have added follow these best practices,including the "tests should be minimal" section. (Usually running Python code
and using the instructions it generates is not minimal.)