TensileLite
KKyang edited this page Jun 20, 2025 · 1 revision
TensileLite is a GEMM assembly generator in hipBLASLt.
The following basic YAML config generates an assembly kernel.
GlobalParameters:
  SleepPercent: 50
  Device: 0
  MaxFileName: 256
  DataInitTypeAlpha: 1
  DataInitTypeBeta: 0
  PrintSolutionRejectionReason: True
  NumElementsToValidate: -1
  NumWarmups: 0
  EnqueuesPerSync: 1
  SyncsPerBenchmark: 1
  NumBenchmarks: 1
  MinKForGSU: 1
  KeepBuildTmp: 1

BenchmarkProblems:
  #######################################
  # NN - standard
  #######################################
  -
    - # ProblemType
      OperationType: GEMM
      DataType: H
      DestDataType: H
      ComputeDataType: S
      HighPrecisionAccumulate: True
      TransposeA: 1
      TransposeB: 0
      UseBeta: True
      Sparse: 0
      Batched: True
      Activation: True
      ActivationType: all
    - # BenchmarkProblemSizeGroup - Standard
      InitialSolutionParameters:
      BenchmarkCommonParameters:
        - KernelLanguage: ["Assembly"]
      ForkParameters:
        - MatrixInstruction:
          - [16, 16, 16, 1, 1, 2, 2, 2, 2]
        - GlobalReadVectorWidthA: [4]
        - WorkGroupMappingXCC: [8]
        - DepthU: [32]
        - TransposeLDS: [-1]
        - ScheduleIterAlg: [3]
        - PrefetchLocalRead: [1]
        - PrefetchGlobalRead: [1]
        - GlobalSplitU: [1, 2]
        - 1LDSBuffer: [1]
        - DirectToVgprSparseMetadata: [0]
        - WorkGroupMapping: [1]
        - StoreVectorWidth: [-1]
        - UseSgprForGRO: [1]
      BenchmarkForkParameters:
      JoinParameters:
      BenchmarkJoinParameters:
      BenchmarkFinalParameters:
        - ProblemSizes:
          - Exact: [512, 512, 1, 512]
        - ActivationArgs:
          - [Enum: none]
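To get a feel for how many kernel candidates this config asks for, the ForkParameters lists can be expanded by hand. The sketch below is not TensileLite code: it assumes that every combination of the listed values becomes one candidate (a Cartesian product) and ignores any candidates the generator later rejects. The helper name `expand_forks` is made up for illustration.

```python
from itertools import product

# ForkParameters values copied from the config above.
fork_parameters = {
    "MatrixInstruction": [[16, 16, 16, 1, 1, 2, 2, 2, 2]],
    "GlobalReadVectorWidthA": [4],
    "WorkGroupMappingXCC": [8],
    "DepthU": [32],
    "TransposeLDS": [-1],
    "ScheduleIterAlg": [3],
    "PrefetchLocalRead": [1],
    "PrefetchGlobalRead": [1],
    "GlobalSplitU": [1, 2],
    "1LDSBuffer": [1],
    "DirectToVgprSparseMetadata": [0],
    "WorkGroupMapping": [1],
    "StoreVectorWidth": [-1],
    "UseSgprForGRO": [1],
}


def expand_forks(params):
    """Hypothetical helper: one dict per combination of parameter values."""
    names = list(params)
    return [dict(zip(names, combo)) for combo in product(*params.values())]


candidates = expand_forks(fork_parameters)
print(len(candidates), "candidate parameter sets")
for candidate in candidates:
    print("GlobalSplitU =", candidate["GlobalSplitU"])
```

Only GlobalSplitU lists more than one value here, so under that assumption the config describes two candidates that differ only in GlobalSplitU. Adding a second value to any other list would double the count.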
Follow the README.md to run cmake first.
After running cmake, navigate to the build folder and create a yaml file with the content above.
cd build
Tensile.sh <path to the yaml file> .
Then you'll see the generated folders inside the build folder.
The assembly file is located inside the assembly folder.
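A small script can list what the run produced. This is a minimal sketch rather than part of TensileLite: the `assembly` folder name comes from the text above, while the `.s` file suffix and the helper name `find_assembly_files` are assumptions made for illustration.

```python
from pathlib import Path


def find_assembly_files(build_dir, subdir="assembly", suffix=".s"):
    """Return sorted paths of files ending in `suffix` under build_dir/subdir.

    The ".s" suffix is an assumption; adjust it if your kernels use another one.
    """
    root = Path(build_dir) / subdir
    if not root.is_dir():
        return []  # nothing generated yet, or a different folder layout
    return sorted(p for p in root.rglob(f"*{suffix}") if p.is_file())


# Run from inside the build folder, so "." is the build directory.
for path in find_assembly_files("."):
    print(path)
```

The function returns an empty list instead of raising when the folder is missing, which makes it safe to run before the first Tensile.sh invocation.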