Skip to content

Rework KTN gtest #3733

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 12 commits into
base: develop
Choose a base branch
from
Open

Rework KTN gtest #3733

wants to merge 12 commits into from

Conversation

cderb
Copy link
Contributor

@cderb cderb commented May 14, 2025

Test no longer compares to hard coded string.
Test applicability of performance config returned by model.
Test solver can be used for shape.
Test invocation of solution.
#3708

auto out_tensor =
GPUMem{0, output_desc.GetNumBytes() / sizeof(data_type), sizeof(data_type)};
auto workSpaceSize = conv_desc.GetWorkSpaceSize(ctx, problem);
auto workSpace = GPUMem{0, workSpaceSize / sizeof(data_type), sizeof(data_type)};
Copy link
Contributor

@reidkwja reidkwja May 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of splitting bytes into elements, some solvers require workspace to be aligned like ConvAsmImplicitGemmGTCDynamicWrwXdlopsNHWC (256 B padding) - or have a multiple of some tile size.

So it might be a good idea to assert it meets alignment:

assert((workSpaceSize % required_alignment) == 0);

@@ -181,11 +160,76 @@ class KernelTuningNetTest : public ::testing::TestWithParam<KernelTuningNetTestC
conv_desc,
test_case.direction);

Solver perf_config;
auto data_type = test_case.data_type;
auto in_tensor =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Becasue data_type is enum (not C++ type), when dividing the total byte_size, it will always be the size of that enum (which is 4 bytes) rather than the size of the underlying data elements intended to be allocated - which could over/under allocate GPU buffer. Is there a better way instead, like an actual element size or so?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a utility for this I believe:

inline std::size_t get_data_size(miopenDataType_t type)
{
auto ret = std::size_t{};
visit_float(type, [&](auto as_float) { ret = sizeof(decltype(as_float(1.f))); });
return ret;
}
inline std::size_t get_data_size(miopenIndexType_t index_type)
{
switch(index_type)
{
case miopenIndexUint8: {
return sizeof(uint8_t);
}
case miopenIndexUint16: {
return sizeof(uint16_t);
}
case miopenIndexUint32: {
return sizeof(uint32_t);
}
case miopenIndexUint64: {
return sizeof(uint64_t);
}
}
MIOPEN_THROW("not belong to any case");
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been resolved I believe.

@BrianHarrisonAMD
Copy link
Contributor

Looks like build is failing on windows.

lld-link: error: undefined symbol: bool __cdecl miopen::conv::IsAlgorithmDisabled(enum miopenConvAlgorithm_t)
--
  | 30811 | 01:18:24 AM | >>> referenced by C:\w\test\gtest\kernel_tuning_net.cpp:216
  | 30812 | 01:18:24 AM | >>>               CMakeFiles/test_kernel_tuning_net.dir/kernel_tuning_net.cpp.obj:(protected: void __cdecl KernelTuningNetTest<struct miopen::solver::conv::PerformanceConfigConvAsm1x1U>::TestParameterPredictionModel(class std::basic_string<char, struct std::char_traits<char>, class std::allocator<char>>))
  | 30813 | 01:18:24 AM | >>> referenced by C:\w\test\gtest\kernel_tuning_net.cpp:216
more_vert | 30814 | 01:18:24 AM | >>>               CMakeFiles/test_kernel_tuning_net.dir/kernel_tuning_net.cpp.obj:(protected: void __cdecl KernelTuningNetTest<struct miopen::solver::conv::PerformanceConfigHipImplicitGemmGroupFwdXdlops>::TestParameterPredictionModel(class std::basic_string<char, struct std::char_traits<char>, class std::allocator<char>>))
  | 30815 | 01:18:24 AM | >>> referenced by C:\w\test\gtest\kernel_tuning_net.cpp:216
  | 30816 | 01:18:24 AM | >>>               CMakeFiles/test_kernel_tuning_net.dir/kernel_tuning_net.cpp.obj:(protected: void __cdecl KernelTuningNetTest<struct miopen::solver::conv::PerformanceConfigHipImplicitGemmGroupBwdXdlops>::TestParameterPredictionModel(class std::basic_string<char, struct std::char_traits<char>, class std::allocator<char>>))
  | 30817 | 01:18:24 AM | >>> referenced 1 more times
  | 30818 | 01:18:24 AM |  
  | 30819 | 01:18:24 AM | lld-link: error: undefined symbol: bool __cdecl miopen::conv::IsEnoughWorkspace(class std::basic_string_view<char, struct std::char_traits<char>>, struct miopen::solver::Id const &, unsigned __int64, struct miopen::AnyInvokeParams const *)
  | 30820 | 01:18:24 AM | >>> referenced by C:\w\test\gtest\kernel_tuning_net.cpp:220
  | 30821 | 01:18:24 AM | >>>               CMakeFiles/test_kernel_tuning_net.dir/kernel_tuning_net.cpp.obj:(protected: void __cdecl KernelTuningNetTest<struct miopen::solver::conv::PerformanceConfigConvAsm1x1U>::TestParameterPredictionModel(class std::basic_string<char, struct std::char_traits<char>, class std::allocator<char>>))
  | 30822 | 01:18:24 AM | >>> referenced by C:\w\test\gtest\kernel_tuning_net.cpp:220
  | 30823 | 01:18:24 AM | >>>               CMakeFiles/test_kernel_tuning_net.dir/kernel_tuning_net.cpp.obj:(protected: void __cdecl KernelTuningNetTest<struct miopen::solver::conv::PerformanceConfigHipImplicitGemmGroupFwdXdlops>::TestParameterPredictionModel(class std::basic_string<char, struct std::char_traits<char>, class std::allocator<char>>))
  | 30824 | 01:18:24 AM | >>> referenced by C:\w\test\gtest\kernel_tuning_net.cpp:220
  | 30825 | 01:18:24 AM | >>>               CMakeFiles/test_kernel_tuning_net.dir/kernel_tuning_net.cpp.obj:(protected: void __cdecl KernelTuningNetTest<struct miopen::solver::conv::PerformanceConfigHipImplicitGemmGroupBwdXdlops>::TestParameterPredictionModel(class std::basic_string<char, struct std::char_traits<char>, class std::allocator<char>>))
  | 30826 | 01:18:24 AM | >>> referenced 1 more times
  | 30827 | 01:18:24 AM | clang++: error: linker command failed with exit code 1 (use -v to see invocation)

Probably need to add MIOPEN_INTERNALS_EXPORT to some of the functions that are used now in the tests.

@cderb
Copy link
Contributor Author

cderb commented May 30, 2025

Looks like build is failing on windows.

Probably need to add MIOPEN_INTERNALS_EXPORT to some of the functions that are used now in the tests.

I think this has been corrected.

Copy link
Contributor

@BrianHarrisonAMD BrianHarrisonAMD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edit:

Might be worth getting @msaudulhassan to look at this.
I think the original intent of these tests were to ensure the trained model was matching the expected result once integrated.

We may want to have both types of tests.
Some specific points to ensure the model integration is matching expected predictions.
And another set that ensures a broad set of shapes is giving correct predictions.

@BrianHarrisonAMD BrianHarrisonAMD self-requested a review June 6, 2025 22:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants