[NATIVECPU] Emit Native CPU properties (correctness) #19429
base: sycl
clang/lib/CodeGen/BackendUtil.cpp
Outdated
if (LangOpts.SYCLIsNativeCPU)
  llvm::sycl::utils::addSYCLNativeCPUEarlyPasses(MPM);
As discussed internally, ideally we'd get rid of both LangOpts.SYCLIsNativeCPU and SYCLNativeCPUBackend and check what target we're building for, but this is blocked on #19344; without that we don't have enough information.
Since #19344 is now merged, I've updated the PR and removed the usage of LangOpts.SYCLIsNativeCPU.
@@ -149,7 +149,8 @@ UR_APIEXPORT ur_result_t UR_APICALL urEnqueueKernelLaunch(
   bool isLocalSizeOne =
       ndr.LocalSize[0] == 1 && ndr.LocalSize[1] == 1 && ndr.LocalSize[2] == 1;
   if (isLocalSizeOne && ndr.GlobalSize[0] > numParallelThreads &&
-      !kernel->hasLocalArgs()) {
+      !kernel->hasLocalArgs() && !hKernel->isNDRangeKernel()) {
+    // TODO: Check if !kernel->hasLocalArgs() is needed
You made the logic say that if it has local args, it's automatically an NDRangeKernel. I'm not sure yet whether that's 100% accurate, but if it is, hasLocalArgs() does become redundant.
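The redundancy question can be illustrated with a small standalone sketch. `KernelInfo` and `canCombineWorkGroups` are hypothetical stand-ins, not the adapter's actual types; only the shape of the condition follows the diff above.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for the adapter's kernel object.
struct KernelInfo {
  bool hasLocalArgs;
  bool isNDRangeKernel;
};

// Mirrors the shape of the condition in the diff: work groups are only
// combined when the local size is 1x1x1, there is more work than threads,
// and the kernel neither uses local arguments nor is marked as nd_range.
bool canCombineWorkGroups(const std::array<std::size_t, 3> &localSize,
                          std::size_t globalSize0,
                          std::size_t numParallelThreads,
                          const KernelInfo &k) {
  const bool isLocalSizeOne =
      localSize[0] == 1 && localSize[1] == 1 && localSize[2] == 1;
  return isLocalSizeOne && globalSize0 > numParallelThreads &&
         !k.hasLocalArgs && !k.isNDRangeKernel;
}
```

If every kernel with local args is also flagged as an nd_range kernel (the assumption under discussion, not something this sketch verifies), the `!k.hasLocalArgs` term never changes the result and could be dropped.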
Please add or modify a test related to ClangLinkerWrapper. It could be clang/test/Driver/linker-wrapper-image.c.
  auto FCalle = M.getOrInsertFunction(
      sycl::utils::addSYCLNativeCPUSuffix(Name).str(), FTy);
  Function *F = dyn_cast<Function>(FCalle.getCallee());
  if (F == nullptr)
    report_fatal_error("Unexpected callee");
  return F;
}
std::optional<util::PropertySet> SYCLNativeCPUPropSet = std::nullopt;
Please make it simpler, without a semi-global variable.
I've made this a local variable that is passed into Wrapper::addPropertySetRegistry instead. Would that be acceptable?
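A minimal sketch of the pattern described in this reply, assuming a much simplified wrapper. The `PropertySet` alias, the `Wrapper` class body, `wrapImage`, and the `is_nd_range` property name are all hypothetical; the real `Wrapper::addPropertySetRegistry` signature is not shown in this thread.

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for util::PropertySet: property name -> value.
using PropertySet = std::map<std::string, unsigned>;

// Hypothetical wrapper; only the parameter-passing pattern matters here.
class Wrapper {
public:
  // The optional Native CPU property set arrives as a parameter, so no
  // state outside this call is needed.
  void addPropertySetRegistry(
      const std::optional<PropertySet> &nativeCPUProps) {
    if (!nativeCPUProps)
      return; // not a Native CPU image: nothing extra to register
    for (const auto &[name, value] : *nativeCPUProps)
      registered.push_back(name + "=" + std::to_string(value));
  }
  std::vector<std::string> registered;
};

// The caller owns the property set as a local variable.
std::vector<std::string> wrapImage(bool isNativeCPU) {
  std::optional<PropertySet> props; // local, defaults to std::nullopt
  if (isNativeCPU)
    props = PropertySet{{"is_nd_range", 1}}; // hypothetical property name
  Wrapper w;
  w.addPropertySetRegistry(props);
  return w.registered;
}
```

Keeping the optional local to the caller makes its lifetime and its empty state explicit at the call site.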
  auto *NullPtr = llvm::ConstantPointerNull::get(PointerType::getUnqual(C));
  if (Entries.empty())
    return {NullPtr, NullPtr};

  std::unique_ptr<MemoryBuffer> MB = MemoryBuffer::getMemBuffer(Entries);
  // the Native CPU PI Plug-in expects the BinaryStart field to point to an
  // array of struct nativecpu_entry {
  // the Native CPU UR adapter expects the BinaryStart field to point to
Please move all important details to the function's documentation.
I've added documentation comments to the function definition (or do you mean documentation elsewhere?), but left the original comments describing the details in place. Would that be acceptable?
FE changes in BackendUtil.cpp look good to me.
Extends clang-offload-wrapper and SYCLOffloadWrapper (clang-linker-wrapper) so that kernel properties specific to Native CPU can be added. Adds a compiler pass that checks whether a kernel comes from a sycl::nd_range and adds a Native CPU-only property for it.
This PR fixes at least test_handler from the SYCL-CTS on NativeCPU by using the nd_range attribute in the NativeCPU adapter to only combine multiple work groups in invocations of non-nd_range kernels.
This new "kernel property infrastructure" will be extended in the future to encode other kernel capabilities, for example the applied vector width, which determines how many work groups could be executed in one kernel invocation. This could help make kernel launches more efficient. Another property could encode whether the kernel supports peeling: without peeling the kernel would have fewer branches, and the NativeCPU adapter could schedule peeling invocations (of the scalar kernel) in separate threads, which might benefit some performance scenarios.
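As a rough illustration of the vector-width idea, the arithmetic could look like the following. This is a sketch under assumptions: `LaunchPlan` and `planLaunch` are hypothetical names, and nothing in this PR implements such a split.

```cpp
#include <cstddef>

// Hypothetical split of a launch into vectorized and peeled invocations.
struct LaunchPlan {
  std::size_t vectorInvocations; // each covers `vectorWidth` work groups
  std::size_t peelInvocations;   // each covers one work group (scalar kernel)
};

// Divides the work groups into full vector-width batches; the remainder
// is handled by scalar (peeling) invocations.
LaunchPlan planLaunch(std::size_t numWorkGroups, std::size_t vectorWidth) {
  if (vectorWidth <= 1)
    return {0, numWorkGroups}; // no vectorization: everything runs scalar
  return {numWorkGroups / vectorWidth, numWorkGroups % vectorWidth};
}
```

With 10 work groups and a vector width of 4, this yields two vectorized invocations and two scalar ones, which an adapter could then schedule on separate threads.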
This PR replaces and extends #16152, which still used the old UR API/repo.
This PR also adds testing to ensure the new clang-linker-wrapper integration produces the expected IR on SYCL code.