ENH: AVX512 SIMD optimizations for float power fast paths #28248

Closed
ENH: Add SIMD equivalents for all float power fast paths
MaanasArora committed Jan 29, 2025
commit ceb183968b547ff34c4a17593f9c0ab7157c76b7
27 changes: 23 additions & 4 deletions numpy/_core/src/umath/loops_umath_fp.dispatch.c.src
@@ -249,11 +249,30 @@ NPY_NO_EXPORT void NPY_CPU_DISPATCH_CURFX(@TYPE@_@func@)
         npyv_loadable_stride_@sfx@(steps[0]) &&
         npyv_storable_stride_@sfx@(steps[2])
     ) {
-        const npy_intp ssrc = steps[0] / sizeof(@type@);
-        const npy_intp sdst = steps[2] / sizeof(@type@);
+        npyv_@sfx@ srcv = npyv_load_till_@sfx@(src, dimensions[0], 0);
+        npyv_@sfx@ dstv = npyv_load_till_@sfx@(dst, dimensions[0], 0);

-        if (*exp == 2) {
-            simd_exp2_@sfx@(src, ssrc, dst, sdst, dimensions[0]);
+        if (*exp == -1.0) {
+            dstv = npyv_recip_@sfx@(srcv);
+        }
+        else if (*exp == 0.0) {
+            dstv = npyv_setall_@sfx@(1.0);
+        }
+        else if (*exp == 0.5) {
+            dstv = npyv_sqrt_@sfx@(srcv);
+        }
+        else if (*exp == 1.0) {
+            dstv = srcv;
+        }
+        else if (*exp == 2.0) {
+            dstv = npyv_square_@sfx@(srcv);
+        }
+        else {
+            fastop_found = 0;
+        }
+
+        if (fastop_found) {
+            npyv_store_till_@sfx@(dst, dimensions[0], dstv);
             return;
         }
     }