
Feat: Adding Linear Algebra Dot operation support #116


Merged
merged 50 commits into from
Jul 22, 2025

Conversation

SwayamInSync
Member

This PR contributes as follows:

  • Ships a dot method in the package that supports the following operations:
    • vector-vector dot product
    • matrix-vector dot product
    • matrix-matrix multiplication
  • Optimized linear algebra operations backed by QBLAS on x86-64 Linux and ARM machines. On Windows, it falls back to a naive implementation because QBLAS is incompatible with MSVC.
  • A test suite that validates dot products between inputs
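
For readers unfamiliar with the three cases listed above, here is a minimal sketch of their shapes. It uses plain NumPy float64 arrays as a stand-in for the quad-precision dtype, and `naive_matmul` is a hypothetical triple-loop reference written for this illustration, not the fallback code from this PR.

```python
import numpy as np


def naive_matmul(a, b):
    """Reference triple-loop product of two 2-D arrays (illustration only)."""
    n, k = a.shape
    k2, m = b.shape
    if k != k2:
        raise ValueError(f"inner dimensions differ: {k} != {k2}")
    out = np.zeros((n, m), dtype=np.result_type(a, b))
    for i in range(n):
        for j in range(m):
            acc = out.dtype.type(0)
            for p in range(k):
                acc += a[i, p] * b[p, j]
            out[i, j] = acc
    return out


x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])
A = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])      # shape (2, 3)
B = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])    # shape (3, 2)

vec_vec = np.dot(x, y)   # vector-vector: scalar result
mat_vec = np.dot(A, x)   # matrix-vector: shape (2,)
mat_mat = np.dot(A, B)   # matrix-matrix: shape (2, 2)

# The naive loop should agree with the library routine on the 2-D case.
assert np.array_equal(naive_matmul(A, B), mat_mat)
```

A test suite along these lines would typically compare an optimized path against such a slow reference for each of the three shape combinations.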

The images below show the performance comparison.

quadblas-x86-64-96-cores

Machine: x86-64 Linux with 96 cores

quadblas-ARM-8-core

Machine: MacOS-Silicon (ARM) with 8 cores

To compile without QBLAS, set DISABLE_QUADBLAS in both CFLAGS and CXXFLAGS.
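
One way this could look from a build script is sketched below. It assumes DISABLE_QUADBLAS is passed as a preprocessor define (`-DDISABLE_QUADBLAS`), which the PR description does not spell out; `env_without_quadblas` is a hypothetical helper, and the `pip install .` command in the comment is likewise an assumption.

```python
import os


def env_without_quadblas(base_env=None):
    """Return a copy of the environment with the define appended to both flags.

    Assumption: the build reads CFLAGS/CXXFLAGS and treats DISABLE_QUADBLAS
    as a preprocessor macro, hence the "-D" prefix.
    """
    env = dict(os.environ if base_env is None else base_env)
    for var in ("CFLAGS", "CXXFLAGS"):
        existing = env.get(var, "").strip()
        env[var] = f"{existing} -DDISABLE_QUADBLAS".strip()
    return env


# Possible usage (not run here):
#   subprocess.run([sys.executable, "-m", "pip", "install", "."],
#                  env=env_without_quadblas(), check=True)
```

Appending rather than overwriting keeps any optimization flags the user has already set.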

@SwayamInSync SwayamInSync requested a review from ngoldbaum July 11, 2025 08:19
@SwayamInSync
Member Author

Ah, I forgot to update the general CI. Or should we remove quaddtype from there, given that build_wheels.yml already runs the same checks, more strictly and on all platforms?

@SwayamInSync SwayamInSync self-assigned this Jul 11, 2025
@juntyr
Contributor

juntyr commented Jul 11, 2025

Would the new functionality only be accessed through the dot function or is there a way to call that automatically when using numpy dot, matmul, etc?

@ngoldbaum
Member

Unfortunately no, not easily. We'd need to add a new dtype hook to the DType API in NumPy. Worth doing though! See numpy/numpy#28516 which adds a hook for sorting.

@SwayamInSync
Member Author

As per @seberg's suggestion, this now implements the numpy.matmul ufunc.
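
As background on why a matmul ufunc loop answers the earlier question about automatic dispatch: in NumPy, `np.matmul` is a generalized ufunc and the `@` operator routes through it. The sketch below shows this with ordinary float64 arrays as a stand-in; it does not exercise the quad-precision loops added in this PR.

```python
import numpy as np

# np.matmul is a (generalized) ufunc, so a dtype can register loops for it.
assert isinstance(np.matmul, np.ufunc)
print(np.matmul.signature)  # core-dimension signature of the gufunc

A = np.arange(6.0).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
B = np.arange(6.0).reshape(3, 2)
v = np.ones(3)

via_operator = A @ B               # the @ operator calls np.matmul
via_ufunc = np.matmul(A, B)
mat_vec = A @ v                    # 1-D operand is handled by the same ufunc
stacked = np.matmul(np.stack([A, A]), B)  # leading dimensions broadcast

assert np.array_equal(via_operator, via_ufunc)
```

Because the batch dimensions are handled by the ufunc machinery, a dtype only needs to supply the core 2-D (and 1-D) products.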

Some extra changes:

  • Refactored umath
  • Added release_tracker.md to track progress (it lists all available NumPy ufuncs and which ones we currently support)

@ngoldbaum please take a look and let me know if anything needs a fix

@ngoldbaum
Member

This is a lot of code! I'll try to give this a once-over but I'm probably not going to have the bandwidth to go over this with a fine-toothed comb. @juntyr I'd appreciate it if you could also do a round of code review. Don't be afraid to ask questions if anything is confusing to you.

@SwayamInSync
Member Author

This is a lot of code! I'll try to give this a once-over but I'm probably not going to have the bandwidth to go over this with a fine-toothed comb. @juntyr I'd appreciate it if you could also do a round of code review. Don't be afraid to ask questions if anything is confusing to you.

Yes, the umath refactor made the diff large (but that code was already reviewed). Here is what's new:

  • umath/matmul.cpp and .h files
  • quadblas_interface.cpp and .h files
  • tests/test_dot.py
  • GitHub CI files

return NULL;
}

QuadBLAS::set_num_threads(num_threads);
Member

You could also maybe check for a setting from threadpoolctl: https://github.com/joblib/threadpoolctl

Member Author

Good point. By default (when the package is imported) it uses all available cores, but I also expose a Python function that lets users change the thread count to whatever they want at any time.

I can add functionality to pick up the setting via threadpoolctl, similar to NumPy.
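
The precedence described above (an explicit user setting wins, otherwise all cores) can be sketched as follows. `resolve_num_threads` is a hypothetical helper, and consulting `OMP_NUM_THREADS` is an assumption added for illustration; this is neither the package's actual function nor a threadpoolctl integration.

```python
import os


def resolve_num_threads(requested=None, env=None):
    """Pick a thread count: explicit request, then environment, then all cores."""
    env = os.environ if env is None else env

    # 1. An explicit request from the user always wins.
    if requested is not None:
        if requested < 1:
            raise ValueError("thread count must be at least 1")
        return requested

    # 2. Assumed convention: honour OMP_NUM_THREADS when it is a positive integer.
    raw = env.get("OMP_NUM_THREADS", "")
    try:
        from_env = int(raw)
    except ValueError:
        from_env = 0
    if from_env >= 1:
        return from_env

    # 3. Default: use every available core (cpu_count may return None).
    return os.cpu_count() or 1
```

The resolved value would then be handed to the native layer, for example through the `QuadBLAS::set_num_threads(num_threads)` call shown in the diff above.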

@SwayamInSync
Member Author

Thanks @ngoldbaum, I will apply all the changes in the next commit :) and sorry this one got hectic.

@SwayamInSync SwayamInSync requested a review from ngoldbaum July 22, 2025 17:28
Member

@ngoldbaum ngoldbaum left a comment

Just one minor comment - go ahead and self-merge!

I didn't test the windows instructions but we can always fix issues later and I think what's in here is a significant improvement over the nothing we had before :)

@SwayamInSync
Member Author

I didn't test the windows instructions but we can always fix issues later and I think what's in here is a significant improvement over the nothing we had before :)

I wrote these in parallel with the Windows CI and tested them on a Windows 11 PC.
All the remaining ufuncs (from the tracker) are easily doable, and @juntyr is tackling some of them, so I'll most probably start the branch for auto documentation and support for big-endian systems.

@SwayamInSync SwayamInSync merged commit f043f8d into numpy:main Jul 22, 2025
6 checks passed
@SwayamInSync SwayamInSync deleted the dot branch July 22, 2025 19:49
4 participants