BUG: memory leak on numpy.linalg.solve #27629

Open
HicaroD opened this issue Oct 23, 2024 · 5 comments

@HicaroD

HicaroD commented Oct 23, 2024

Describe the issue:

image

At my job, I was doing some memory analysis on our application in order to identify possible memory leaks.

To do this, I used the memray package, which offers a lot of handy commands for identifying memory leaks. It can generate a flame graph showing the memory chunks that were allocated after tracking started and not deallocated by the time tracking ended.

Reproduce the code example:

# test_numpy_leak.py
import numpy as np


def main():
    for _ in range(1_000_000):
        a = np.array([[1, 2], [3, 5]])
        b = np.array([1, 2])
        np.linalg.solve(a, b)


if __name__ == "__main__":
    main()

Error message:

In order to reproduce the analysis, use the following commands:

  • PYTHONMALLOC=malloc python -m memray run --native test_numpy_leak.py
  • python -m memray flamegraph --leaks memray-test_numpy_leak.py.XXXX.bin, where XXXX is the number of the file specified in its name.

After running these commands, memray generates an HTML file that you can open in your browser to see a flame graph like the one above. I highlighted in red the section where memray reports a memory leak; if I hover over it, it says that 4 allocations were made (32.0 MiB in total).

The line of code that is leaking memory is the following:

r = gufunc(a, b, signature=signature)

Python and NumPy Versions:

1.26.2
3.11.9 (main, Jul 19 2024, 18:26:46) [GCC 14.1.1 20240522]

Runtime Environment:

No response

Context for the issue:

Actually, I was just worried that this memory leak could scale and do more damage.

@HicaroD
Author

HicaroD commented Oct 23, 2024

If you guys think it is a problem, I would love to figure out how to solve this leak and contribute to this project.

@ngoldbaum
Member

I wonder if there's a reference counting bug.

It would be interesting to compile a debug build of Python so you can check that sys.gettotalrefcount() isn't monotonically increasing as you loop over calling the gufunc over and over. If it is then it's likely there's a reference counting bug somewhere in NumPy's C internals.
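A rough sketch of the check described above, for anyone who wants to try it. It assumes a debug build of Python, since `sys.gettotalrefcount()` only exists there; on a regular build the helper returns `None`. The names `refcount_deltas` and `solve_once` are made up for this illustration and are not part of NumPy.

```python
import sys


def refcount_deltas(fn, rounds=5, calls_per_round=10_000):
    """Return the change in total refcount after each round of calls to fn.

    Returns None when sys.gettotalrefcount is unavailable (non-debug build).
    """
    get_total = getattr(sys, "gettotalrefcount", None)
    if get_total is None:
        return None

    # Warm up first so one-time caches and lazy imports are not counted.
    for _ in range(calls_per_round):
        fn()

    deltas = []
    for _ in range(rounds):
        before = get_total()
        for _ in range(calls_per_round):
            fn()
        deltas.append(get_total() - before)
    return deltas


def solve_once():
    """The call from the reproducer in this issue (NumPy imported lazily)."""
    import numpy as np

    a = np.array([[1, 2], [3, 5]])
    b = np.array([1, 2])
    np.linalg.solve(a, b)


# Usage on a debug build: print(refcount_deltas(solve_once))
```

The bookkeeping itself creates a few objects, so small nonzero deltas are expected. What would point to a reference counting bug is a delta that stays roughly proportional to `calls_per_round` in every round.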

@HicaroD
Author

HicaroD commented Oct 23, 2024

I wonder if there's a reference counting bug.

It would be interesting to compile a debug build of Python so you can check that sys.gettotalrefcount() isn't monotonically increasing as you loop over calling the gufunc over and over. If it is then it's likely there's a reference counting bug somewhere in NumPy's C internals.

Great point, @ngoldbaum. I'll do it ASAP!

@seberg
Member

seberg commented Oct 24, 2024

If such simple code had a reference counting bug, a lot bigger things would explode, I imagine.
For one, I imagine we have similar tests and we should have noticed a real leak.

I would suggest just checking the typical thing: run the same thing for much longer or shorter and see if the result and amount leaked is pretty much the same.

It seems more likely to me that some caches get initialized once and then re-used. (Either in NumPy or openblas.)
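One possible way to run that check inside a single process, as a sketch rather than a definitive test: call the same code in batches of increasing size and compare how much the peak resident memory grows per batch. This is a variation on running the script for longer or shorter. It assumes a Unix-like system (the standard-library `resource` module), and `peak_rss`, `growth_per_batch`, and `solve_once` are hypothetical helper names.

```python
import resource


def peak_rss():
    """Peak resident set size of this process (KiB on Linux, bytes on macOS)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def growth_per_batch(fn, batch_sizes=(1_000, 10_000, 100_000)):
    """Call fn in batches of increasing size.

    Returns a list of (batch size, peak RSS growth during that batch) pairs.
    """
    results = []
    for size in batch_sizes:
        before = peak_rss()
        for _ in range(size):
            fn()
        results.append((size, peak_rss() - before))
    return results


def solve_once():
    """The call from the reproducer in this issue (NumPy imported lazily)."""
    import numpy as np

    a = np.array([[1, 2], [3, 5]])
    b = np.array([1, 2])
    np.linalg.solve(a, b)


# Usage: print(growth_per_batch(solve_once))
```

If the growth shows up only in the first batch and later batches stay near zero, that is consistent with a cache that is initialized once. Growth that scales with the batch size would look more like a real per-call leak.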

@Taytay

Taytay commented Jun 2, 2025

I stumbled across this because I was seeing something similar using Memray on my Mac.
Upgrading to numpy 2.2.6 (from numpy 1.x) solved it.

Image

Details for the curious:

If you call import numpy, it will in turn call _mac_os_check:

def _mac_os_check():

That will run a little bit of math, which eventually calls polyfit, which eventually gets down to a gufunc, which seems to allocate 32MB of memory that doesn't get freed.
