Provide an 'out' parameter for numpy.fft.fft #25399

serge-sans-paille
Contributor

As the first parameter is always copied to the output, it doesn't have much impact performance-wise.

It is useful, however, for those who need fine-grained control over memory allocation and cannot afford the cost of a temporary allocation.

@serge-sans-paille
Contributor Author

serge-sans-paille commented Dec 15, 2023

Note: this is just to test the waters. In case of positive feedback, I'll provide the same parameter for other numpy.fft.* functions.

@serge-sans-paille serge-sans-paille force-pushed the feature/out-parameter-fft branch from 0b6d5be to 73a1d1c Compare December 15, 2023 07:33
@serge-sans-paille
Contributor Author

cc @stefanv , but really that could be anyone :-)

Contributor

@mhvk mhvk left a comment

I think it would be very useful to have an out argument! However, one needs to take care that it has the right dtype and shape - see in-line comments.

p.s. Looking at the actual code, I'm somewhat surprised it does not use the iterator, since then that kind of stuff could be dealt with by it (as well as possibly the axis). Indeed, the fft routines would seem easily implemented as a gufunc. Though that may be better done as follow-up!

      else:
          a = swapaxes(a, axis, -1)
    -     r = pfi.execute(a, is_real, is_forward, fct)
    +     r = pfi.execute(a, is_real, is_forward, fct, out)
Contributor

This may be risky - here, the axes of a have been altered but those of out have not. For complex-to-complex, it is possible to copy beforehand, though generally I think it is better to swap the axis for out as well - for real-to-complex, this will be required.

Contributor

I think you have to do exactly the same operation on out as you do on a. .resize definitely is not right as that can allocate new memory.

The only thing that would seem safe is the following:

if out is not None:
    out_swapped = swapaxes(out, axis, -1)
    pfi.execute(a, is_real, is_forward, fct, out_swapped)
    return out
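
A minimal sketch of why the swapped view is safe, assuming plain NumPy semantics: `np.swapaxes` returns a view, so anything written through `out_swapped` lands in the memory of `out`, and `out` keeps its shape and strides. The helper name `fft_along_axis` is hypothetical, and `np.fft.fft` followed by an assignment stands in for `pfi.execute` writing directly into the buffer.

```python
import numpy as np


def fft_along_axis(a, axis, out):
    """Hypothetical helper: FFT of `a` along `axis`, stored in `out`."""
    a_swapped = np.swapaxes(a, axis, -1)      # view of `a`, transform axis last
    out_swapped = np.swapaxes(out, axis, -1)  # view of `out`, same swap
    # Stand-in for pfi.execute(..., out_swapped): write through the view.
    out_swapped[...] = np.fft.fft(a_swapped)
    return out  # the caller's array, unchanged in shape and strides


a = np.arange(12, dtype=complex).reshape(3, 4)
out = np.empty((3, 4), dtype=complex)
strides_before = out.strides
result = fft_along_axis(a, axis=0, out=out)
```

No copy back into `out` is needed here, because the swapped array and `out` share the same memory.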

        if (!data) return NULL;
    }
    else {
        data = (PyArrayObject*)PyArray_EnsureArray(out);
Contributor

In this branch, one needs to be sure the dtype and shape are both correct. Does PyArray_CopyObject take care of that?

Contributor Author

That's my understanding, yes.

@serge-sans-paille serge-sans-paille force-pushed the feature/out-parameter-fft branch 3 times, most recently from 37f057e to b11f2ae Compare December 17, 2023 09:00
@serge-sans-paille
Contributor Author

serge-sans-paille commented Dec 17, 2023

++ extra test cases

@serge-sans-paille
Contributor Author

(the macos issue seems unrelated)

@serge-sans-paille
Contributor Author

@mhvk gentle ping :-)

Contributor

@mhvk mhvk left a comment

I fear the code you have is wrong: CopyObject converts data types, so, e.g., someone could have passed in a float32 array as out, to which the input gets copied correctly, but the data is interpreted incorrectly in the calculation below, where it is assumed to be float64.

It also looks like below the data array is assumed to be in C order, which does not have to be the case for an arbitrary out array.
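
The two concerns above can be sketched in Python, under the assumption that the computation expects a C-ordered buffer of one fixed dtype. `check_out` is a hypothetical helper, not part of NumPy or of this PR; it only shows the kind of validation that has to happen before anything is copied. The second half shows that a converting copy succeeds without complaint, which is why the copy itself cannot serve as the check.

```python
import numpy as np


def check_out(out, shape, dtype=np.complex128):
    """Hypothetical validation of `out` before any data is copied into it."""
    if not isinstance(out, np.ndarray):
        raise TypeError("out must be a numpy array")
    if out.dtype != np.dtype(dtype):
        raise TypeError(f"out has dtype {out.dtype}, expected {np.dtype(dtype)}")
    if out.shape != tuple(shape):
        raise ValueError(f"out has shape {out.shape}, expected {tuple(shape)}")
    if not out.flags.c_contiguous:
        raise ValueError("out must be C-contiguous")
    return out


# A converting copy succeeds silently: the values arrive, but as float32,
# so code that later reads the buffer as float64 would misinterpret it.
x = np.array([1.5, 2.5, 3.5])             # float64
narrow = np.empty(3, dtype=np.float32)
np.copyto(narrow, x)                       # converted to float32, no error
```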

My own sense is that one should write this as a gufunc so that iteration and input/output is done correctly automatically.

@mhvk
Contributor

mhvk commented Dec 19, 2023

I also had a quick look at scipy, which provides PocketFFT as well (but as C++). It has an overwrite_x argument instead of an out one. I like out better, but deviating further from scipy is perhaps not ideal; I'm not sure.

@serge-sans-paille serge-sans-paille force-pushed the feature/out-parameter-fft branch from b11f2ae to 04f8293 Compare December 21, 2023 08:56
@serge-sans-paille
Contributor Author

@mhvk : I've kept the 'out' name to prepare the (future) move to ufunc. I also added the proper checks before copying.

@serge-sans-paille serge-sans-paille force-pushed the feature/out-parameter-fft branch from 04f8293 to b67dcc7 Compare December 22, 2023 23:33
@serge-sans-paille
Contributor Author

Extra checks and tests added, thanks @mhvk for the hints

@serge-sans-paille serge-sans-paille force-pushed the feature/out-parameter-fft branch from b67dcc7 to 1fd1ce4 Compare December 23, 2023 07:27
@serge-sans-paille serge-sans-paille force-pushed the feature/out-parameter-fft branch from 1fd1ce4 to 6a17489 Compare December 25, 2023 21:56
@serge-sans-paille
Contributor Author

@mhvk looks good now?

Contributor

@mhvk mhvk left a comment

As you'll see from the in-line comments, I still think you have a problem here... The main problem is that in every other case in numpy, out is used only to store the result, with no change to its shape or strides. Here, for everything but axis=-1, you need to swap axes, which means that if you pass in something C-contiguous, it will not be C-contiguous afterwards. It would still work if one is re-using a previous result, though, since that is already swapped in just the same way. Since this is arguably one of the more important use cases, you could still just go for that (see in-line comment).
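
The contiguity point can be checked directly. The following is only an illustration with made-up shapes: swapping the axes of a C-contiguous array gives a view that is no longer C-contiguous, while an array that already has the swapped layout becomes C-contiguous once the same swap is applied.

```python
import numpy as np

out = np.empty((30, 20), dtype=complex)
swapped = np.swapaxes(out, 0, -1)    # what a transform along axis=0 would see

print(out.flags.c_contiguous)        # True
print(swapped.flags.c_contiguous)    # False: same memory, permuted strides

# An array with the swapped layout (standing in for a re-used earlier result)
# is C-contiguous again after the same swap is applied to it.
previous = np.swapaxes(np.empty((20, 30), dtype=complex), 0, -1)
print(previous.shape)                                    # (30, 20)
print(np.swapaxes(previous, 0, -1).flags.c_contiguous)   # True
```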

Otherwise, I think you are sort-of stuck actually rewriting the current simple loop using the iterator. Though at that point, writing it as a gufunc is almost certainly less work and it would avoid all problems... The one tricky thing there would be to precalculate the fft plan and pass that to the inner loop (via *data).
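
To illustrate what a gufunc formulation provides, here is a rough sketch that uses `np.vectorize` with a core signature as a slow, pure-Python stand-in for a real gufunc. It is not how #25536 or NumPy implements the FFT, and it does not precompute a plan; `fft_1d`, `fft_core`, and `fft_axis` are hypothetical names. The point is only that the looping over non-core dimensions is handled by the machinery rather than by hand.

```python
import numpy as np


def fft_1d(v):
    """Core operation on a single 1-D vector."""
    return np.fft.fft(v)


# "(n)->(n)": one core dimension in, one out; leading dimensions are looped over.
fft_core = np.vectorize(fft_1d, signature="(n)->(n)")


def fft_axis(x, axis=-1):
    """Hypothetical wrapper: move `axis` last, apply the core op, move it back."""
    moved = np.moveaxis(x, axis, -1)
    return np.moveaxis(fft_core(moved), -1, axis)


x = np.arange(24, dtype=float).reshape(2, 3, 4)
y = fft_axis(x, axis=1)   # same shape as x, transformed along axis 1
```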

# tests below only test the out parameter
y = random((30, 20)) + 1j*random((30, 20))

out = np.zeros_like(x, dtype=complex)
Contributor

Why pass in dtype=complex here?

y = random((30, 20)) + 1j*random((30, 20))

out = np.zeros_like(x, dtype=complex)
assert_allclose(fft1(x), np.fft.fft(x, out=out), atol=1e-6)
Contributor

For all these tests, you need to check too that out is actually returned, i.e., have something like

result = np.fft.fft(x, out=out)
assert result is out
assert_array_equal(fft1(x), result)

I also replaced it with assert_array_equal since, hopefully, fft code is reproducible on a given machine!
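
A self-contained version of such a test might look like the sketch below. It assumes a NumPy build whose `np.fft.fft` accepts `out`; since that depends on the installed version, the hypothetical helper `fft_supports_out` probes for it first. `check_fft_out` is also a hypothetical name. The sketch uses `assert_allclose` rather than `assert_array_equal` as a conservative choice, since bitwise equality between the two call paths is an assumption it does not need.

```python
import numpy as np


def fft_supports_out():
    """Probe whether this NumPy's np.fft.fft accepts an `out` argument."""
    try:
        np.fft.fft(np.zeros(2, dtype=complex), out=np.empty(2, dtype=complex))
    except TypeError:
        return False
    return True


def check_fft_out(x):
    """Check that `out` is filled and is the very object that is returned."""
    expected = np.fft.fft(x)
    out = np.empty_like(expected)      # matching shape and complex dtype
    result = np.fft.fft(x, out=out)
    assert result is out               # identity, not just equal values
    np.testing.assert_allclose(result, expected)
    return True


rng = np.random.default_rng(0)
x = rng.random((30, 20)) + 1j * rng.random((30, 20))
if fft_supports_out():
    check_fft_out(x)
```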

# This extra copy is unfortunately needed if we want `out`
# to retain its original shape while having the correct values.
copyto(out, r)
r = out
Contributor

This is contrary to the regular behaviour of out - you must ensure out stays the same object with the same memory layout. Note that in principle, that will be automatic -- if you swapped the axis above, then the swapped case is a view of the original out, so data will just be written in there. I.e., it should be possible to remove this stanza (as in the suggestion I gave above).

@mhvk
Contributor

mhvk commented Jan 4, 2024

@serge-sans-paille - As it seemed hard to get it right without the iterator, I went ahead and tried calling pocketfft from ufuncs. See #25536. I hope to add your tests soon.

@mreineck
Contributor

Just a small comment: pocketfft could even deal with the situation where the input array and out are the same array (i.e. pointing to the same memory, same shape and strides). This could be pretty useful in some situations, reducing memory consumption and avoiding copying, especially in multi-D transforms.

@mreineck
Contributor

Sorry, my last comment was confusing, since I was thinking about scipy's variant of pocketfft, not numpy's.

Still, re-using the input array as output should be doable with not too much effort. Not sure whether this is worth it ... it probably depends on the long-term plans for numpy.fft and scipy.fft.

@mhvk
Contributor

mhvk commented Jan 15, 2024

@mreineck - since the C version of pocketfft does the FT in-place, this PR with its out parameter will allow making use of that. I have been wondering whether we should not switch to the C++ version as well, mostly to be able to support float32, but that's for another PR!

@mhvk
Contributor

mhvk commented Jan 15, 2024

@mreineck - sorry, I answered in a PR that makes things less obvious - this one is really superseded by #25536. In fact, let me close this one to avoid further confusion.

@mhvk mhvk closed this Jan 15, 2024