ENH: Add support for flat indexing on flat iterator #27343

Merged
merged 4 commits into from
Jan 29, 2025

lysnikolaou
Member

Closes #27342.

Obviously a bit of discussion might be needed before this gets accepted. I'm opening the PR as a means to show that the implementation only requires a minimal amount of changes.

@lysnikolaou lysnikolaou force-pushed the flatindex-on-flatiter branch from 8fd41d0 to 76b1b65 Compare September 4, 2024 14:38
@jorenham
Member

jorenham commented Sep 4, 2024

For static type-checking purposes, it would help if the typing stubs also reflected this change.

@seberg
Member

seberg commented Sep 5, 2024

I think this is fine, but it should work by converting to an array via normal asarray() logic? I.e. flat defines __array__().
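
As an illustration of the conversion mentioned above, here is a minimal sketch, assuming NumPy is installed. The array `a` and the variable name `as_array` are made up for the example; the only point shown is that `np.asarray()` accepts a flat iterator and yields a 1-D array of its elements.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# a.flat is a numpy.flatiter. Because it defines __array__(), the usual
# asarray() conversion turns it into a 1-D array of the elements in C order.
as_array = np.asarray(a.flat)

print(type(a.flat).__name__)
print(as_array.shape)
print(as_array.tolist())
```

An index handled through this generic path would not need a flatiter-specific branch, which is the direction suggested in the comment above.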

@jorenham
Member

jorenham commented Sep 5, 2024

@seberg The stubs currently require it to be an integer Sequence or an integer ndarray (for multi-indexing), i.e. SupportsArray isn't included 🤔

@seberg
Member

seberg commented Sep 5, 2024

I am saying that we shouldn't implement this for arr.flat specifically; if we do this, we should implement it for any integer array-like, just like normal indexing.

Contributor

@mhvk mhvk left a comment

Yes, makes sense to solve this!

goto fail;
}

if (PyArray_SIZE(tmp_arr) == 0) {
Contributor

Would be good to explicitly test this branch. Should it run only if obj was not already an ndarray?

@ngoldbaum
Member

I don't have a problem with the extra PyArrayIter_Check in the fast path, that should just be a pointer comparison in the common case I think.

@lysnikolaou
Member Author

lysnikolaou commented Sep 9, 2024

I am saying that we shouldn't implement this for arr.flat specifically; if we do this, we should implement it for any integer array-like, just like normal indexing.

Does this mean that we should not do the if PyArrayIter_Check, and instead try to do PyArray_FROM_O when we fall through the rest of the cases?

@seberg
Member

seberg commented Sep 10, 2024

Something like that. The best thing may be to just call the normal indexing preparation code to ensure we do the same thing. Not sure if that works, but I suspect it will.
It would probably mean we also allow arr.flat[0,], but that seems fine also?

@lysnikolaou
Member Author

Something like that. The best thing may be to just call the normal indexing preparation code to ensure we do the same thing. Not sure if that works, but I suspect it will. It would probably mean we also allow arr.flat[0,], but that seems fine also?

Yeah, that seems like a more involved fix however, since the whole iter_subscript routine will have to be adjusted, right? I'll have a look at it tomorrow.

@seberg
Member

seberg commented Sep 10, 2024

No, not a complete adjustment hopefully, you would call:

prepare_index(PyArrayObject *self, PyObject *index,
and then need to use the result of that. But after it, you don't have to deal with conversion anymore.

@lysnikolaou
Member Author

No, not a complete adjustment hopefully, you would call:

prepare_index(PyArrayObject *self, PyObject *index,

and then need to use the result of that. But after it, you don't have to deal with conversion anymore.

Would we use prepare_index with self->ao where self is the array iter object? That seems wrong since it's more than possible for the array iter object and the underlying array to not have the same dimensionality.
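
To make the mismatch concrete: a flat iterator is always addressed with a single index, while its underlying array can have any number of dimensions. The sketch below uses a hypothetical pure-Python helper, `unravel`, which is not NumPy code; it only mirrors what `np.unravel_index` computes for C-ordered arrays.

```python
def unravel(flat_index, shape):
    """Map a C-order flat index to an N-D index tuple (hypothetical helper)."""
    size = 1
    for dim in shape:
        size *= dim
    if not 0 <= flat_index < size:
        raise IndexError(f"index {flat_index} is out of bounds for size {size}")
    coords = []
    # Peel off the fastest-varying (last) axis first.
    for dim in reversed(shape):
        flat_index, remainder = divmod(flat_index, dim)
        coords.append(remainder)
    return tuple(reversed(coords))


shape = (2, 3)             # the underlying array is 2-D ...
print(unravel(4, shape))   # ... but a single flat index addresses it
```

Index-preparation code written for the 2-D array would expect up to two indices, whereas the iterator only ever takes one, which is the concern raised in the question above.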

@seberg
Member

seberg commented Sep 11, 2024

Hmmm, true, so it might not actually work without changes. It may be that all it needs from the array is the number of dimensions, but I'm not sure.

@lysnikolaou
Member Author

I gave this a try, but I'm too unfamiliar with this part of the codebase, so I didn't make much progress. Maybe it'd be okay for this to go in as-is (after I've added some more tests) and to open a follow-up issue to rework iter_subscript in general?

@lysnikolaou
Member Author

Friendly ping.

@ngoldbaum ngoldbaum added the triage review Issue/PR to be discussed at the next triage meeting label Nov 19, 2024
@seberg
Member

seberg commented Nov 19, 2024

Well, I can see that re-using mapping.c is too much hassle/not worthwhile. But I still think that this can be more generic and just cast to array generally (i.e. be merged with the previous branch).
In prepare_index that happens here:

if (!PyArray_Check(obj)) {

The only reason not to do that is that it would fix currently broken cases like:

  • arr.flat[[True, True]] (incorrectly behaves like arr.flat[[1, 1]])
  • arr.flat[[0.0, 1.0]] (should just fail)

Which means you can convince me that we don't need to merge it, but I think only if we allow doing it in the future (and deprecate/warn about those).
But it is likely just as easy to add the special path afterwards (i.e. if input was a list, warn and force cast).
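
The two broken cases listed above can be pictured with a small pure-Python model. `force_int_cast` and `strict_int_indices` are hypothetical names used only for this sketch; neither is NumPy code. The first mimics a forced integer cast of a list, and the second shows what stricter handling might look like.

```python
import operator


def force_int_cast(index_list):
    """Model of a forced integer cast: bools and floats slip through."""
    return [int(item) for item in index_list]


def strict_int_indices(index_list):
    """Model of stricter handling: only true integers are accepted."""
    result = []
    for item in index_list:
        if isinstance(item, bool):
            # bool is a subclass of int, so it must be rejected explicitly.
            raise TypeError("boolean values are a mask, not integer indices")
        # operator.index() raises TypeError for floats such as 0.0.
        result.append(operator.index(item))
    return result


print(force_int_cast([True, True]))   # same result as [1, 1]
print(force_int_cast([0.0, 1.0]))     # floats are silently truncated
print(strict_int_indices([0, 1]))
```

Under this model, moving from the forced cast to the stricter check changes behavior for existing callers, which is why a warning or deprecation period is mentioned above.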

@mattip
Member

mattip commented Nov 27, 2024

We looked at this in a triage meeting and would like to see this handle more general array types as @seberg says just above. @lysnikolaou what do you think?

@mattip mattip removed the triage review Issue/PR to be discussed at the next triage meeting label Nov 27, 2024
@lysnikolaou
Member Author

Thanks for the update @mattip and sorry for taking so long to reply here. I'll spend some time on this this week to hopefully get it over the finish line.

@lysnikolaou
Member Author

lysnikolaou commented Jan 16, 2025

Hey folks! Sorry for the delay here, but I came back to this today and I just pushed a commit that I think is closer to what we want.

EDIT: Deleted a question that I didn't think through. Answer is clear to me now.

@lysnikolaou
Member Author

Gentle ping on this, in case anyone has some spare cycles for a review.

Member

@seberg seberg left a comment

I'll approve in the grand scheme of things. I haven't quite thought about the non-integer madness here.
(In a sense, I am not sure I think the size == 0 check matters, since we already accept the empty list and that is what I would worry most about. But it also seems right, so...)

And yeah, merging with the above only works if we start fixing the fact that arr.flat[[False, True, False]] returns the wrong thing. Which we absolutely should do, of course.

@ngoldbaum
Member

Thanks @lysnikolaou! Could you open the followup issue to track further improvements to flat when you have a chance?

@ngoldbaum ngoldbaum merged commit 760dbe9 into numpy:main Jan 29, 2025
67 checks passed
Successfully merging this pull request may close these issues.

ENH: Allow for a flatiter index on a flatiter object
6 participants