FIX: np.insert fails with datetime64 and string input combination #29339 #29408


Open. AnkitAhlawat7742 wants to merge 2 commits into base: main

Conversation

AnkitAhlawat7742

What was the Issue

Issue link: #29339
Bug Description: numpy.insert raised a RuntimeError when inserting an np.datetime64 value wrapped in a zero-dimensional array into a string scalar or string array. This is inconsistent: the same operation with a bare np.datetime64 scalar succeeds, and inserted values are expected to be converted to the string dtype of the destination array.
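
A minimal reproduction sketch of the behavior described above, assuming a NumPy version in which the issue is still present. The helper name `try_insert` is hypothetical and only captures either the result or the raised exception; the exact error message may differ between versions.

```python
import numpy as np


def try_insert(values):
    """Insert `values` at index 0 of the string scalar '5'.

    Returns the resulting array, or the exception if the call fails.
    """
    try:
        return np.insert("5", 0, values)
    except Exception as exc:  # broad on purpose: we only report what happened
        return exc


date = np.datetime64("2025-10-10")

# Bare datetime64 scalar: reported in this thread to succeed (with truncation).
print(repr(try_insert(date)))

# The same value wrapped in a 0-d array: reported to raise RuntimeError.
print(repr(try_insert(np.array(date))))
```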

What This PR Fixes

This pull request fixes the issue for the case when inserting 0-dimensional np.datetime64 arrays into string scalars. It converts such 0-dim datetime64 arrays to their string representation before insertion, preventing the RuntimeError.

Notes

The fix ensures consistent behavior between inserting datetime64 scalars and 0-dim arrays containing datetime64.
This fix applies specifically to insertions into string-type arrays or scalars.
Additional tests have been added to cover these cases.

…in np.insert (numpy#29339)

- Fix np.insert to correctly convert 0-dim np.datetime64 arrays to strings when inserting into string arrays.
 This resolves RuntimeError when inserting datetime64 wrapped in 0-d arrays.
@ngoldbaum
Member

The correct place to fix this is almost certainly a lower-level C function rather than adding a special case to the high-level wrapper.

The RuntimeError is coming from NpyDatetime_MakeISO8601Datetime. I'd start by running the failing call under a C debugger and looking at how that function is being called, to understand where the bug is coming from.

@seberg
Member

seberg commented Jul 23, 2025

I would think that string casts just behave differently from item coercion. FWIW, I am not sure I am convinced we should even make this work. Is it really useful to allow such unclear conversions to begin with?
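
For readers following along, a small sketch of the two routes being contrasted here: item coercion of a datetime64 scalar versus a cast of a 0-d datetime64 array. The width `U4` is an arbitrary choice for illustration, and the cast failure is caught broadly because its exact type and message are an assumption based on this thread.

```python
import numpy as np

date = np.datetime64("2025-10-10")

# Item coercion: the scalar is converted via its string form and cut to fit.
coerced = np.array(date, dtype="U4")
print(repr(coerced))

# Cast: the 0-d datetime64 array is cast to a string dtype that is too short.
try:
    cast = np.array(date).astype("U4")
except Exception as exc:
    cast = exc
print(repr(cast))

# With room to spare, both routes produce the same text.
roomy_coerced = np.array(date, dtype="U16")
roomy_cast = np.array(date).astype("U16")
print(roomy_coerced == roomy_cast)
```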

@ngoldbaum ngoldbaum added the 57 - Close? Issues which may be closable unless discussion continued label Jul 23, 2025
@melissawm melissawm moved this to Pending authors' response in NumPy first-time contributor PRs Jul 23, 2025
@ngoldbaum
Member

> FWIW, I am not sure I am convinced we should even make this work. Is it really useful to allow such unclear conversions to begin with?

Apparently I made it work when I added support for StringDType. The issue has a test script that illustrates this.

It'd be nice if there wasn't this weird inconsistency between StringDType and the fixed-width string dtypes.

@seberg
Member

seberg commented Jul 23, 2025

> It'd be nice if there wasn't this weird inconsistency between StringDType and the fixed-width string dtypes.

I am not sure I am following. I thought what is triggering this is that one path for the old string dtype fails because the date can't be correctly inserted (it gets truncated), while the scalar assignment doesn't.
(Probably because the scalar assignment does exactly what this does.)

But, I am not sure why that matters for StringDType. StringDType doesn't have a length limitation, so it always succeeds and that is OK.
I.e. the non-StringDType code paths behave exactly the same in NumPy 1.24.

@ngoldbaum
Member

> I am not sure I am following. I thought what is triggering this is that one path for the old string dtype fails because the date can't be correctly inserted (it gets truncated), while the scalar assignment doesn't.
> (Probably because the scalar assignment does exactly what this does.)

Ah, that makes more sense. I hadn't understood the subtlety that there was a truncation issue going on.

I agree then, there might not be anything to fix.

@AnkitAhlawat7742
Author

Hi @ngoldbaum and @seberg,

Thank you for your feedback, and I apologize if my previous changes introduced any ambiguity. Let me clarify the core issue:

  • There is no problem with StringDType (works as expected).
  • The issue occurs specifically when inserting an np.datetime64 value wrapped in a 0D array into a string scalar. However, the same operation works fine with an unwrapped np.datetime64 scalar.

Even with an unwrapped np.datetime64, though, the result is problematic because of truncation:

>>> np.insert('5', 0, np.datetime64('2025-10-10'))
array(['2', '5'], dtype='<U1')

My initial approach was to make 0D arrays behave like scalars (to fix the inconsistency), but I understand the concern about whether this should be handled at the C level or avoided entirely.

Is this logic acceptable as it is, or would it be better to move it to the C layer (e.g., NpyDatetime_MakeISO8601Datetime)? Or is this behavior inherently undesirable?

Please let me know: if we do fix it, should we also address the truncation issue, or deprecate such conversions altogether?

@Kairoven

> I am not sure I am following. I thought what is triggering this is that one path for the old string dtype fails because the date can't be correctly inserted (it gets truncated), while the scalar assignment doesn't.
> (Probably because the scalar assignment does exactly what this does.)
>
> But, I am not sure why that matters for StringDType. StringDType doesn't have a length limitation, so it always succeeds and that is OK.
> I.e. the non-StringDType code paths behave exactly the same in NumPy 1.24.

Yes, exactly — since StringDType is not fixed-width, it’s expected that it doesn’t raise any issues here. The problem seems to lie in how truncation happens when using the str dtype.

As @AnkitAhlawat7742 pointed out, the issue occurs specifically when inserting a 0D np.datetime64 array into a string scalar — this fails, while inserting an unwrapped np.datetime64 scalar works but leads to silent truncation (e.g., array(['2', '5'], dtype='<U1')).
It would be great if this could either be fixed or clarified as expected behavior.

@seberg
Member

seberg commented Jul 24, 2025

My point is that I generally don't like the truncating behavior much, and I don't think making it work more often serves anyone. The error usually seems like the better outcome. This is a rare code path and data is lost, so forcing users to jump through hoops and do the cast themselves is good.
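
One way a user might do that cast explicitly is sketched below. The helper name `insert_datetime_as_text` is hypothetical and not part of NumPy; it converts the datetime to text first and widens the destination array so that nothing is truncated.

```python
import numpy as np


def insert_datetime_as_text(arr, index, dt):
    """Insert a datetime64 value into a fixed-width string array as text.

    Hypothetical helper: converts explicitly and widens `arr` as needed,
    so neither the date nor the existing entries are truncated.
    """
    text = str(np.datetime_as_string(dt))
    # Pick a string dtype wide enough for both the array and the new text.
    wide = np.result_type(arr.dtype, np.array(text).dtype)
    return np.insert(arr.astype(wide), index, text)


arr = np.array(["5"])                         # dtype <U1
date = np.array(np.datetime64("2025-10-10"))  # 0-d datetime64 array
result = insert_datetime_as_text(arr, 0, date)
print(repr(result))
```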

I would be fine with deprecating the scalar assignment to strings here. That is a discussion we should probably have, but it is likely a bigger one, because it is what we do with most scalar assignments to strings right now.
However, I would be happy to entertain a PR that deprecates string truncation on scalar assignment.
(Another subtlety is that string->string casts also truncate silently, although that can be prevented with "safe" casting.)

Mainly, I am not convinced about moving towards being more forgiving here when being forgiving looks like a bug magnet.
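
For reference, a brief sketch of the truncation behaviors mentioned in this comment, using plain fixed-width string arrays. The example strings and widths are arbitrary choices for illustration.

```python
import numpy as np

# Scalar assignment into a fixed-width string array truncates silently.
a = np.array(["5"])          # dtype <U1
a[0] = "hello"
print(a)                     # ['h']

# A string -> string cast to a narrower width truncates by default ...
wide = np.array(["hello"])   # dtype <U5
narrow = wide.astype("U2")
print(narrow)                # ['he']

# ... but casting="safe" refuses the narrowing cast instead.
try:
    wide.astype("U2", casting="safe")
    safe_error = None
except TypeError as exc:
    safe_error = exc
print(type(safe_error).__name__)  # TypeError
```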

Labels
57 - Close? Issues which may be closable unless discussion continued
Projects
Status: Pending authors' response

4 participants