Commit 62f0650

gh-95781: More strict format string checking in PyUnicode_FromFormatV() (GH-95784)
An unrecognized format character in PyUnicode_FromFormat() and PyUnicode_FromFormatV() now sets a SystemError. In previous versions it caused all the rest of the format string to be copied as-is to the result string, and any extra arguments discarded.
1 parent 63140b4 commit 62f0650

5 files changed: 35 additions and 39 deletions

Doc/c-api/unicode.rst

Lines changed: 5 additions & 3 deletions
@@ -477,9 +477,6 @@ APIs:
    |                   |                     | :c:func:`PyObject_Repr`.         |
    +-------------------+---------------------+----------------------------------+

-   An unrecognized format character causes all the rest of the format string to be
-   copied as-is to the result string, and any extra arguments discarded.
-
    .. note::
       The width formatter unit is number of characters rather than bytes.
       The precision formatter unit is number of bytes for ``"%s"`` and
@@ -500,6 +497,11 @@ APIs:
       Support width and precision formatter for ``"%s"``, ``"%A"``, ``"%U"``,
       ``"%V"``, ``"%S"``, ``"%R"`` added.

+   .. versionchanged:: 3.12
+      An unrecognized format character now sets a :exc:`SystemError`.
+      In previous versions it caused all the rest of the format string to be
+      copied as-is to the result string, and any extra arguments discarded.
+

 .. c:function:: PyObject* PyUnicode_FromFormatV(const char *format, va_list vargs)

Doc/whatsnew/3.12.rst

Lines changed: 6 additions & 0 deletions
@@ -469,6 +469,12 @@ Porting to Python 3.12
   :py:meth:`~class.__subclasses__` (using :c:func:`PyObject_CallMethod`,
   for example).

+* An unrecognized format character in :c:func:`PyUnicode_FromFormat` and
+  :c:func:`PyUnicode_FromFormatV` now sets a :exc:`SystemError`.
+  In previous versions it caused all the rest of the format string to be
+  copied as-is to the result string, and any extra arguments discarded.
+  (Contributed by Serhiy Storchaka in :gh:`95781`.)
+

 Deprecated
 ----------

Lib/test/test_unicode.py

Lines changed: 10 additions & 13 deletions
@@ -2641,8 +2641,6 @@ def check_format(expected, format, *args):
                      b'%c%c', c_int(0x10000), c_int(0x100000))

         # test "%"
-        check_format('%',
-                     b'%')
         check_format('%',
                      b'%%')
         check_format('%s',
@@ -2819,23 +2817,22 @@ def check_format(expected, format, *args):
         check_format('repr=abc\ufffd',
                      b'repr=%V', None, b'abc\xff')

-        # not supported: copy the raw format string. these tests are just here
-        # to check for crashes and should not be considered as specifications
-        check_format('%s',
-                     b'%1%s', b'abc')
-        check_format('%1abc',
-                     b'%1abc')
-        check_format('%+i',
-                     b'%+i', c_int(10))
-        check_format('%.%s',
-                     b'%.%s', b'abc')
-
         # Issue #33817: empty strings
         check_format('',
                      b'')
         check_format('',
                      b'%s', b'')

+        # check for crashes
+        for fmt in (b'%', b'%0', b'%01', b'%.', b'%.1',
+                    b'%0%s', b'%1%s', b'%.%s', b'%.1%s', b'%1abc',
+                    b'%l', b'%ll', b'%z', b'%ls', b'%lls', b'%zs'):
+            with self.subTest(fmt=fmt):
+                self.assertRaisesRegex(SystemError, 'invalid format string',
+                    PyUnicode_FromFormat, fmt, b'abc')
+        self.assertRaisesRegex(SystemError, 'invalid format string',
+            PyUnicode_FromFormat, b'%+i', c_int(10))
+
     # Test PyUnicode_AsWideChar()
     @support.cpython_only
     @unittest.skipIf(_testcapi is None, 'need _testcapi module')
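
The updated test calls PyUnicode_FromFormat through ctypes. A minimal standalone sketch of that technique follows, assuming a CPython interpreter where ctypes.pythonapi is available. The helper name try_format is hypothetical, and only argument-free format strings are passed so that no variadic arguments cross the ctypes boundary. With this change applied (3.12 and later) the second call should report the SystemError; on earlier versions it should return the raw format text instead.

```python
import ctypes

# Reach the C function the same way the test module does.
_from_format = ctypes.pythonapi.PyUnicode_FromFormat
_from_format.restype = ctypes.py_object
# Only the fixed "format" parameter is declared; this sketch never
# passes variadic arguments.
_from_format.argtypes = (ctypes.c_char_p,)


def try_format(fmt: bytes):
    """Return ("ok", text) or ("error", message) for an argument-free format.

    try_format is a hypothetical helper for illustration only.
    """
    try:
        return ("ok", _from_format(fmt))
    except SystemError as exc:
        # ctypes.pythonapi re-raises the exception set by the C function.
        return ("error", str(exc))


print(try_format(b"100%%"))   # "%%" is a literal percent sign in every version
print(try_format(b"%1abc"))   # unrecognized format character 'a'
```

Code that must run on both sides of the change should not rely on either outcome for malformed format strings; the safer fix is to correct the format string itself.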
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+An unrecognized format character in :c:func:`PyUnicode_FromFormat` and
+:c:func:`PyUnicode_FromFormatV` now sets a :exc:`SystemError`.
+In previous versions it caused all the rest of the format string to be
+copied as-is to the result string, and any extra arguments discarded.

Objects/unicodeobject.c

Lines changed: 10 additions & 23 deletions
@@ -2355,6 +2355,13 @@ unicode_fromformat_arg(_PyUnicodeWriter *writer,

     p = f;
     f++;
+    if (*f == '%') {
+        if (_PyUnicodeWriter_WriteCharInline(writer, '%') < 0)
+            return NULL;
+        f++;
+        return f;
+    }
+
     zeropad = 0;
     if (*f == '0') {
         zeropad = 1;
@@ -2392,14 +2399,6 @@ unicode_fromformat_arg(_PyUnicodeWriter *writer,
                 f++;
             }
         }
-        if (*f == '%') {
-            /* "%.3%s" => f points to "3" */
-            f--;
-        }
-    }
-    if (*f == '\0') {
-        /* bogus format "%.123" => go backward, f points to "3" */
-        f--;
     }

     /* Handle %ld, %lu, %lld and %llu. */
@@ -2423,7 +2422,7 @@ unicode_fromformat_arg(_PyUnicodeWriter *writer,
         ++f;
     }

-    if (f[1] == '\0')
+    if (f[0] != '\0' && f[1] == '\0')
         writer->overallocate = 0;

     switch (*f) {
@@ -2616,21 +2615,9 @@ unicode_fromformat_arg(_PyUnicodeWriter *writer,
         break;
     }

-    case '%':
-        if (_PyUnicodeWriter_WriteCharInline(writer, '%') < 0)
-            return NULL;
-        break;
-
     default:
-        /* if we stumble upon an unknown formatting code, copy the rest
-           of the format string to the output string. (we cannot just
-           skip the code, since there's no way to know what's in the
-           argument list) */
-        len = strlen(p);
-        if (_PyUnicodeWriter_WriteLatin1String(writer, p, len) == -1)
-            return NULL;
-        f = p+len;
-        return f;
+        PyErr_Format(PyExc_SystemError, "invalid format string: %s", p);
+        return NULL;
     }

     f++;
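
The control flow after this change can be modelled in pure Python. This is a simplified sketch, not the C implementation: the name split_specs and both conversion tables are assumptions made for illustration, the conversion set is an illustrative subset, "*" widths and precisions are not modelled, and no arguments are consumed. It shows the two points the diff makes: "%%" is only recognised immediately after the first "%", and anything that does not end in a known conversion raises SystemError instead of being copied to the output.

```python
# Illustrative subset of conversion characters; the real table lives in C.
CONVERSIONS = b"cdiusAUVSR"
# Size modifiers are only accepted in front of these integer conversions.
INTEGER_CONVERSIONS = b"diu"


def split_specs(fmt: bytes) -> list[bytes]:
    """Return the conversion specs in fmt, or raise SystemError on a bad one."""
    specs = []
    i = 0
    while i < len(fmt):
        if fmt[i:i + 1] != b"%":
            i += 1  # literal byte, nothing to check
            continue
        start = i
        i += 1
        # "%%" is recognised right after the first '%', before any width
        # or precision, so "%0%s" and "%.%s" are rejected further down.
        if fmt[i:i + 1] == b"%":
            i += 1
            continue
        # Optional zero-pad flag and width (both are just digits here).
        while fmt[i:i + 1].isdigit():
            i += 1
        # Optional ".precision".
        if fmt[i:i + 1] == b".":
            i += 1
            while fmt[i:i + 1].isdigit():
                i += 1
        # Optional size modifier: "ll", "l" or "z".
        allowed = CONVERSIONS
        if fmt[i:i + 2] == b"ll":
            i += 2
            allowed = INTEGER_CONVERSIONS
        elif fmt[i:i + 1] in (b"l", b"z"):
            i += 1
            allowed = INTEGER_CONVERSIONS
        conv = fmt[i:i + 1]
        # An empty conv means the string ended mid-spec (e.g. b"%.1").
        if not conv or conv not in allowed:
            tail = fmt[start:].decode("latin-1")
            raise SystemError(f"invalid format string: {tail}")
        i += 1
        specs.append(fmt[start:i])
    return specs
```

Every format string in the new test loop (for example b'%', b'%.1%s', b'%ls') and b'%+i' is rejected by this model, while a string such as b"100%% done: %s %5d" yields the specs b"%s" and b"%5d".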
