From 273e15fee950a71842eae45173c3fec480b41f54 Mon Sep 17 00:00:00 2001 From: Adam Goldschmidt Date: Mon, 15 Feb 2021 00:41:57 +0200 Subject: [PATCH 01/11] bpo-42967: only use '&' as a query string separator (#24297) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit bpo-42967: [security] Address a web cache-poisoning issue reported in urllib.parse.parse_qsl(). urllib.parse will only us "&" as query string separator by default instead of both ";" and "&" as allowed in earlier versions. An optional argument seperator with default value "&" is added to specify the separator. Co-authored-by: Éric Araujo Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com> Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com> Co-authored-by: Éric Araujo (cherry picked from commit fcbe0cb04d35189401c0c880ebfb4311e952d776) --- Doc/library/cgi.rst | 9 ++- Doc/library/urllib.parse.rst | 16 ++++- Doc/whatsnew/3.6.rst | 13 ++++ Doc/whatsnew/3.7.rst | 13 ++++ Doc/whatsnew/3.8.rst | 13 ++++ Doc/whatsnew/3.9.rst | 13 ++++ Lib/cgi.py | 23 ++++--- Lib/test/test_cgi.py | 29 ++++++-- Lib/test/test_urlparse.py | 68 +++++++++++++------ Lib/urllib/parse.py | 20 ++++-- .../2021-02-14-15-59-16.bpo-42967.YApqDS.rst | 1 + 11 files changed, 172 insertions(+), 46 deletions(-) create mode 100644 Misc/NEWS.d/next/Security/2021-02-14-15-59-16.bpo-42967.YApqDS.rst diff --git a/Doc/library/cgi.rst b/Doc/library/cgi.rst index 4048592e7361f7..05d9cdf424073f 100644 --- a/Doc/library/cgi.rst +++ b/Doc/library/cgi.rst @@ -277,14 +277,14 @@ These are useful if you want more control, or if you want to employ some of the algorithms implemented in this module in other circumstances. -.. function:: parse(fp=None, environ=os.environ, keep_blank_values=False, strict_parsing=False) +.. function:: parse(fp=None, environ=os.environ, keep_blank_values=False, strict_parsing=False, separator="&") Parse a query in the environment or from a file (the file defaults to - ``sys.stdin``). The *keep_blank_values* and *strict_parsing* parameters are + ``sys.stdin``). The *keep_blank_values*, *strict_parsing* and *separator* parameters are passed to :func:`urllib.parse.parse_qs` unchanged. -.. function:: parse_multipart(fp, pdict, encoding="utf-8", errors="replace") +.. function:: parse_multipart(fp, pdict, encoding="utf-8", errors="replace", separator="&") Parse input of type :mimetype:`multipart/form-data` (for file uploads). Arguments are *fp* for the input file, *pdict* for a dictionary containing @@ -303,6 +303,9 @@ algorithms implemented in this module in other circumstances. Added the *encoding* and *errors* parameters. For non-file fields, the value is now a list of strings, not bytes. + .. versionchanged:: 3.10 + Added the *separator* parameter. + .. function:: parse_header(string) diff --git a/Doc/library/urllib.parse.rst b/Doc/library/urllib.parse.rst index 536cf952bda434..0e520371320f9d 100644 --- a/Doc/library/urllib.parse.rst +++ b/Doc/library/urllib.parse.rst @@ -165,7 +165,7 @@ or on combining URL components into a URL string. now raise :exc:`ValueError`. -.. function:: parse_qs(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace', max_num_fields=None) +.. function:: parse_qs(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace', max_num_fields=None, separator='&') Parse a query string given as a string argument (data of type :mimetype:`application/x-www-form-urlencoded`). Data are returned as a @@ -190,6 +190,8 @@ or on combining URL components into a URL string. read. If set, then throws a :exc:`ValueError` if there are more than *max_num_fields* fields read. + The optional argument *separator* is the symbol to use for separating the query arguments. It defaults to `&`. + Use the :func:`urllib.parse.urlencode` function (with the ``doseq`` parameter set to ``True``) to convert such dictionaries into query strings. @@ -201,8 +203,12 @@ or on combining URL components into a URL string. .. versionchanged:: 3.8 Added *max_num_fields* parameter. + .. versionchanged:: 3.10 + Added *separator* parameter with the default value of `&`. Python versions earlier than Python 3.10 allowed using both ";" and "&" as + query parameter separator. This has been changed to allow only a single separator key, with "&" as the default separator. + -.. function:: parse_qsl(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace', max_num_fields=None) +.. function:: parse_qsl(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace', max_num_fields=None, separator='&') Parse a query string given as a string argument (data of type :mimetype:`application/x-www-form-urlencoded`). Data are returned as a list of @@ -226,6 +232,8 @@ or on combining URL components into a URL string. read. If set, then throws a :exc:`ValueError` if there are more than *max_num_fields* fields read. + The optional argument *separator* is the symbol to use for separating the query arguments. It defaults to `&`. + Use the :func:`urllib.parse.urlencode` function to convert such lists of pairs into query strings. @@ -235,6 +243,10 @@ or on combining URL components into a URL string. .. versionchanged:: 3.8 Added *max_num_fields* parameter. + .. versionchanged:: 3.10 + Added *separator* parameter with the default value of `&`. Python versions earlier than Python 3.10 allowed using both ";" and "&" as + query parameter separator. This has been changed to allow only a single separator key, with "&" as the default separator. + .. function:: urlunparse(parts) diff --git a/Doc/whatsnew/3.6.rst b/Doc/whatsnew/3.6.rst index 85a6657fdfbdac..8a64da1b249d7d 100644 --- a/Doc/whatsnew/3.6.rst +++ b/Doc/whatsnew/3.6.rst @@ -2443,3 +2443,16 @@ because of the behavior of the socket option ``SO_REUSEADDR`` in UDP. For more details, see the documentation for ``loop.create_datagram_endpoint()``. (Contributed by Kyle Stanley, Antoine Pitrou, and Yury Selivanov in :issue:`37228`.) + +Notable changes in Python 3.6.13 +================================ + +Earlier Python versions allowed using both ";" and "&" as +query parameter separators in :func:`urllib.parse.parse_qs` and +:func:`urllib.parse.parse_qsl`. Due to security concerns, and to conform with +newer W3C recommendations, this has been changed to allow only a single +separator key, with "&" as the default. This change also affects +:func:`cgi.parse` and :func:`cgi.parse_multipart` as they use the affected +functions internally. For more details, please see their respective +documentation. +(Contributed by Adam Goldschmidt, Senthil Kumaran and Ken Jin in :issue:`42967`.) diff --git a/Doc/whatsnew/3.7.rst b/Doc/whatsnew/3.7.rst index 25b1e1e33e325c..9204cc7fbf8c47 100644 --- a/Doc/whatsnew/3.7.rst +++ b/Doc/whatsnew/3.7.rst @@ -2556,3 +2556,16 @@ because of the behavior of the socket option ``SO_REUSEADDR`` in UDP. For more details, see the documentation for ``loop.create_datagram_endpoint()``. (Contributed by Kyle Stanley, Antoine Pitrou, and Yury Selivanov in :issue:`37228`.) + +Notable changes in Python 3.7.10 +================================ + +Earlier Python versions allowed using both ``;`` and ``&`` as +query parameter separators in :func:`urllib.parse.parse_qs` and +:func:`urllib.parse.parse_qsl`. Due to security concerns, and to conform with +newer W3C recommendations, this has been changed to allow only a single +separator key, with ``&`` as the default. This change also affects +:func:`cgi.parse` and :func:`cgi.parse_multipart` as they use the affected +functions internally. For more details, please see their respective +documentation. +(Contributed by Adam Goldschmidt, Senthil Kumaran and Ken Jin in :issue:`42967`.) diff --git a/Doc/whatsnew/3.8.rst b/Doc/whatsnew/3.8.rst index 0b4820f3333e13..d21921d3dd51e7 100644 --- a/Doc/whatsnew/3.8.rst +++ b/Doc/whatsnew/3.8.rst @@ -2234,3 +2234,16 @@ because of the behavior of the socket option ``SO_REUSEADDR`` in UDP. For more details, see the documentation for ``loop.create_datagram_endpoint()``. (Contributed by Kyle Stanley, Antoine Pitrou, and Yury Selivanov in :issue:`37228`.) + +Notable changes in Python 3.8.8 +=============================== + +Earlier Python versions allowed using both ";" and "&" as +query parameter separators in :func:`urllib.parse.parse_qs` and +:func:`urllib.parse.parse_qsl`. Due to security concerns, and to conform with +newer W3C recommendations, this has been changed to allow only a single +separator key, with "&" as the default. This change also affects +:func:`cgi.parse` and :func:`cgi.parse_multipart` as they use the affected +functions internally. For more details, please see their respective +documentation. +(Contributed by Adam Goldschmidt, Senthil Kumaran and Ken Jin in :issue:`42967`.) \ No newline at end of file diff --git a/Doc/whatsnew/3.9.rst b/Doc/whatsnew/3.9.rst index 68b1e504da89ef..5f4f8ba211b180 100644 --- a/Doc/whatsnew/3.9.rst +++ b/Doc/whatsnew/3.9.rst @@ -1516,3 +1516,16 @@ invalid forms of parameterizing :class:`collections.abc.Callable` which may have passed silently in Python 3.9.1. This :exc:`DeprecationWarning` will become a :exc:`TypeError` in Python 3.10. (Contributed by Ken Jin in :issue:`42195`.) + +urllib.parse +------------ + +Earlier Python versions allowed using both ";" and "&" as +query parameter separators in :func:`urllib.parse.parse_qs` and +:func:`urllib.parse.parse_qsl`. Due to security concerns, and to conform with +newer W3C recommendations, this has been changed to allow only a single +separator key, with "&" as the default. This change also affects +:func:`cgi.parse` and :func:`cgi.parse_multipart` as they use the affected +functions internally. For more details, please see their respective +documentation. +(Contributed by Adam Goldschmidt, Senthil Kumaran and Ken Jin in :issue:`42967`.) diff --git a/Lib/cgi.py b/Lib/cgi.py index 77ab703cc03600..1e880e51848af2 100755 --- a/Lib/cgi.py +++ b/Lib/cgi.py @@ -115,7 +115,8 @@ def closelog(): # 0 ==> unlimited input maxlen = 0 -def parse(fp=None, environ=os.environ, keep_blank_values=0, strict_parsing=0): +def parse(fp=None, environ=os.environ, keep_blank_values=0, + strict_parsing=0, separator='&'): """Parse a query in the environment or from a file (default stdin) Arguments, all optional: @@ -134,6 +135,9 @@ def parse(fp=None, environ=os.environ, keep_blank_values=0, strict_parsing=0): strict_parsing: flag indicating what to do with parsing errors. If false (the default), errors are silently ignored. If true, errors raise a ValueError exception. + + separator: str. The symbol to use for separating the query arguments. + Defaults to &. """ if fp is None: fp = sys.stdin @@ -154,7 +158,7 @@ def parse(fp=None, environ=os.environ, keep_blank_values=0, strict_parsing=0): if environ['REQUEST_METHOD'] == 'POST': ctype, pdict = parse_header(environ['CONTENT_TYPE']) if ctype == 'multipart/form-data': - return parse_multipart(fp, pdict) + return parse_multipart(fp, pdict, separator=separator) elif ctype == 'application/x-www-form-urlencoded': clength = int(environ['CONTENT_LENGTH']) if maxlen and clength > maxlen: @@ -178,10 +182,10 @@ def parse(fp=None, environ=os.environ, keep_blank_values=0, strict_parsing=0): qs = "" environ['QUERY_STRING'] = qs # XXX Shouldn't, really return urllib.parse.parse_qs(qs, keep_blank_values, strict_parsing, - encoding=encoding) + encoding=encoding, separator=separator) -def parse_multipart(fp, pdict, encoding="utf-8", errors="replace"): +def parse_multipart(fp, pdict, encoding="utf-8", errors="replace", separator='&'): """Parse multipart input. Arguments: @@ -205,7 +209,7 @@ def parse_multipart(fp, pdict, encoding="utf-8", errors="replace"): except KeyError: pass fs = FieldStorage(fp, headers=headers, encoding=encoding, errors=errors, - environ={'REQUEST_METHOD': 'POST'}) + environ={'REQUEST_METHOD': 'POST'}, separator=separator) return {k: fs.getlist(k) for k in fs} def _parseparam(s): @@ -315,7 +319,7 @@ class FieldStorage: def __init__(self, fp=None, headers=None, outerboundary=b'', environ=os.environ, keep_blank_values=0, strict_parsing=0, limit=None, encoding='utf-8', errors='replace', - max_num_fields=None): + max_num_fields=None, separator='&'): """Constructor. Read multipart/* until last part. Arguments, all optional: @@ -363,6 +367,7 @@ def __init__(self, fp=None, headers=None, outerboundary=b'', self.keep_blank_values = keep_blank_values self.strict_parsing = strict_parsing self.max_num_fields = max_num_fields + self.separator = separator if 'REQUEST_METHOD' in environ: method = environ['REQUEST_METHOD'].upper() self.qs_on_post = None @@ -589,7 +594,7 @@ def read_urlencoded(self): query = urllib.parse.parse_qsl( qs, self.keep_blank_values, self.strict_parsing, encoding=self.encoding, errors=self.errors, - max_num_fields=self.max_num_fields) + max_num_fields=self.max_num_fields, separator=self.separator) self.list = [MiniFieldStorage(key, value) for key, value in query] self.skip_lines() @@ -605,7 +610,7 @@ def read_multi(self, environ, keep_blank_values, strict_parsing): query = urllib.parse.parse_qsl( self.qs_on_post, self.keep_blank_values, self.strict_parsing, encoding=self.encoding, errors=self.errors, - max_num_fields=self.max_num_fields) + max_num_fields=self.max_num_fields, separator=self.separator) self.list.extend(MiniFieldStorage(key, value) for key, value in query) klass = self.FieldStorageClass or self.__class__ @@ -649,7 +654,7 @@ def read_multi(self, environ, keep_blank_values, strict_parsing): else self.limit - self.bytes_read part = klass(self.fp, headers, ib, environ, keep_blank_values, strict_parsing, limit, - self.encoding, self.errors, max_num_fields) + self.encoding, self.errors, max_num_fields, self.separator) if max_num_fields is not None: max_num_fields -= 1 diff --git a/Lib/test/test_cgi.py b/Lib/test/test_cgi.py index 101942de947fb4..4e1506a6468b93 100644 --- a/Lib/test/test_cgi.py +++ b/Lib/test/test_cgi.py @@ -53,12 +53,9 @@ def do_test(buf, method): ("", ValueError("bad query field: ''")), ("&", ValueError("bad query field: ''")), ("&&", ValueError("bad query field: ''")), - (";", ValueError("bad query field: ''")), - (";&;", ValueError("bad query field: ''")), # Should the next few really be valid? ("=", {}), ("=&=", {}), - ("=;=", {}), # This rest seem to make sense ("=a", {'': ['a']}), ("&=a", ValueError("bad query field: ''")), @@ -73,8 +70,6 @@ def do_test(buf, method): ("a=a+b&b=b+c", {'a': ['a b'], 'b': ['b c']}), ("a=a+b&a=b+a", {'a': ['a b', 'b a']}), ("x=1&y=2.0&z=2-3.%2b0", {'x': ['1'], 'y': ['2.0'], 'z': ['2-3.+0']}), - ("x=1;y=2.0&z=2-3.%2b0", {'x': ['1'], 'y': ['2.0'], 'z': ['2-3.+0']}), - ("x=1;y=2.0;z=2-3.%2b0", {'x': ['1'], 'y': ['2.0'], 'z': ['2-3.+0']}), ("Hbc5161168c542333633315dee1182227:key_store_seqid=400006&cuyer=r&view=bustomer&order_id=0bb2e248638833d48cb7fed300000f1b&expire=964546263&lobale=en-US&kid=130003.300038&ss=env", {'Hbc5161168c542333633315dee1182227:key_store_seqid': ['400006'], 'cuyer': ['r'], @@ -201,6 +196,30 @@ def test_strict(self): else: self.assertEqual(fs.getvalue(key), expect_val[0]) + def test_separator(self): + parse_semicolon = [ + ("x=1;y=2.0", {'x': ['1'], 'y': ['2.0']}), + ("x=1;y=2.0;z=2-3.%2b0", {'x': ['1'], 'y': ['2.0'], 'z': ['2-3.+0']}), + (";", ValueError("bad query field: ''")), + (";;", ValueError("bad query field: ''")), + ("=;a", ValueError("bad query field: 'a'")), + (";b=a", ValueError("bad query field: ''")), + ("b;=a", ValueError("bad query field: 'b'")), + ("a=a+b;b=b+c", {'a': ['a b'], 'b': ['b c']}), + ("a=a+b;a=b+a", {'a': ['a b', 'b a']}), + ] + for orig, expect in parse_semicolon: + env = {'QUERY_STRING': orig} + fs = cgi.FieldStorage(separator=';', environ=env) + if isinstance(expect, dict): + for key in expect.keys(): + expect_val = expect[key] + self.assertIn(key, fs) + if len(expect_val) > 1: + self.assertEqual(fs.getvalue(key), expect_val) + else: + self.assertEqual(fs.getvalue(key), expect_val[0]) + def test_log(self): cgi.log("Testing") diff --git a/Lib/test/test_urlparse.py b/Lib/test/test_urlparse.py index 762500789f73ac..3b1c360625b5a6 100644 --- a/Lib/test/test_urlparse.py +++ b/Lib/test/test_urlparse.py @@ -32,16 +32,10 @@ (b"&a=b", [(b'a', b'b')]), (b"a=a+b&b=b+c", [(b'a', b'a b'), (b'b', b'b c')]), (b"a=1&a=2", [(b'a', b'1'), (b'a', b'2')]), - (";", []), - (";;", []), - (";a=b", [('a', 'b')]), - ("a=a+b;b=b+c", [('a', 'a b'), ('b', 'b c')]), - ("a=1;a=2", [('a', '1'), ('a', '2')]), - (b";", []), - (b";;", []), - (b";a=b", [(b'a', b'b')]), - (b"a=a+b;b=b+c", [(b'a', b'a b'), (b'b', b'b c')]), - (b"a=1;a=2", [(b'a', b'1'), (b'a', b'2')]), + (";a=b", [(';a', 'b')]), + ("a=a+b;b=b+c", [('a', 'a b;b=b c')]), + (b";a=b", [(b';a', b'b')]), + (b"a=a+b;b=b+c", [(b'a', b'a b;b=b c')]), ] # Each parse_qs testcase is a two-tuple that contains @@ -68,16 +62,10 @@ (b"&a=b", {b'a': [b'b']}), (b"a=a+b&b=b+c", {b'a': [b'a b'], b'b': [b'b c']}), (b"a=1&a=2", {b'a': [b'1', b'2']}), - (";", {}), - (";;", {}), - (";a=b", {'a': ['b']}), - ("a=a+b;b=b+c", {'a': ['a b'], 'b': ['b c']}), - ("a=1;a=2", {'a': ['1', '2']}), - (b";", {}), - (b";;", {}), - (b";a=b", {b'a': [b'b']}), - (b"a=a+b;b=b+c", {b'a': [b'a b'], b'b': [b'b c']}), - (b"a=1;a=2", {b'a': [b'1', b'2']}), + (";a=b", {';a': ['b']}), + ("a=a+b;b=b+c", {'a': ['a b;b=b c']}), + (b";a=b", {b';a': [b'b']}), + (b"a=a+b;b=b+c", {b'a':[ b'a b;b=b c']}), ] class UrlParseTestCase(unittest.TestCase): @@ -886,10 +874,46 @@ def test_parse_qsl_encoding(self): def test_parse_qsl_max_num_fields(self): with self.assertRaises(ValueError): urllib.parse.parse_qs('&'.join(['a=a']*11), max_num_fields=10) - with self.assertRaises(ValueError): - urllib.parse.parse_qs(';'.join(['a=a']*11), max_num_fields=10) urllib.parse.parse_qs('&'.join(['a=a']*10), max_num_fields=10) + def test_parse_qs_separator(self): + parse_qs_semicolon_cases = [ + (";", {}), + (";;", {}), + (";a=b", {'a': ['b']}), + ("a=a+b;b=b+c", {'a': ['a b'], 'b': ['b c']}), + ("a=1;a=2", {'a': ['1', '2']}), + (b";", {}), + (b";;", {}), + (b";a=b", {b'a': [b'b']}), + (b"a=a+b;b=b+c", {b'a': [b'a b'], b'b': [b'b c']}), + (b"a=1;a=2", {b'a': [b'1', b'2']}), + ] + for orig, expect in parse_qs_semicolon_cases: + with self.subTest(f"Original: {orig!r}, Expected: {expect!r}"): + result = urllib.parse.parse_qs(orig, separator=';') + self.assertEqual(result, expect, "Error parsing %r" % orig) + + + def test_parse_qsl_separator(self): + parse_qsl_semicolon_cases = [ + (";", []), + (";;", []), + (";a=b", [('a', 'b')]), + ("a=a+b;b=b+c", [('a', 'a b'), ('b', 'b c')]), + ("a=1;a=2", [('a', '1'), ('a', '2')]), + (b";", []), + (b";;", []), + (b";a=b", [(b'a', b'b')]), + (b"a=a+b;b=b+c", [(b'a', b'a b'), (b'b', b'b c')]), + (b"a=1;a=2", [(b'a', b'1'), (b'a', b'2')]), + ] + for orig, expect in parse_qsl_semicolon_cases: + with self.subTest(f"Original: {orig!r}, Expected: {expect!r}"): + result = urllib.parse.parse_qsl(orig, separator=';') + self.assertEqual(result, expect, "Error parsing %r" % orig) + + def test_urlencode_sequences(self): # Other tests incidentally urlencode things; test non-covered cases: # Sequence and object values. diff --git a/Lib/urllib/parse.py b/Lib/urllib/parse.py index ea897c3032257b..5bd067895bfa3d 100644 --- a/Lib/urllib/parse.py +++ b/Lib/urllib/parse.py @@ -662,7 +662,7 @@ def unquote(string, encoding='utf-8', errors='replace'): def parse_qs(qs, keep_blank_values=False, strict_parsing=False, - encoding='utf-8', errors='replace', max_num_fields=None): + encoding='utf-8', errors='replace', max_num_fields=None, separator='&'): """Parse a query given as a string argument. Arguments: @@ -686,12 +686,15 @@ def parse_qs(qs, keep_blank_values=False, strict_parsing=False, max_num_fields: int. If set, then throws a ValueError if there are more than n fields read by parse_qsl(). + separator: str. The symbol to use for separating the query arguments. + Defaults to &. + Returns a dictionary. """ parsed_result = {} pairs = parse_qsl(qs, keep_blank_values, strict_parsing, encoding=encoding, errors=errors, - max_num_fields=max_num_fields) + max_num_fields=max_num_fields, separator=separator) for name, value in pairs: if name in parsed_result: parsed_result[name].append(value) @@ -701,7 +704,7 @@ def parse_qs(qs, keep_blank_values=False, strict_parsing=False, def parse_qsl(qs, keep_blank_values=False, strict_parsing=False, - encoding='utf-8', errors='replace', max_num_fields=None): + encoding='utf-8', errors='replace', max_num_fields=None, separator='&'): """Parse a query given as a string argument. Arguments: @@ -724,19 +727,26 @@ def parse_qsl(qs, keep_blank_values=False, strict_parsing=False, max_num_fields: int. If set, then throws a ValueError if there are more than n fields read by parse_qsl(). + separator: str. The symbol to use for separating the query arguments. + Defaults to &. + Returns a list, as G-d intended. """ qs, _coerce_result = _coerce_args(qs) + if not separator or (not isinstance(separator, str) + and not isinstance(separator, bytes)): + raise ValueError("Separator must be of type string or bytes.") + # If max_num_fields is defined then check that the number of fields # is less than max_num_fields. This prevents a memory exhaustion DOS # attack via post bodies with many fields. if max_num_fields is not None: - num_fields = 1 + qs.count('&') + qs.count(';') + num_fields = 1 + qs.count(separator) if max_num_fields < num_fields: raise ValueError('Max number of fields exceeded') - pairs = [s2 for s1 in qs.split('&') for s2 in s1.split(';')] + pairs = [s1 for s1 in qs.split(separator)] r = [] for name_value in pairs: if not name_value and not strict_parsing: diff --git a/Misc/NEWS.d/next/Security/2021-02-14-15-59-16.bpo-42967.YApqDS.rst b/Misc/NEWS.d/next/Security/2021-02-14-15-59-16.bpo-42967.YApqDS.rst new file mode 100644 index 00000000000000..f08489b41494ea --- /dev/null +++ b/Misc/NEWS.d/next/Security/2021-02-14-15-59-16.bpo-42967.YApqDS.rst @@ -0,0 +1 @@ +Fix web cache poisoning vulnerability by defaulting the query args separator to ``&``, and allowing the user to choose a custom separator. From c920ee5a3e124ef08c13213491c4327006505f92 Mon Sep 17 00:00:00 2001 From: Senthil Kumaran Date: Sun, 14 Feb 2021 17:36:37 -0800 Subject: [PATCH 02/11] [3.9] bpo-42967: only use '&' as a query string separator (GH-24297) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit bpo-42967: [security] Address a web cache-poisoning issue reported in urllib.parse.parse_qsl(). urllib.parse will only us "&" as query string separator by default instead of both ";" and "&" as allowed in earlier versions. An optional argument seperator with default value "&" is added to specify the separator. Co-authored-by: Éric Araujo Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com> Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com> Co-authored-by: Éric Araujo . (cherry picked from commit fcbe0cb04d35189401c0c880ebfb4311e952d776) Co-authored-by: Adam Goldschmidt From daef72bd6038a5dbff3eed84e1c817509be2535f Mon Sep 17 00:00:00 2001 From: Senthil Kumaran Date: Mon, 15 Feb 2021 05:10:42 -0800 Subject: [PATCH 03/11] Update version information. --- Doc/library/cgi.rst | 2 +- Doc/library/urllib.parse.rst | 19 ++++++++++++------- 2 files changed, 13 insertions(+), 8 deletions(-) diff --git a/Doc/library/cgi.rst b/Doc/library/cgi.rst index 05d9cdf424073f..e60a3f13595cf0 100644 --- a/Doc/library/cgi.rst +++ b/Doc/library/cgi.rst @@ -303,7 +303,7 @@ algorithms implemented in this module in other circumstances. Added the *encoding* and *errors* parameters. For non-file fields, the value is now a list of strings, not bytes. - .. versionchanged:: 3.10 + .. versionchanged:: 3.9.2 Added the *separator* parameter. diff --git a/Doc/library/urllib.parse.rst b/Doc/library/urllib.parse.rst index 0e520371320f9d..aa992619604dea 100644 --- a/Doc/library/urllib.parse.rst +++ b/Doc/library/urllib.parse.rst @@ -190,7 +190,8 @@ or on combining URL components into a URL string. read. If set, then throws a :exc:`ValueError` if there are more than *max_num_fields* fields read. - The optional argument *separator* is the symbol to use for separating the query arguments. It defaults to `&`. + The optional argument *separator* is the symbol to use for separating the + query arguments. It defaults to `&`. Use the :func:`urllib.parse.urlencode` function (with the ``doseq`` parameter set to ``True``) to convert such dictionaries into query @@ -203,9 +204,11 @@ or on combining URL components into a URL string. .. versionchanged:: 3.8 Added *max_num_fields* parameter. - .. versionchanged:: 3.10 - Added *separator* parameter with the default value of `&`. Python versions earlier than Python 3.10 allowed using both ";" and "&" as - query parameter separator. This has been changed to allow only a single separator key, with "&" as the default separator. + .. versionchanged:: 3.9.2 + Added *separator* parameter with the default value of `&`. Python + versions earlier than Python 3.9.2 allowed using both ";" and "&" as + query parameter separator. This has been changed to allow only a single + separator key, with "&" as the default separator. .. function:: parse_qsl(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace', max_num_fields=None, separator='&') @@ -243,9 +246,11 @@ or on combining URL components into a URL string. .. versionchanged:: 3.8 Added *max_num_fields* parameter. - .. versionchanged:: 3.10 - Added *separator* parameter with the default value of `&`. Python versions earlier than Python 3.10 allowed using both ";" and "&" as - query parameter separator. This has been changed to allow only a single separator key, with "&" as the default separator. + .. versionchanged:: 3.9.2 + Added *separator* parameter with the default value of `&`. Python + versions earlier than Python 3.9.2 allowed using both ";" and "&" as + query parameter separator. This has been changed to allow only a single + separator key, with "&" as the default separator. .. function:: urlunparse(parts) From 95c9ce7aafe237516d3f475535ef990ab50317d6 Mon Sep 17 00:00:00 2001 From: Senthil Kumaran Date: Mon, 15 Feb 2021 08:05:04 -0800 Subject: [PATCH 04/11] Update Doc/library/urllib.parse.rst Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com> --- Doc/library/urllib.parse.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Doc/library/urllib.parse.rst b/Doc/library/urllib.parse.rst index aa992619604dea..462412d2df1d91 100644 --- a/Doc/library/urllib.parse.rst +++ b/Doc/library/urllib.parse.rst @@ -191,7 +191,7 @@ or on combining URL components into a URL string. *max_num_fields* fields read. The optional argument *separator* is the symbol to use for separating the - query arguments. It defaults to `&`. + query arguments. It defaults to ``&``. Use the :func:`urllib.parse.urlencode` function (with the ``doseq`` parameter set to ``True``) to convert such dictionaries into query From 81ce22d0d9509b58219029aedce19f911384584d Mon Sep 17 00:00:00 2001 From: Senthil Kumaran Date: Mon, 15 Feb 2021 08:05:15 -0800 Subject: [PATCH 05/11] Update Doc/library/urllib.parse.rst Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com> --- Doc/library/urllib.parse.rst | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/Doc/library/urllib.parse.rst b/Doc/library/urllib.parse.rst index 462412d2df1d91..8be06252869d7b 100644 --- a/Doc/library/urllib.parse.rst +++ b/Doc/library/urllib.parse.rst @@ -205,10 +205,10 @@ or on combining URL components into a URL string. Added *max_num_fields* parameter. .. versionchanged:: 3.9.2 - Added *separator* parameter with the default value of `&`. Python - versions earlier than Python 3.9.2 allowed using both ";" and "&" as + Added *separator* parameter with the default value of ``&``. Python + versions earlier than Python 3.9.2 allowed using both ``;`` and ``&`` as query parameter separator. This has been changed to allow only a single - separator key, with "&" as the default separator. + separator key, with ``&`` as the default separator. .. function:: parse_qsl(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace', max_num_fields=None, separator='&') From 8d08c605af69f7b94516eb0d8e8d8fbca7d5a206 Mon Sep 17 00:00:00 2001 From: Senthil Kumaran Date: Mon, 15 Feb 2021 08:05:22 -0800 Subject: [PATCH 06/11] Update Doc/library/urllib.parse.rst Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com> --- Doc/library/urllib.parse.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Doc/library/urllib.parse.rst b/Doc/library/urllib.parse.rst index 8be06252869d7b..0e2e992b94da83 100644 --- a/Doc/library/urllib.parse.rst +++ b/Doc/library/urllib.parse.rst @@ -235,7 +235,7 @@ or on combining URL components into a URL string. read. If set, then throws a :exc:`ValueError` if there are more than *max_num_fields* fields read. - The optional argument *separator* is the symbol to use for separating the query arguments. It defaults to `&`. + The optional argument *separator* is the symbol to use for separating the query arguments. It defaults to ``&``. Use the :func:`urllib.parse.urlencode` function to convert such lists of pairs into query strings. From 6b53c442ffb6bfc94156a545f7c5225e07545579 Mon Sep 17 00:00:00 2001 From: Senthil Kumaran Date: Mon, 15 Feb 2021 08:05:29 -0800 Subject: [PATCH 07/11] Update Doc/library/urllib.parse.rst Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com> --- Doc/library/urllib.parse.rst | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/Doc/library/urllib.parse.rst b/Doc/library/urllib.parse.rst index 0e2e992b94da83..38e2986334c807 100644 --- a/Doc/library/urllib.parse.rst +++ b/Doc/library/urllib.parse.rst @@ -247,10 +247,10 @@ or on combining URL components into a URL string. Added *max_num_fields* parameter. .. versionchanged:: 3.9.2 - Added *separator* parameter with the default value of `&`. Python - versions earlier than Python 3.9.2 allowed using both ";" and "&" as + Added *separator* parameter with the default value of ``&``. Python + versions earlier than Python 3.9.2 allowed using both ``;`` and ``&`` as query parameter separator. This has been changed to allow only a single - separator key, with "&" as the default separator. + separator key, with ``&`` as the default separator. .. function:: urlunparse(parts) From 10e4828075b6f986317274c2ea8af29b0c38f026 Mon Sep 17 00:00:00 2001 From: Senthil Kumaran Date: Mon, 15 Feb 2021 08:05:36 -0800 Subject: [PATCH 08/11] Update Doc/whatsnew/3.6.rst Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com> --- Doc/whatsnew/3.6.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Doc/whatsnew/3.6.rst b/Doc/whatsnew/3.6.rst index 8a64da1b249d7d..03a877a3d91785 100644 --- a/Doc/whatsnew/3.6.rst +++ b/Doc/whatsnew/3.6.rst @@ -2447,11 +2447,11 @@ details, see the documentation for ``loop.create_datagram_endpoint()``. Notable changes in Python 3.6.13 ================================ -Earlier Python versions allowed using both ";" and "&" as +Earlier Python versions allowed using both ``;`` and ``&`` as query parameter separators in :func:`urllib.parse.parse_qs` and :func:`urllib.parse.parse_qsl`. Due to security concerns, and to conform with newer W3C recommendations, this has been changed to allow only a single -separator key, with "&" as the default. This change also affects +separator key, with ``&`` as the default. This change also affects :func:`cgi.parse` and :func:`cgi.parse_multipart` as they use the affected functions internally. For more details, please see their respective documentation. From 21a2822d3ad912ffe5088b7884cf66aadd25b566 Mon Sep 17 00:00:00 2001 From: Senthil Kumaran Date: Mon, 15 Feb 2021 08:05:43 -0800 Subject: [PATCH 09/11] Update Doc/whatsnew/3.8.rst Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com> --- Doc/whatsnew/3.8.rst | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/Doc/whatsnew/3.8.rst b/Doc/whatsnew/3.8.rst index d21921d3dd51e7..dbc3875aae61b3 100644 --- a/Doc/whatsnew/3.8.rst +++ b/Doc/whatsnew/3.8.rst @@ -2238,12 +2238,12 @@ details, see the documentation for ``loop.create_datagram_endpoint()``. Notable changes in Python 3.8.8 =============================== -Earlier Python versions allowed using both ";" and "&" as +Earlier Python versions allowed using both ``;`` and ``&`` as query parameter separators in :func:`urllib.parse.parse_qs` and :func:`urllib.parse.parse_qsl`. Due to security concerns, and to conform with newer W3C recommendations, this has been changed to allow only a single -separator key, with "&" as the default. This change also affects +separator key, with ``&`` as the default. This change also affects :func:`cgi.parse` and :func:`cgi.parse_multipart` as they use the affected functions internally. For more details, please see their respective documentation. -(Contributed by Adam Goldschmidt, Senthil Kumaran and Ken Jin in :issue:`42967`.) \ No newline at end of file +(Contributed by Adam Goldschmidt, Senthil Kumaran and Ken Jin in :issue:`42967`.) From 23792f66825dc8a03c336fcf2dbef9eb1b4062df Mon Sep 17 00:00:00 2001 From: Senthil Kumaran Date: Mon, 15 Feb 2021 08:05:50 -0800 Subject: [PATCH 10/11] Update Doc/whatsnew/3.9.rst Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com> --- Doc/whatsnew/3.9.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Doc/whatsnew/3.9.rst b/Doc/whatsnew/3.9.rst index 5f4f8ba211b180..3086930569dc98 100644 --- a/Doc/whatsnew/3.9.rst +++ b/Doc/whatsnew/3.9.rst @@ -1520,11 +1520,11 @@ become a :exc:`TypeError` in Python 3.10. urllib.parse ------------ -Earlier Python versions allowed using both ";" and "&" as +Earlier Python versions allowed using both ``;`` and ``&`` as query parameter separators in :func:`urllib.parse.parse_qs` and :func:`urllib.parse.parse_qsl`. Due to security concerns, and to conform with newer W3C recommendations, this has been changed to allow only a single -separator key, with "&" as the default. This change also affects +separator key, with ``&`` as the default. This change also affects :func:`cgi.parse` and :func:`cgi.parse_multipart` as they use the affected functions internally. For more details, please see their respective documentation. From eb39e4567cc67b8b6f70bb7510a8e96cd41ffa02 Mon Sep 17 00:00:00 2001 From: Senthil Kumaran Date: Mon, 15 Feb 2021 08:07:05 -0800 Subject: [PATCH 11/11] Update Lib/urllib/parse.py Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com> --- Lib/urllib/parse.py | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/Lib/urllib/parse.py b/Lib/urllib/parse.py index 5bd067895bfa3d..335e183498d8bd 100644 --- a/Lib/urllib/parse.py +++ b/Lib/urllib/parse.py @@ -734,8 +734,7 @@ def parse_qsl(qs, keep_blank_values=False, strict_parsing=False, """ qs, _coerce_result = _coerce_args(qs) - if not separator or (not isinstance(separator, str) - and not isinstance(separator, bytes)): + if not separator or (not isinstance(separator, (str, bytes))): raise ValueError("Separator must be of type string or bytes.") # If max_num_fields is defined then check that the number of fields pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy