bpo-35805: Add parser for Message-ID email header. #13397

Merged
merged 3 commits into from
Jun 4, 2019

Conversation

maxking
Contributor

@maxking maxking commented May 17, 2019

This parser is based on the definition of Identification Fields from RFC 5322
Sec 3.6.4.

This should also prevent folding of Message-ID header using RFC 2047 encoded
words and hence fix bpo-35805.

https://bugs.python.org/issue35805
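
For readers unfamiliar with the grammar referenced above, here is a rough sketch of the `msg-id` production from RFC 5322 Sec 3.6.4 written as a regular expression. It is not the parser added by this PR: it ignores CFWS (comments and folding whitespace) and the obsolete `obs-id-left` / `obs-id-right` forms, and the names `MSG_ID_RE` and `split_msg_id` are made up for illustration.

```python
import re

# atext characters from RFC 5322 Sec 3.2.3
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
# dot-atom-text = 1*atext *("." 1*atext)
DOT_ATOM_TEXT = rf"{ATEXT}+(?:\.{ATEXT}+)*"
# no-fold-literal = "[" *dtext "]", where dtext is printable ASCII except "[", "]" and "\"
NO_FOLD_LITERAL = r"\[[\x21-\x5a\x5e-\x7e]*\]"

# msg-id = "<" id-left "@" id-right ">" (CFWS and obsolete forms left out)
MSG_ID_RE = re.compile(
    rf"<(?P<id_left>{DOT_ATOM_TEXT})@(?P<id_right>{DOT_ATOM_TEXT}|{NO_FOLD_LITERAL})>"
)


def split_msg_id(value):
    """Return (id_left, id_right) for a simple msg-id, or None if it does not match."""
    match = MSG_ID_RE.fullmatch(value.strip())
    if match is None:
        return None
    return match.group("id_left"), match.group("id_right")
```

A value such as `"X" * 260` has no angle brackets or `@`, so `split_msg_id` returns `None` for it instead of raising.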

@maxking
Contributor Author

maxking commented May 17, 2019

This only adds a parser for the Message-ID header for now. After this is reviewed/merged I can work on adding other "Identification Fields" from RFC 5322 using the same parser for msg_id.

Member

@warsaw warsaw left a comment

Sadly, I realize I no longer remember how this all works, but based on my recollection, this looks good to me. I would be interested in @bitdancer's opinion too.

@csabella csabella requested a review from bitdancer May 18, 2019 01:51
@maxking maxking force-pushed the bpo-35805 branch 2 times, most recently from 438f548 to 244541f Compare May 18, 2019 06:57
Member

@bitdancer bitdancer left a comment

Overall this looks great. Thanks!

@bedevere-bot

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@bitdancer
Member

@warsaw: I'm not sure I remember how it all works, and I wrote it. As far as I can tell @maxking did a great job of figuring out the principles behind this rather baroque parser :)

For the record, this parser is essentially a "first draft" (or maybe a 1.5 draft, I forget), which I had hoped to come back to, extract what I learned from writing it, and write a version 2 that was more consistent and easier to understand (and most importantly would produce a parse tree that was easier to interrogate and manipulate). But at this point it seems unlikely I'll ever manage to find time to do that.

This parser is based on the definition of Identification Fields from RFC 5322
Sec 3.6.4.

This should also prevent folding of Message-ID header using RFC 2047 encoded
words and hence fix bpo-35805.
Also, remove empty lines from classes that don't have any methods.
@maxking
Contributor Author

maxking commented May 18, 2019

I have made the requested changes; please review again

@bedevere-bot

Thanks for making the requested changes!

@bitdancer, @warsaw: please review the changes made to this pull request.

@maxking
Contributor Author

maxking commented May 27, 2019

@bitdancer I have made the changes you requested for the folding of msg-id tokens.

@csabella csabella requested a review from bitdancer June 1, 2019 12:05
@warsaw warsaw dismissed bitdancer’s stale review June 4, 2019 17:40

Reviewing in real-time w/maxking, we think this is ready to land

@warsaw warsaw merged commit 46d88a1 into python:master Jun 4, 2019
    except errors.HeaderParseError:
        message_id.defects.append(errors.InvalidHeaderDefect(
            "Expected msg-id but found {!r}".format(value)))
    message_id.append(token)

This appears to fail when building HyperKitty, because `token` is referenced before assignment when the `try` block fails.

======================================================================
ERROR: test_long_message_id (hyperkitty.tests.lib.test_incoming.TestAddToList)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./hyperkitty/tests/lib/test_incoming.py", line 295, in test_long_message_id
    msg["Message-ID"] = "X" * 260
  File "/usr/lib/python3.8/email/message.py", line 409, in __setitem__
    self._headers.append(self.policy.header_store_parse(name, val))
  File "/usr/lib/python3.8/email/policy.py", line 148, in header_store_parse
    return (name, self.header_factory(name, value))
  File "/usr/lib/python3.8/email/headerregistry.py", line 602, in __call__
    return self[name](name, value)
  File "/usr/lib/python3.8/email/headerregistry.py", line 197, in __new__
    cls.parse(value, kwds)
  File "/usr/lib/python3.8/email/headerregistry.py", line 530, in parse
    kwds['parse_tree'] = parse_tree = cls.value_parser(value)
  File "/usr/lib/python3.8/email/_header_value_parser.py", line 2116, in parse_message_id
    message_id.append(token)
UnboundLocalError: local variable 'token' referenced before assignment

Yep, same for me. Also, accessing value[0] before checking if value is there…

@surkova Thank you! I think this should be fixed up properly. Opened BPO https://bugs.python.org/issue38708
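
To make the failure mode discussed in this thread concrete, here is a small standalone sketch of the pattern. It does not use the stdlib parser: `HeaderParseError`, `get_msg_id` and both `parse_message_id_*` functions are simplified stand-ins written for illustration. The `else` clause shown is only one possible way to avoid the unbound name and is not necessarily how bpo-38708 resolves it.

```python
class HeaderParseError(Exception):
    """Stand-in for email.errors.HeaderParseError."""


def get_msg_id(value):
    # Check that value is non-empty before looking at value[0].
    if not value or value[0] != "<" or not value.endswith(">"):
        raise HeaderParseError("expected msg-id but found {!r}".format(value))
    return value, ""


def parse_message_id_buggy(value):
    tokens, defects = [], []
    try:
        token, value = get_msg_id(value)
    except HeaderParseError:
        defects.append("Expected msg-id but found {!r}".format(value))
    # When get_msg_id() raised, token was never bound, so this line fails.
    tokens.append(token)
    return tokens, defects


def parse_message_id_fixed(value):
    tokens, defects = [], []
    try:
        token, value = get_msg_id(value)
    except HeaderParseError:
        defects.append("Expected msg-id but found {!r}".format(value))
    else:
        # Only reached when parsing succeeded, so token is always bound here.
        tokens.append(token)
    return tokens, defects
```

With the input from the HyperKitty test, `parse_message_id_buggy("X" * 260)` raises `UnboundLocalError`, while `parse_message_id_fixed("X" * 260)` records one defect and returns no tokens.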

DinoV pushed a commit to DinoV/cpython that referenced this pull request Jan 14, 2020
* bpo-35805: Add parser for Message-ID header.

This parser is based on the definition of Identification Fields from RFC 5322
Sec 3.6.4.

This should also prevent folding of Message-ID header using RFC 2047 encoded
words and hence fix bpo-35805.

* Prevent folding of non-ascii message-id headers.
* Add fold method to MsgID token to prevent folding.
@odony

odony commented Aug 6, 2020

> After this is reviewed/merged I can work on adding other "Identification Fields" from RFC 5322 using the same parser for msg_id.

@maxking was there any progress made on fixing the folding of other identification fields elsewhere by any chance? Or is this still an open issue? Thanks!

odony added a commit to odoo-dev/odoo that referenced this pull request Aug 7, 2020
Python 3 before 3.8 has a bug that causes the email.policy classes to
incorrectly fold and RFC2047-encode "identification fields" in email
messages. This mainly applies to Message-Id, References, and In-Reply-To
fields.

We are impacted by this bug since odoo#35929 where we switched to
using the "modern" email.message API.

RFC2047 section 5 clearly states that those headers/fields are not to be
encoded, and that would violate RFC5322.

Further, such a folded Message-Id is considered non-RFC-conformant by
popular MTAs (GMail, Outlook), which will then generate *another*
Message-Id field, causing the original threading information to be lost.
Replies to such a modified message will reference the new, unknown
Message-Id, and won't be attached to the original thread.

The solution we adopt here is to monkey-patch the SMTP policies to
special-case those identification fields and deactivate the automatic
folding, until the bug is properly and fully fixed in the standard lib.

Some considerations taken into account for this patch:

- `email.policy.SMTP` is being monkey-patched globally to make sure we
  fix all possible places where Messages are being encoded/folded
- the fix is **not** made version-specific, considering that even in Python
  3.8 the official bugfix only applies to Message-Id, but still fails to
  protect other identification fields, like *References* and
  *In-Reply-To*. The author specifically noted that shortcoming [2].
  The fix wouldn't break anything on Python 3.8 anyway.
- the `noFoldPolicy` trick for preventing folding is done with no max
  line length at all. RFC5322, section 2.1.1 states [3] that the maximum
  length is 998 due to legacy implementations, but there is no provision
  to wrap identification fields that are longer than that. Wrapping at
  998 chars would corrupt the header anyway. We'll just count on the
  fact that we don't usually need 1k+ chars in those headers.

The invalid folding/encoding in action on Python 3.6 (in Python 3.8 only
the second header gets folded):

```py
>>> msg = email.message.EmailMessage(policy=email.policy.SMTP)
>>> msg['Message-Id'] = '<929227342217024.1596730490.324691772460938-example-30661-some.reference@test-123.example.com>'
>>> msg['In-Reply-To'] = '<92922734221723.1596730568.324691772460444-another-30661-parent.reference@test-123.example.com>'
>>> print(msg.as_string())
Message-Id: =?utf-8?q?=3C929227342217024=2E1596730490=2E324691772460938-exam?=
 =?utf-8?q?ple-30661-some=2Ereference=40test-123=2Eexample=2Ecom=3E?=
In-Reply-To: =?utf-8?q?=3C92922734221723=2E1596730568=2E324691772460444-anot?=
 =?utf-8?q?her-30661-parent=2Ereference=40test-123=2Eexample=2Ecom=3E?=

```

and the expected result after the fix:
```py
>>> msg = email.message.EmailMessage(policy=email.policy.SMTP)
>>> msg['Message-Id'] = '<929227342217024.1596730490.324691772460938-example-30661-some.reference@test-123.example.com>'
>>> msg['In-Reply-To'] = '<92922734221723.1596730568.324691772460444-another-30661-parent.reference@test-123.example.com>'
>>> print(msg.as_string())
Message-Id: <929227342217024.1596730490.324691772460938-example-30661-some.reference@test-123.example.com>
In-Reply-To: <92922734221723.1596730568.324691772460444-another-30661-parent.reference@test-123.example.com>

```

[1] bpo-35805: https://bugs.python.org/issue35805
[2] python/cpython#13397 (comment)
[3] https://tools.ietf.org/html/rfc5322#section-2.1.1
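
The workaround described in the commit message above can be approximated without monkey-patching. What follows is a minimal sketch and not the Odoo patch: it assumes a policy subclass that overrides the public `EmailPolicy.fold()` method so that identification fields are written on one line. `NoFoldIdentificationPolicy`, `IDENTIFICATION_FIELDS` and `render_headers` are hypothetical names.

```python
import email.message
import email.policy

# Header names treated as identification fields, compared case-insensitively.
IDENTIFICATION_FIELDS = frozenset({"message-id", "in-reply-to", "references"})


class NoFoldIdentificationPolicy(email.policy.EmailPolicy):
    """Hypothetical policy that emits identification fields unfolded."""

    def fold(self, name, value):
        if name.lower() in IDENTIFICATION_FIELDS:
            # Skip folding and RFC 2047 encoding for these fields only.
            return "{}: {}{}".format(name, value, self.linesep)
        return super().fold(name, value)


def render_headers(message_id, in_reply_to):
    # linesep="\r\n" mirrors what email.policy.SMTP uses.
    policy = NoFoldIdentificationPolicy(linesep="\r\n")
    msg = email.message.EmailMessage(policy=policy)
    msg["Message-Id"] = message_id
    msg["In-Reply-To"] = in_reply_to
    return msg.as_string()
```

This sketch only covers text serialization through `fold()`. `as_bytes()` goes through `fold_binary()`, which would need the same treatment, and values containing non-ASCII characters would be written out unencoded.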
robodoo pushed a commit to odoo/odoo that referenced this pull request Aug 9, 2020

closes #55609

Signed-off-by: Olivier Dony (odo) <odo@openerp.com>
odony added a commit to odoo-dev/odoo that referenced this pull request Aug 9, 2020

X-original-commit: 6726e9a
robodoo pushed a commit to odoo/odoo that referenced this pull request Aug 9, 2020

closes #55655

X-original-commit: 6726e9a
Signed-off-by: Olivier Dony (odo) <odo@openerp.com>
fw-bot pushed a commit to odoo-dev/odoo that referenced this pull request Aug 9, 2020

X-original-commit: 02b7877
robodoo pushed a commit to odoo/odoo that referenced this pull request Aug 10, 2020

closes #55656

X-original-commit: 02b7877
Signed-off-by: Olivier Dony (odo) <odo@openerp.com>
@maxking maxking deleted the bpo-35805 branch August 21, 2020 18:41
10 participants