gh-137353: Add t-string support to gettext + pygettext #137354

Open: ThiefMaster wants to merge 2 commits into base: main

@ThiefMaster ThiefMaster commented Aug 3, 2025

Please see #137353 for details; TL;DR is that with this PR you can use t-strings for i18n, instead of having to call _(...).format(...)

print(_(t'Hello {name}'))
print(ngettext(t'{n} snake', t'{n} snakes', n))
print(_(t'Category "{cat.title}" moved to "{target_cat.title}"'))
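
For comparison, here is a minimal sketch of the existing pattern that the description refers to, using only the current `gettext` API. `NullTranslations` stands in for a real catalog (an assumption for illustration), so every message comes back untranslated.

```python
import gettext

# NullTranslations is a stand-in catalog: it returns every msgid unchanged.
translations = gettext.NullTranslations()
_ = translations.gettext
ngettext = translations.ngettext

name = "Guido"
n = 3

# Current style: translate the format string first, then interpolate.
greeting = _("Hello {name}").format(name=name)
snakes = ngettext("{n} snake", "{n} snakes", n).format(n=n)

print(greeting)
print(snakes)
```

With this PR, the same two calls would be written as `_(t'Hello {name}')` and `ngettext(t'{n} snake', t'{n} snakes', n)`, with no separate `.format(...)` step.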

@python-cla-bot

python-cla-bot bot commented Aug 3, 2025

All commit authors signed the Contributor License Agreement.

CLA signed

@bedevere-app

bedevere-app bot commented Aug 3, 2025

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

Member

@StanFromIreland StanFromIreland left a comment

Please split this into separate PRs for pygettext and gettext. They also need blurbs.

@ThiefMaster
Author

There would be some overlap between the two PRs, since parts of the code are required both for gettext and pygettext.

Do you still prefer having two separate PRs even at this early stage? I can do it of course, but my preference would be to split it later, once any discussions that come up are resolved and I have made whatever changes are necessary...


# utils for t-string handling in gettext translation + pygettext extraction
# TBD where they should go, and whether this should be a public API or internal,
Member

I think we should limit what is exposed in gettext apart from the core API.

Author

Yeah, that's why I underscored all of them for now.

FWIW, I think exposing at least the utils to convert a template string to a format string makes sense, because tools like Babel would need to use the exact same logic, or risk inconsistencies between implementations.

# beneficial to have in stdlib so any implementation can re-use it without
# risking diverging behavior for the same expression between implementations

class _NameTooComplexError(ValueError):
Member

However, this should IMO be documented, since it is “public”. That said, I don’t like it; I think a general (new) gettext error (or, much simpler, a ValueError) would be clearer. Thoughts, Tomas?

Author

Initially I just used this to avoid catching some other (unexpected) ValueError that may or may not come out of the ast visitor. A custom exception just for this may indeed be overkill.

But I like the idea of a GettextError :)
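
To illustrate the trade-off discussed here: a dedicated exception lets a caller handle only the "expression too complex" case without also swallowing unrelated `ValueError`s. The names `GettextError`, `placeholder_name` and `try_name` below are hypothetical and do not come from the PR; this is a sketch of the idea, not the proposed implementation.

```python
import ast


class GettextError(ValueError):
    """Hypothetical gettext-specific error; still a ValueError for callers."""


def placeholder_name(expr: str) -> str:
    """Return expr if it is a bare identifier, else raise GettextError."""
    node = ast.parse(expr, mode="eval").body
    if not isinstance(node, ast.Name):
        raise GettextError(f"expression too complex: {expr!r}")
    return node.id


def try_name(expr: str):
    """Return the placeholder name, or None if the expression is rejected."""
    try:
        return placeholder_name(expr)
    except GettextError:
        # Only the expected failure is handled here; any other
        # ValueError would still propagate to the caller.
        return None


print(try_name("name"))
print(try_name("a + b"))
```

Because the hypothetical `GettextError` subclasses `ValueError`, existing code that catches `ValueError` would keep working.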

Lib/gettext.py Outdated
def _template_node_to_format(node: ast.TemplateStr) -> str:
"""Generate a format string from a template string AST node.

This fails with a :exc:`_NameTooComplexError` in case the expression is
Member

Docstrings are not restructured text in CPython.

Author

updated, lmk in case you don't like backticks around class names etc

@@ -38,6 +38,27 @@
bmsgd2luayAoaW4gIm15IG90aGVyIGNvbnRleHQiKQB3aW5rIHdpbmsA
'''

GNU_TMO_DATA = b'''\
Member

It’s not a particularly clear name IMO.

Author

Agreed, I just stayed consistent w/ the existing ones. Updated to something more meaningful.

@StanFromIreland
Member

StanFromIreland commented Aug 3, 2025

We generally tend to avoid mixing large changes to different modules (a tool in this case), as it makes the PR much larger.

Also, please avoid force-pushing; GitHub is unable to show the differences between force-pushed commits.

Comment on lines +141 to +142
_(t'Weird {meow[69j]}')
_(t'Weird {meow[...]}')
Author

FWIW I think these two are stupid and will never be used, but I didn't see much value in adding an extra check for certain types of Constant values just to avoid those.
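
A quick way to see why these cases come along for free: on Python 3.9 and later, both subscripts parse to an `ast.Constant`, just like `meow[0]` or `meow['key']`, so excluding them would need an explicit check on the constant's type. The helper name `subscript_constant` is made up for this sketch, which uses plain expressions instead of t-strings so that it also runs on versions without t-string support.

```python
import ast


def subscript_constant(expr: str):
    """Return the constant used as the subscript in expr, e.g. 'meow[0]'."""
    node = ast.parse(expr, mode="eval").body
    if not isinstance(node, ast.Subscript):
        raise TypeError(f"not a subscript expression: {expr!r}")
    # Assumes Python 3.9+, where the slice is the expression node itself.
    if not isinstance(node.slice, ast.Constant):
        raise TypeError(f"subscript is not a constant: {expr!r}")
    return node.slice.value


for expr in ("meow[0]", "meow['key']", "meow[69j]", "meow[...]"):
    value = subscript_constant(expr)
    print(expr, "->", type(value).__name__)
```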


@property
def name(self) -> str:
name = '__'.join(self._name_parts)
Author

I used __ as a separator between parts since initially I thought that it might be nice to make it clearer that the placeholder for {user.name} isn't just something named user_name.

However, maybe using a single underscore would be fine here:

  • t'{user.name} {user_name}' would simply fail due to the check that a name doesn't map to different expressions
  • I can't come up with a good example where you would use foo.bar and foo_bar in the same string
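
The separator question can be made concrete with a small sketch. It is a simplification under stated assumptions: it handles only plain names and attribute chains, compares expressions as raw strings, and the function names `placeholder_name` and `build_mapping` are hypothetical rather than taken from the PR.

```python
import ast


def placeholder_name(expr: str, sep: str = "__") -> str:
    """Join a dotted name such as 'user.name' into 'user__name'."""
    node = ast.parse(expr, mode="eval").body
    parts = []
    # Walk the attribute chain from the outermost attribute inwards.
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if not isinstance(node, ast.Name):
        raise ValueError(f"expression too complex: {expr!r}")
    parts.append(node.id)
    return sep.join(reversed(parts))


def build_mapping(exprs, sep: str = "__") -> dict:
    """Map placeholder names to expressions, rejecting ambiguous names."""
    mapping = {}
    for expr in exprs:
        name = placeholder_name(expr, sep)
        # The same name must always refer to the same expression.
        if mapping.setdefault(name, expr) != expr:
            raise ValueError(
                f"{name!r} maps to both {mapping[name]!r} and {expr!r}"
            )
    return mapping


print(build_mapping(["user.name", "user_name"]))
```

With `sep="_"`, the same input raises `ValueError`, because both expressions collapse to the placeholder `user_name`; this is the failure mode described in the first bullet.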

Comment on lines +756 to +763
# We use this weird naming of the gettext functions here to allow
# easy extraction of the .po file using pygettext; see the comment
# next to the po file content near the bottom of this file on how
# to regenerate it.
self.gettexT = self.t.gettext
self.ngettexT = self.t.ngettext
self.pgettexT = self.t.pgettext
self.npgettexT = self.t.npgettext
Author

I don't expect to keep these weird names, but for the sake of making it easier to update the tests in the future, I'd probably go for something like t_gettext etc. in here.

@ThiefMaster
Author

Also, please avoid force-pushing; GitHub is unable to show the differences between force-pushed commits.

OK, will keep this in mind for future changes.
