Commit 008bbfd

Disallow implicit concatenation of t-strings and other string types (#19485)
As of [this cpython PR](python/cpython#135996), it is not allowed to concatenate t-strings with non-t-strings, implicitly or explicitly. Expressions such as `"foo" t"{bar}"` are now syntax errors. This PR updates some AST nodes and parsing to reflect this change.

The structural change is that `TStringPart` is no longer needed, since, as in the case of `BytesStringLiteral`, the only possibilities are that we have a single `TString` or a vector of such (representing an implicit concatenation of t-strings). This removes a level of nesting from many AST expressions (which is what all the snapshot changes reflect), and simplifies some logic in the implementation of visitors, for example.

The other change of note is in the parser. When we meet an implicit concatenation of string-like literals, we now count the number of t-string literals. If these do not exhaust the total number of implicitly concatenated pieces, then we emit a syntax error. To recover from this syntax error, we encode any t-string pieces as _invalid_ string literals (which means we flag them as invalid, record their range, and record the value as `""`). Note that if at least one of the pieces is an f-string we prefer to parse the entire string as an f-string; otherwise we parse it as a string. This logic is exactly the same as how we currently treat `BytesStringLiteral` parsing and error recovery, and carries with it the same pros and cons.

Finally, note that I have not implemented any changes in the implementation of the formatter. As far as I can tell, none are needed. I did change a few of the fixtures so that we are always concatenating t-strings with t-strings.
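The parser rule described above can be sketched as a small standalone function. This is only an illustration of the counting and recovery decision, not Ruff's parser: `PieceKind`, `Outcome`, and `classify` are hypothetical names, and bytes literals are left out for brevity.

```rust
/// Hypothetical kinds of pieces in an implicit concatenation
/// (illustrative only; these are not Ruff's real types).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum PieceKind {
    Str,
    FString,
    TString,
}

/// Hypothetical result of examining the concatenation.
#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    /// Every piece is a t-string: the concatenation is valid.
    TString,
    /// No t-strings are present: the existing str/f-string handling applies.
    NoTStrings,
    /// Mixed pieces including an f-string: syntax error, recover as an
    /// f-string with the t-string pieces recorded as invalid literals.
    ErrorRecoverAsFString,
    /// Mixed pieces without an f-string: syntax error, recover as a string.
    ErrorRecoverAsString,
}

fn classify(pieces: &[PieceKind]) -> Outcome {
    // Count the t-string pieces and compare against the total.
    let t_count = pieces.iter().filter(|k| **k == PieceKind::TString).count();
    if t_count == 0 {
        Outcome::NoTStrings
    } else if t_count == pieces.len() {
        Outcome::TString
    } else if pieces.contains(&PieceKind::FString) {
        Outcome::ErrorRecoverAsFString
    } else {
        Outcome::ErrorRecoverAsString
    }
}
```

Under these assumptions, `"foo" t"{bar}"` corresponds to `classify(&[PieceKind::Str, PieceKind::TString])`, which reports an error and recovers as a plain string.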
1 parent df5eba7 commit 008bbfd

75 files changed (+4319 -6104 lines changed)

crates/ruff_linter/resources/test/fixtures/flake8_bandit/S104.py

Lines changed: 2 additions & 2 deletions
@@ -25,5 +25,5 @@ def my_func():
 
 # t-strings - all ok
 t"0.0.0.0"
-"0.0.0.0" t"0.0.0.0{expr}0.0.0.0"
-"0.0.0.0" f"0.0.0.0{expr}0.0.0.0" t"0.0.0.0{expr}0.0.0.0"
+t"0.0.0.0" t"0.0.0.0{expr}0.0.0.0"
+t"0.0.0.0" t"0.0.0.0{expr}0.0.0.0" t"0.0.0.0{expr}0.0.0.0"

crates/ruff_python_ast/src/comparable.rs

Lines changed: 9 additions & 56 deletions
@@ -708,23 +708,10 @@ pub struct ComparableTString<'a> {
 }
 
 impl<'a> From<&'a ast::TStringValue> for ComparableTString<'a> {
-    // The approach taken below necessarily deviates from the
-    // corresponding implementation for [`ast::FStringValue`].
-    // The reason is that a t-string value is composed of _three_
-    // non-comparable parts: literals, f-string expressions, and
-    // t-string interpolations. Since we have merged the AST nodes
-    // that capture f-string expressions and t-string interpolations
-    // into the shared [`ast::InterpolatedElement`], we must
-    // be careful to distinguish between them here.
+    // We model a [`ComparableTString`] on the actual
+    // [CPython implementation] of a `string.templatelib.Template` object.
     //
-    // Consequently, we model a [`ComparableTString`] on the actual
-    // [CPython implementation] of a `string.templatelib.Template` object:
-    // it is composed of `strings` and `interpolations`. In CPython,
-    // the `strings` field is a tuple of honest strings (since f-strings
-    // are evaluated). Our `strings` field will house both f-string
-    // expressions and string literals.
-    //
-    // Finally, as in CPython, we must be careful to ensure that the length
+    // As in CPython, we must be careful to ensure that the length
     // of `strings` is always one more than the length of `interpolations` -
     // that way we can recover the original reading order by interleaving
     // starting with `strings`. This is how we can tell the
@@ -768,19 +755,6 @@ impl<'a> From<&'a ast::TStringValue> for ComparableTString<'a> {
                     .push(ComparableInterpolatedStringElement::Literal("".into()));
             }
 
-            fn push_fstring_expression(&mut self, expression: &'a ast::InterpolatedElement) {
-                if let Some(ComparableInterpolatedStringElement::Literal(last_literal)) =
-                    self.strings.last()
-                {
-                    // Recall that we insert empty strings after
-                    // each interpolation. If we encounter an f-string
-                    // expression, we replace the empty string with it.
-                    if last_literal.is_empty() {
-                        self.strings.pop();
-                    }
-                }
-                self.strings.push(expression.into());
-            }
             fn push_tstring_interpolation(&mut self, expression: &'a ast::InterpolatedElement) {
                 self.interpolations.push(expression.into());
                 self.start_new_literal();
@@ -789,34 +763,13 @@ impl<'a> From<&'a ast::TStringValue> for ComparableTString<'a> {
 
         let mut collector = Collector::default();
 
-        for part in value {
-            match part {
-                ast::TStringPart::Literal(string_literal) => {
-                    collector.push_literal(&string_literal.value);
-                }
-                ast::TStringPart::TString(fstring) => {
-                    for element in &fstring.elements {
-                        match element {
-                            ast::InterpolatedStringElement::Literal(literal) => {
-                                collector.push_literal(&literal.value);
-                            }
-                            ast::InterpolatedStringElement::Interpolation(interpolation) => {
-                                collector.push_tstring_interpolation(interpolation);
-                            }
-                        }
-                    }
+        for element in value.elements() {
+            match element {
+                ast::InterpolatedStringElement::Literal(literal) => {
+                    collector.push_literal(&literal.value);
                 }
-                ast::TStringPart::FString(fstring) => {
-                    for element in &fstring.elements {
-                        match element {
-                            ast::InterpolatedStringElement::Literal(literal) => {
-                                collector.push_literal(&literal.value);
-                            }
-                            ast::InterpolatedStringElement::Interpolation(expression) => {
-                                collector.push_fstring_expression(expression);
-                            }
-                        }
-                    }
+                ast::InterpolatedStringElement::Interpolation(interpolation) => {
+                    collector.push_tstring_interpolation(interpolation);
                 }
             }
         }
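
The invariant mentioned in the comment above (one more entry in `strings` than in `interpolations`) can be illustrated with a simplified, standalone collector. This is a sketch under assumptions, not the code in `comparable.rs`: `Element`, `Template`, `from_elements`, and `reading_order` are hypothetical names, and plain `String`s stand in for the comparable AST nodes.

```rust
/// Hypothetical flattened element of a t-string (not Ruff's AST type).
#[derive(Debug, PartialEq)]
enum Element {
    Literal(String),
    Interpolation(String),
}

/// Hypothetical stand-in for the `strings` / `interpolations` pair.
#[derive(Debug, PartialEq)]
struct Template {
    strings: Vec<String>,
    interpolations: Vec<String>,
}

impl Template {
    fn from_elements(elements: &[Element]) -> Self {
        // Start with one (possibly empty) literal so that
        // `strings.len() == interpolations.len() + 1` holds from the start.
        let mut template = Template {
            strings: vec![String::new()],
            interpolations: Vec::new(),
        };
        for element in elements {
            match element {
                // Adjacent literals are merged into the current string.
                Element::Literal(text) => template
                    .strings
                    .last_mut()
                    .expect("`strings` is never empty")
                    .push_str(text),
                // Each interpolation opens a fresh, empty literal after it.
                Element::Interpolation(expr) => {
                    template.interpolations.push(expr.clone());
                    template.strings.push(String::new());
                }
            }
        }
        template
    }

    /// Recover the reading order by interleaving, starting with `strings`.
    fn reading_order(&self) -> Vec<String> {
        let mut out = Vec::new();
        for (index, text) in self.strings.iter().enumerate() {
            out.push(text.clone());
            if let Some(expr) = self.interpolations.get(index) {
                out.push(format!("{{{expr}}}"));
            }
        }
        out
    }
}
```

In this sketch, two back-to-back interpolations still produce three entries in `strings` (all empty), so the interleaving stays unambiguous.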

crates/ruff_python_ast/src/expression.rs

Lines changed: 3 additions & 21 deletions
@@ -320,7 +320,7 @@ pub enum StringLikePartIter<'a> {
     String(std::slice::Iter<'a, ast::StringLiteral>),
     Bytes(std::slice::Iter<'a, ast::BytesLiteral>),
     FString(std::slice::Iter<'a, ast::FStringPart>),
-    TString(std::slice::Iter<'a, ast::TStringPart>),
+    TString(std::slice::Iter<'a, ast::TString>),
 }
 
 impl<'a> Iterator for StringLikePartIter<'a> {
@@ -339,16 +339,7 @@ impl<'a> Iterator for StringLikePartIter<'a> {
                     ast::FStringPart::FString(f_string) => StringLikePart::FString(f_string),
                 }
             }
-            StringLikePartIter::TString(inner) => {
-                let part = inner.next()?;
-                match part {
-                    ast::TStringPart::Literal(string_literal) => {
-                        StringLikePart::String(string_literal)
-                    }
-                    ast::TStringPart::TString(t_string) => StringLikePart::TString(t_string),
-                    ast::TStringPart::FString(f_string) => StringLikePart::FString(f_string),
-                }
-            }
+            StringLikePartIter::TString(inner) => StringLikePart::TString(inner.next()?),
         };
 
         Some(part)
@@ -378,16 +369,7 @@ impl DoubleEndedIterator for StringLikePartIter<'_> {
                     ast::FStringPart::FString(f_string) => StringLikePart::FString(f_string),
                 }
             }
-            StringLikePartIter::TString(inner) => {
-                let part = inner.next_back()?;
-                match part {
-                    ast::TStringPart::Literal(string_literal) => {
-                        StringLikePart::String(string_literal)
-                    }
-                    ast::TStringPart::TString(t_string) => StringLikePart::TString(t_string),
-                    ast::TStringPart::FString(f_string) => StringLikePart::FString(f_string),
-                }
-            }
+            StringLikePartIter::TString(inner) => StringLikePart::TString(inner.next_back()?),
         };
 
         Some(part)
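
The one-line arms above work because `std::slice::Iter` is itself double-ended, so each item only needs wrapping. A reduced standalone sketch of that pattern follows; the `TString`, `StringLikePart`, and `TStringParts` types here are simplified, hypothetical stand-ins rather than the definitions in `expression.rs`.

```rust
/// Hypothetical stand-in for a single t-string node.
#[derive(Debug, PartialEq)]
struct TString {
    text: &'static str,
}

/// Hypothetical, single-variant version of the part wrapper.
#[derive(Debug, PartialEq)]
enum StringLikePart<'a> {
    TString(&'a TString),
}

/// Iterates over a slice of t-strings, wrapping each item.
struct TStringParts<'a> {
    inner: std::slice::Iter<'a, TString>,
}

impl<'a> Iterator for TStringParts<'a> {
    type Item = StringLikePart<'a>;

    fn next(&mut self) -> Option<Self::Item> {
        // `?` returns `None` once the slice iterator is exhausted.
        Some(StringLikePart::TString(self.inner.next()?))
    }
}

impl DoubleEndedIterator for TStringParts<'_> {
    fn next_back(&mut self) -> Option<Self::Item> {
        Some(StringLikePart::TString(self.inner.next_back()?))
    }
}
```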

crates/ruff_python_ast/src/helpers.rs

Lines changed: 2 additions & 71 deletions
@@ -1274,6 +1274,7 @@ impl Truthiness {
                     Self::Unknown
                 }
             }
+            Expr::TString(_) => Self::Truthy,
             Expr::List(ast::ExprList { elts, .. })
             | Expr::Set(ast::ExprSet { elts, .. })
             | Expr::Tuple(ast::ExprTuple { elts, .. }) => {
@@ -1362,6 +1363,7 @@ fn is_non_empty_f_string(expr: &ast::ExprFString) -> bool {
             Expr::EllipsisLiteral(_) => true,
             Expr::List(_) => true,
             Expr::Tuple(_) => true,
+            Expr::TString(_) => true,
 
             // These expressions must resolve to the inner expression.
             Expr::If(ast::ExprIf { body, orelse, .. }) => inner(body) && inner(orelse),
@@ -1386,7 +1388,6 @@ fn is_non_empty_f_string(expr: &ast::ExprFString) -> bool {
             // These literals may or may not be empty.
             Expr::FString(f_string) => is_non_empty_f_string(f_string),
             // These literals may or may not be empty.
-            Expr::TString(f_string) => is_non_empty_t_string(f_string),
             Expr::StringLiteral(ast::ExprStringLiteral { value, .. }) => !value.is_empty(),
             Expr::BytesLiteral(ast::ExprBytesLiteral { value, .. }) => !value.is_empty(),
         }
@@ -1403,76 +1404,6 @@ fn is_non_empty_f_string(expr: &ast::ExprFString) -> bool {
     })
 }
 
-/// Returns `true` if the expression definitely resolves to a non-empty string, when used as an
-/// f-string expression, or `false` if the expression may resolve to an empty string.
-fn is_non_empty_t_string(expr: &ast::ExprTString) -> bool {
-    fn inner(expr: &Expr) -> bool {
-        match expr {
-            // When stringified, these expressions are always non-empty.
-            Expr::Lambda(_) => true,
-            Expr::Dict(_) => true,
-            Expr::Set(_) => true,
-            Expr::ListComp(_) => true,
-            Expr::SetComp(_) => true,
-            Expr::DictComp(_) => true,
-            Expr::Compare(_) => true,
-            Expr::NumberLiteral(_) => true,
-            Expr::BooleanLiteral(_) => true,
-            Expr::NoneLiteral(_) => true,
-            Expr::EllipsisLiteral(_) => true,
-            Expr::List(_) => true,
-            Expr::Tuple(_) => true,
-
-            // These expressions must resolve to the inner expression.
-            Expr::If(ast::ExprIf { body, orelse, .. }) => inner(body) && inner(orelse),
-            Expr::Named(ast::ExprNamed { value, .. }) => inner(value),
-
-            // These expressions are complex. We can't determine whether they're empty or not.
-            Expr::BoolOp(ast::ExprBoolOp { .. }) => false,
-            Expr::BinOp(ast::ExprBinOp { .. }) => false,
-            Expr::UnaryOp(ast::ExprUnaryOp { .. }) => false,
-            Expr::Generator(_) => false,
-            Expr::Await(_) => false,
-            Expr::Yield(_) => false,
-            Expr::YieldFrom(_) => false,
-            Expr::Call(_) => false,
-            Expr::Attribute(_) => false,
-            Expr::Subscript(_) => false,
-            Expr::Starred(_) => false,
-            Expr::Name(_) => false,
-            Expr::Slice(_) => false,
-            Expr::IpyEscapeCommand(_) => false,
-
-            // These literals may or may not be empty.
-            Expr::FString(f_string) => is_non_empty_f_string(f_string),
-            // These literals may or may not be empty.
-            Expr::TString(t_string) => is_non_empty_t_string(t_string),
-            Expr::StringLiteral(ast::ExprStringLiteral { value, .. }) => !value.is_empty(),
-            Expr::BytesLiteral(ast::ExprBytesLiteral { value, .. }) => !value.is_empty(),
-        }
-    }
-
-    expr.value.iter().any(|part| match part {
-        ast::TStringPart::Literal(string_literal) => !string_literal.is_empty(),
-        ast::TStringPart::TString(t_string) => {
-            t_string.elements.iter().all(|element| match element {
-                ast::InterpolatedStringElement::Literal(string_literal) => {
-                    !string_literal.is_empty()
-                }
-                ast::InterpolatedStringElement::Interpolation(t_string) => {
-                    inner(&t_string.expression)
-                }
-            })
-        }
-        ast::TStringPart::FString(f_string) => {
-            f_string.elements.iter().all(|element| match element {
-                InterpolatedStringElement::Literal(string_literal) => !string_literal.is_empty(),
-                InterpolatedStringElement::Interpolation(f_string) => inner(&f_string.expression),
-            })
-        }
-    })
-}
-
 /// Returns `true` if the expression definitely resolves to the empty string, when used as an f-string
 /// expression.
 fn is_empty_f_string(expr: &ast::ExprFString) -> bool {

crates/ruff_python_ast/src/node.rs

Lines changed: 2 additions & 12 deletions
@@ -171,18 +171,8 @@ impl ast::ExprTString {
             node_index: _,
         } = self;
 
-        for t_string_part in value {
-            match t_string_part {
-                ast::TStringPart::Literal(string_literal) => {
-                    visitor.visit_string_literal(string_literal);
-                }
-                ast::TStringPart::FString(f_string) => {
-                    visitor.visit_f_string(f_string);
-                }
-                ast::TStringPart::TString(t_string) => {
-                    visitor.visit_t_string(t_string);
-                }
-            }
+        for t_string in value {
+            visitor.visit_t_string(t_string);
         }
     }
 }
