f-string debug expressions containing hash '#' are malformed

# Bug report

### Bug description:

There is a bug in the f-string implementation, starting around version 3.12: when a debug expression (the `=` specifier) contains a "#", part of the echoed expression text is missing from the result. For example,

f"{'#'=}"

gives

"''#'"

but should give

"'#'='#'".

**Note**: The following explanation was found by asking https://chatgpt.com/codex to locate the problem. This appears to me to be a correct explanation, but please use with caution.

The bug comes from the change that started stripping text after a “#” when capturing the expression text for an f-string debug expression (f'{expr=}'). This logic was introduced in commit d59feb5dbe5395615d06c30a95e6a6a9b7681d4d (“gh-112243: Don’t include comments in f-string debug expressions”) dated 2023‑11‑20.

Inside Parser/lexer/lexer.c, set_fstring_expr() now scans the expression buffer for “#” and removes everything from that point until the next newline. The relevant lines introduced in that commit are:

```c
 // Check if there is a # character in the expression
 int hash_detected = 0;
 for (Py_ssize_t i = 0; i < tok_mode->last_expr_size - tok_mode->last_expr_end; i++) {
     if (tok_mode->last_expr_buffer[i] == '#') {
         hash_detected = 1;
         break;
     }
 }

 if (hash_detected) {
     Py_ssize_t input_length = tok_mode->last_expr_size - tok_mode->last_expr_end;
     char *result = (char *)PyMem_Malloc((input_length + 1) * sizeof(char));
     ...
     for (i = 0, j = 0; i < input_length; i++) {
         if (tok_mode->last_expr_buffer[i] == '#') {
             // Skip characters until newline or end of string
             while (tok_mode->last_expr_buffer[i] != '\0' && i < input_length) {
                 if (tok_mode->last_expr_buffer[i] == '\n') {
                     result[j++] = tok_mode->last_expr_buffer[i];
                     break;
                 }
                 i++;
             }
         } else {
             result[j++] = tok_mode->last_expr_buffer[i];
         }
     }
     result[j] = '\0';
     res = PyUnicode_DecodeUTF8(result, j, NULL);
     PyMem_Free(result);
 } else {
     res = PyUnicode_DecodeUTF8(
         tok_mode->last_expr_buffer,
         tok_mode->last_expr_size - tok_mode->last_expr_end,
         NULL
     );
 }
```
Because this heuristic doesn’t check whether “#” is inside a quoted string, an expression such as '#' is mistakenly treated as starting a comment, leading to the debug string being truncated. This code was added in commit d59feb5dbe, visible in the repository’s history:

```bash
commit d59feb5dbe5395615d06c30a95e6a6a9b7681d4d
Author: Pablo Galindo Salgado <Pablogsal@gmail.com>
Date:   Mon Nov 20 15:18:24 2023 +0000

    gh-112243: Don't include comments in f-string debug expressions (#112284)
```
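The truncation described above can be modeled in a few lines of Python. This is only a sketch of the behaviour, not the CPython code: `strip_comments_naive` mirrors the quoted C loop, while `strip_comments_quote_aware` is a hypothetical variant that tracks simple quoted strings. Both function names are made up for illustration, and the quote-aware version deliberately ignores triple-quoted strings and nested f-strings, so it should not be read as the actual fix.

```python
def strip_comments_naive(expr: str) -> str:
    """Model of the quoted C loop: drop text from '#' up to the next newline."""
    out = []
    i, n = 0, len(expr)
    while i < n:
        if expr[i] == "#":
            # Skip to the newline (kept on the next iteration) or to the end.
            while i < n and expr[i] != "\n":
                i += 1
        else:
            out.append(expr[i])
            i += 1
    return "".join(out)


def strip_comments_quote_aware(expr: str) -> str:
    """Hypothetical variant: treat '#' as a comment only outside quotes.

    Simplification: handles single-character quote delimiters and backslash
    escapes only; triple-quoted strings and nested f-strings are not handled.
    """
    out = []
    quote = None  # delimiter of the string literal we are inside, if any
    i, n = 0, len(expr)
    while i < n:
        ch = expr[i]
        if quote is not None:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character verbatim.
                out.append(expr[i + 1])
                i += 1
            elif ch == quote:
                quote = None
            i += 1
        elif ch in "'\"":
            quote = ch
            out.append(ch)
            i += 1
        elif ch == "#":
            while i < n and expr[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    expr_text = "'#'="  # the text captured for f"{'#'=}"
    # The debug output is the captured text followed by repr() of the value.
    print(strip_comments_naive(expr_text) + repr("#"))
    print(strip_comments_quote_aware(expr_text) + repr("#"))
```

With the naive model, the captured text `'#'=` is cut down to a single quote character, which matches the malformed output in the report once the repr of the value is appended.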

The likely cause of the bug is therefore commit d59feb5dbe5395615d06c30a95e6a6a9b7681d4d, which modified Parser/lexer/lexer.c and introduced the faulty handling of “#” inside f-string debug expressions. The commit date (2023‑11‑20) is after the initial 3.12.0 release, so the change presumably reached 3.12 through a backport to a maintenance release.

### CPython versions tested on:

3.12
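
To check other interpreter versions, a minimal script along these lines could be used. It assumes Python 3.8 or later (the first version with the `=` specifier); the helper names `debug_hash_output` and `is_affected` are hypothetical and exist only for this check.

```python
import sys

# Expected debug output per this report: expression text, '=', then the repr.
EXPECTED = "'#'='#'"


def debug_hash_output() -> str:
    """Evaluate the f-string from the report on the running interpreter."""
    return f"{'#'=}"


def is_affected() -> bool:
    """Return True if this interpreter produces something other than EXPECTED."""
    return debug_hash_output() != EXPECTED


if __name__ == "__main__":
    version = ".".join(map(str, sys.version_info[:3]))
    status = "affected" if is_affected() else "not affected"
    print(f"Python {version}: {debug_hash_output()!r} -> {status}")
```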

### Operating systems tested on:

Linux

f-string debug expressions containing hash '#' are malformed #137182
