
Function keyword syntax restriction #21


Open. Wants to merge 2 commits into base: master.

RayaneCTX

@RayaneCTX RayaneCTX commented Mar 29, 2022

Addresses the function keyword argument syntax restriction introduced in Python 3.8. Code such as f((a) = 2) now raises a syntax error.
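For reference, the restriction can be observed on a CPython 3.8 or later interpreter with a small check like the one below. This is only an illustrative sketch: the helper name `compiles` is made up for this example and is not part of the PR or of MicroPython.

```python
def compiles(source):
    """Return True if `source` compiles, False if it raises SyntaxError."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# Plain keyword argument: accepted.
print(compiles("f(a=2)"))
# Parenthesised positional argument: accepted.
print(compiles("f((a))"))
# Parenthesised keyword name: rejected on CPython 3.8 and later.
print(compiles("f((a)=2)"))
```

Only the third form is affected by the restriction; parentheses around a positional argument or around a keyword value remain legal.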

@RayaneCTX
Author

RayaneCTX commented Mar 30, 2022

This seems to be too much of a hack. This code would not be maintainable as it is. What is a good way of figuring out, in the code, what is the right distance between arglist and atom_paren to assert that the latter is a function argument? Is there another, minimally invasive way of figuring out whether a given atom_paren node is a function argument? Some alternatives:

  1. Define a global boolean and set it to true while parsing an arglist. Use this boolean to decide whether a given atom_paren should be elided within push_result_rule. Pros: small code change, small memory overhead, does not depend on the distance between arglist and atom_paren (i.e. grammar independence). Cons: it's a global variable.
  2. Use a local boolean and expand the push_result_rule call to pass it in. Use this boolean to decide whether a given atom_paren should be elided. Pros: does not depend on the distance between arglist and atom_paren (i.e. grammar independence). Cons: large code change, decent memory overhead (?).
  3. Hardcode the distance between arglist and atom_paren (as is currently done in this PR). Pros: minimal code change, minimal memory overhead. Cons: unmaintainable code.
  4. Pre-process the grammar to generate a macro/constant equal to the distance between arglist and atom_paren (I am not sure what that would look like, though).

@jbbjarnason

So I guess the issue is that you are not certain whether the added if statement in the parser can trigger for anything other than function arguments?
If so, might it be worth placing the check in compile.c instead, in compile_trailer_paren_helper?

@RayaneCTX
Author

I am pretty certain that the if statement will only catch function arguments (I don't think it is possible for an atom_paren rule to sit on the rule stack 15 spots above an arglist rule and not be a function argument). Using option 1 or 2 above would increase that confidence further. There are cases where disabling the atom_paren optimization is pessimistic, however. For example, for an input like f((a)), my commit turns the optimization off even though it could safely have stayed on. Because MP's parser doesn't backtrack once it has consumed tokens, I don't think there is an elegant way around that. In the worst case, you end up with one extraneous atom_paren rule, so I don't think that's too bad.

My concern is with the way I am currently implementing the change: it is not maintainable, because I hard-coded the 15, creating a strong dependence on the grammar that the code does not even make explicit. I need to replace this with something that doesn't have this dependence before I open a PR against the main repo.

@RayaneCTX
Author

Another option:

  1. Add a boolean field to the parser_t struct or the rule_stack_t struct that tracks whether the parser is currently parsing an arglist. Pros: small code change, small code size increase, low memory overhead. Cons: not sure; this seems like a good deal, actually.
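
To make the idea concrete, here is a toy model of the flag-based approach, written in Python rather than C. It is a sketch under stated assumptions, not MicroPython's code: `Node`, `ToyParser`, `in_arglist`, `reduce_paren_id` and `check_keyword_name` are all hypothetical names, and only the rule names (atom_paren, ID) and the name push_result_rule come from the discussion.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    rule: str                      # e.g. "ID" or "atom_paren"
    children: list = field(default_factory=list)
    value: str = ""


class ToyParser:
    """Toy stand-in for the parser state; not MicroPython's parser_t."""

    def __init__(self):
        self.in_arglist = False    # hypothetical flag: set while an arglist is parsed
        self.results = []

    def push_result_rule(self, rule, children):
        # Usual optimisation: drop parentheses around a single inner node.
        if rule == "atom_paren" and len(children) == 1:
            inner = children[0]
            # Keep the parentheses only for a lone ID inside an arglist.
            keep = self.in_arglist and inner.rule == "ID"
            if not keep:
                self.results.append(inner)
                return
        self.results.append(Node(rule, children))


def reduce_paren_id(name, in_arglist):
    """Reduce the input `(name)` and return the node left on the result stack."""
    parser = ToyParser()
    parser.in_arglist = in_arglist
    parser.push_result_rule("atom_paren", [Node("ID", value=name)])
    return parser.results[-1]


def check_keyword_name(node):
    """Compiler-side check: a keyword name must be a bare identifier."""
    if node.rule != "ID":
        raise SyntaxError("keyword argument name must be a bare identifier")


print(reduce_paren_id("a", in_arglist=False).rule)
print(reduce_paren_id("a", in_arglist=True).rule)
```

Outside an arglist the parentheses are elided as before; inside one, the atom_paren node survives so a later compiler pass can reject it as a keyword name. The model also shows the pessimistic case mentioned earlier: a positional argument such as f((a)) keeps its atom_paren node too, because the toy parser, like a non-backtracking one, cannot yet know whether an `=` follows.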

Implemented function keyword argument syntax restriction added with
Python 3.8 (listed under issue micropython#7899). This is done by adding a switch
in the push_result_rule function in parse.c to turn off the parentheses
removal optimization for atom_paren rules encountered while parsing
an arglist rule (meaning this atom_paren rule is a function argument)
and when the enclosed node is a lone ID. Keeping the atom_paren rule
in the parse tree allows the compiler to catch the syntax error when
the rule happens to be a keyword argument.

In order to determine whether an atom_paren rule is a function argument,
a bit is added to the parser structure, which is set in mp_parse when
an arglist rule is being parsed (unset once done with it). Other
options were considered: hard-coding the distance on the rule stack
between an arglist rule and an atom_paren rule in the added switch,
using a global boolean variable, using a local boolean variable (by
expanding the function signature of the push_result_rule function).
Adding a field to the parser structure seemed to provide the best
balance between code size increase, code changes, memory usage
overhead, code maintainability, etc.

Because the compile_atom_paren function did not expect this rule to
be a parent for a lone ID, a switch is added in that function to
handle this.

A new test file fun_kwargs_syntax.py is added to make sure the
syntax restriction is in effect, and a test is added to fun2.py to
make sure extraneous parentheses around ID function arguments do not
cause any issues.

Signed-off-by: Rayane Chatrieux <rayane.chatrieux@gmail.com>
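
As an illustration of the second kind of test described in the commit message, a check that extraneous parentheses around ID arguments are harmless might look roughly like the following. This is a hypothetical sketch, not the contents of fun2.py; the function `f` and the variables `a` and `b` are invented for the example.

```python
def f(x, y=0):
    return (x, y)


a, b = 1, 2

# Parentheses around a positional name.
print(f((a)))
# Parentheses around several positional names.
print(f((a), (b)))
# Parentheses around a keyword *value* remain legal.
print(f((a), y=(b)))
```
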
@RayaneCTX RayaneCTX force-pushed the function-keyword-syntax-restriction branch from d380fd5 to 3f2afe9 Compare April 1, 2022 17:03