Commit 74c2422

Update DSL docs for cases generator (#105753)
* Clarify things around goto error/ERROR_IF a bit
* Remove docs for super-instructions
* Add pseudo; fix heading markup
1 parent 1d857da commit 74c2422

1 file changed: +60 -44 lines changed

Tools/cases_generator/interpreter_definition.md

Lines changed: 60 additions & 44 deletions
@@ -67,26 +67,24 @@ parts of instructions, we can reduce the potential for errors considerably.
 
 ## Specification
 
-This specification is at an early stage and is likely to change considerably.
+This specification is a work in progress.
+We update it as the need arises.
 
-Syntax
-------
+### Syntax
 
 Each op definition has a kind, a name, a stack and instruction stream effect,
 and a piece of C code describing its semantics::
 
 ```
   file:
-    (definition | family)+
+    (definition | family | pseudo)+
 
   definition:
     "inst" "(" NAME ["," stack_effect] ")" "{" C-code "}"
     |
     "op" "(" NAME "," stack_effect ")" "{" C-code "}"
     |
     "macro" "(" NAME ")" "=" uop ("+" uop)* ";"
-    |
-    "super" "(" NAME ")" "=" NAME ("+" NAME)* ";"
 
   stack_effect:
     "(" [inputs] "--" [outputs] ")"
@@ -122,16 +120,17 @@ and a piece of C code describing its semantics::
     object "[" C-expression "]"
 
   family:
-    "family" "(" NAME ")" = "{" NAME ("," NAME)+ "}" ";"
+    "family" "(" NAME ")" = "{" NAME ("," NAME)+ [","] "}" ";"
+
+  pseudo:
+    "pseudo" "(" NAME ")" = "{" NAME ("," NAME)+ [","] "}" ";"
 ```
 
 The following definitions may occur:
 
 * `inst`: A normal instruction, as previously defined by `TARGET(NAME)` in `ceval.c`.
 * `op`: A part instruction from which macros can be constructed.
 * `macro`: A bytecode instruction constructed from ops and cache effects.
-* `super`: A super-instruction, such as `LOAD_FAST__LOAD_FAST`, constructed from
-  normal or macro instructions.
 
 `NAME` can be any ASCII identifier that is a C identifier and not a C or Python keyword.
 `foo_1` is legal. `$` is not legal, nor is `struct` or `class`.
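
Reviewer note on the `NAME` rule in the context lines above: the rule can be read as "ASCII C identifier, minus reserved words". The sketch below is only an illustration of that reading, not code from the cases generator. The helper `is_valid_name` and the `RESERVED` table are hypothetical, and the keyword list is deliberately abbreviated.

```C
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Abbreviated, illustrative keyword list (assumption): a real check
   would cover every C keyword and every Python keyword. */
static const char *const RESERVED[] = {
    "struct", "class", "if", "else", "while", "for", "return", "def",
};

static bool is_ascii_alpha(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

static bool is_ascii_digit(char c) {
    return c >= '0' && c <= '9';
}

/* Hypothetical helper: true if s is an ASCII C identifier that does
   not appear in RESERVED. */
static bool is_valid_name(const char *s) {
    if (s == NULL || s[0] == '\0') {
        return false;
    }
    /* The first character must be a letter or underscore. */
    if (!is_ascii_alpha(s[0]) && s[0] != '_') {
        return false;
    }
    /* Remaining characters may also be digits. */
    for (size_t i = 1; s[i] != '\0'; i++) {
        if (!is_ascii_alpha(s[i]) && !is_ascii_digit(s[i]) && s[i] != '_') {
            return false;
        }
    }
    /* Reject reserved words. */
    for (size_t i = 0; i < sizeof(RESERVED) / sizeof(RESERVED[0]); i++) {
        if (strcmp(s, RESERVED[i]) == 0) {
            return false;
        }
    }
    return true;
}
```

With this sketch, `is_valid_name("foo_1")` is accepted, while `"$"`, `"struct"`, and `"class"` are rejected, matching the examples in the text.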
@@ -159,15 +158,21 @@ By convention cache effects (`stream`) must precede the input effects.
 
 The name `oparg` is pre-defined as a 32 bit value fetched from the instruction stream.
 
+### Special functions/macros
+
 The C code may include special functions that are understood by the tools as
 part of the DSL.
 
 Those functions include:
 
 * `DEOPT_IF(cond, instruction)`. Deoptimize if `cond` is met.
-* `ERROR_IF(cond, label)`. Jump to error handler if `cond` is true.
+* `ERROR_IF(cond, label)`. Jump to error handler at `label` if `cond` is true.
 * `DECREF_INPUTS()`. Generate `Py_DECREF()` calls for the input stack effects.
 
+Note that the use of `DECREF_INPUTS()` is optional -- manual calls
+to `Py_DECREF()` or other approaches are also acceptable
+(e.g. calling an API that "steals" a reference).
+
 Variables can either be defined in the input, output, or in the C code.
 Variables defined in the input may not be assigned in the C code.
 If an `ERROR_IF` occurs, all values will be removed from the stack;
@@ -187,17 +192,39 @@ These requirements result in the following constraints on the use of
    intermediate results.)
 3. No `DEOPT_IF` may follow an `ERROR_IF` in the same block.
 
-Semantics
----------
+(There is some wiggle room: these rules apply to dynamic code paths,
+not to static occurrences in the source code.)
+
+If code detects an error condition before the first `DECREF` of an input,
+two idioms are valid:
+
+- Use `goto error`.
+- Use a block containing the appropriate `DECREF` calls ending in
+  `ERROR_IF(true, error)`.
+
+An example of the latter would be:
+```cc
+    res = PyObject_Add(left, right);
+    if (res == NULL) {
+        DECREF_INPUTS();
+        ERROR_IF(true, error);
+    }
+```
+
+### Semantics
 
 The underlying execution model is a stack machine.
 Operations pop values from the stack, and push values to the stack.
 They also can look at, and consume, values from the instruction stream.
 
-All members of a family must have the same stack and instruction stream effect.
+All members of a family
+(which represents a specializable instruction and its specializations)
+must have the same stack and instruction stream effect.
+
+The same is true for all members of a pseudo instruction
+(which is mapped by the bytecode compiler to one of its members).
 
-Examples
---------
+## Examples
 
 (Another source of examples can be found in the [tests](test_generator.py).)
 
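
Reviewer note on the `DECREF_INPUTS(); ERROR_IF(true, error);` idiom added in this hunk: the point is that the inputs are released exactly once on both the success path and the error path. The toy simulation below illustrates that control flow outside the interpreter. Everything in it is a stand-in and an assumption: `ToyObject`, `TOY_DECREF`, `toy_add`, and `toy_binary_add` are hypothetical, and the two macros are hand-written here, whereas the real generator expands them from the declared stack effect. (As far as I know, the C API function for addition is `PyNumber_Add`; the `PyObject_Add` in the snippet above reads as illustrative.)

```C
#include <stdbool.h>
#include <stddef.h>

/* Toy object with a reference count; NOT CPython's PyObject. */
typedef struct {
    int refcnt;
    int value;
} ToyObject;

#define TOY_DECREF(op) ((op)->refcnt--)

/* Hand-written stand-ins for the DSL macros (assumption): the real
   generator derives DECREF_INPUTS() from the input stack effect. */
#define DECREF_INPUTS() do { TOY_DECREF(left); TOY_DECREF(right); } while (0)
#define ERROR_IF(cond, label) do { if (cond) goto label; } while (0)

/* Hypothetical fallible operation: fails if either operand is negative. */
static ToyObject *toy_add(ToyObject *a, ToyObject *b, ToyObject *out) {
    if (a->value < 0 || b->value < 0) {
        return NULL;
    }
    out->refcnt = 1;
    out->value = a->value + b->value;
    return out;
}

/* Returns 0 on success and -1 on error; the inputs are released
   exactly once on either path. */
static int toy_binary_add(ToyObject *left, ToyObject *right,
                          ToyObject *slot, ToyObject **result) {
    ToyObject *res = toy_add(left, right, slot);
    if (res == NULL) {
        DECREF_INPUTS();
        ERROR_IF(true, error);
    }
    DECREF_INPUTS();
    *result = res;
    return 0;
error:
    *result = NULL;
    return -1;
}
```

The error block releases the inputs before jumping, so the normal-path `DECREF_INPUTS()` is never reached after a failure and no input is released twice.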
@@ -237,27 +264,6 @@ This would generate:
     }
 ```
 
-### Super-instruction definition
-
-```C
-    super ( LOAD_FAST__LOAD_FAST ) = LOAD_FAST + LOAD_FAST ;
-```
-This might get translated into the following:
-```C
-    TARGET(LOAD_FAST__LOAD_FAST) {
-        PyObject *value;
-        value = frame->f_localsplus[oparg];
-        Py_INCREF(value);
-        PUSH(value);
-        NEXTOPARG();
-        next_instr++;
-        value = frame->f_localsplus[oparg];
-        Py_INCREF(value);
-        PUSH(value);
-        DISPATCH();
-    }
-```
-
 ### Input stack effect and cache effect
 ```C
     op ( CHECK_OBJECT_TYPE, (owner, type_version/2 -- owner) ) {
@@ -339,14 +345,26 @@ For explanations see "Generating the interpreter" below.)
     }
 ```
 
-### Define an instruction family
-These opcodes all share the same instruction format):
+### Defining an instruction family
+
+A _family_ represents a specializable instruction and its specializations.
+
+Example: These opcodes all share the same instruction format):
+```C
+    family(load_attr) = { LOAD_ATTR, LOAD_ATTR_INSTANCE_VALUE, LOAD_SLOT };
+```
+
+### Defining a pseudo instruction
+
+A _pseudo instruction_ is used by the bytecode compiler to represent a set of possible concrete instructions.
+
+Example: `JUMP` may expand to `JUMP_FORWARD` or `JUMP_BACKWARD`:
 ```C
-    family(load_attr) = { LOAD_ATTR, LOAD_ATTR_INSTANCE_VALUE, LOAD_SLOT } ;
+    pseudo(JUMP) = { JUMP_FORWARD, JUMP_BACKWARD };
 ```
 
-Generating the interpreter
-==========================
+
+## Generating the interpreter
 
 The generated C code for a single instruction includes a preamble and dispatch at the end
 which can be easily inserted. What is more complex is ensuring the correct stack effects
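
Reviewer note on the new pseudo-instruction section in this hunk: the added text says `JUMP` is mapped to one of its members but does not say how the member is chosen. The sketch below assumes the choice depends only on jump direction. That selection rule, the `ConcreteJump` enum, and `resolve_jump` are all hypothetical and are not taken from the compiler.

```C
/* Hypothetical tags for the two members of pseudo(JUMP); real opcode
   numbers are assigned elsewhere in CPython. */
typedef enum {
    JUMP_FORWARD,
    JUMP_BACKWARD
} ConcreteJump;

/* Assumed rule: a target located after the jump resolves to the
   forward member, anything else to the backward member. */
static ConcreteJump resolve_jump(int from_offset, int target_offset) {
    return target_offset > from_offset ? JUMP_FORWARD : JUMP_BACKWARD;
}

/* Printable name, handy when inspecting the chosen member. */
static const char *jump_name(ConcreteJump j) {
    return j == JUMP_FORWARD ? "JUMP_FORWARD" : "JUMP_BACKWARD";
}
```

Because both members must share the same stack and instruction stream effect, swapping one for the other in this way would leave the surrounding stack layout unchanged.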
@@ -401,9 +419,7 @@ rather than popping and pushing, such that `LOAD_ATTR_SLOT` would look something
     }
 ```
 
-Other tools
-===========
+## Other tools
 
 From the instruction definitions we can generate the stack marking code used in `frame.set_lineno()`,
 and the tables for use by disassemblers.
-