Fix!: mark vars referenced in metadata macros as metadata #4936
@izeigerman thank you for the review, I addressed all comments. Planning to merge once CI's green.
@izeigerman since my last comment in this PR, I discovered a bug and refactored my approach to patch it. Can you please take another look when you find some bandwidth?

The bug

Given the following macros:

    from sqlmesh import macro

    @macro()
    def macro1(evaluator):
        return evaluator.var("foo")

    @macro(metadata_only=True)
    def macro2(evaluator, var_value):
        return 1

and this model:

    MODEL (
      name test_model,
      kind FULL,
    );

    SELECT
      @macro1() AS col,
      @macro2(@foo) AS col2

I verified that

    >>> from sqlmesh import Context
    >>> ctx = Context()
    >>> ctx.models['"test_model"'].python_env
    {'macro1': Executable<payload: def macro1(evaluator):
        return evaluator.var('foo'), name: macro1, path: macros/test.py>, 'macro2': Executable<payload: def macro2(evaluator, var_value):
        return 1, name: macro2, path: macros/test.py, is_metadata: True>, '__sqlmesh__vars__metadata__': Executable<payload: {'foo': "'id'"}, kind: value, is_metadata: True>}

Root cause

The problematic logic was located here. The value of This meant that

Fix

I addressed the above issue in this commit. Below is my analysis that led to this refactor:

Terminology

Observations
... and yet another bug squashed: I realized that only the top-level macro func call matters when it comes to marking variable references under it as metadata or not. Added a test that demonstrates this.
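To make the "only the top-level macro func call matters" rule concrete, here is a minimal sketch. It is not SQLMesh's implementation: `MacroCall`, `collect_var_occurrences`, and the `metadata_macros` set are hypothetical names used only for illustration. The idea is that every variable reference found anywhere under a call inherits the metadata status of the outermost macro.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class MacroCall:
    """Hypothetical stand-in for a parsed macro function call."""

    name: str
    # Variables referenced directly in this call's arguments, e.g. @foo.
    var_refs: List[str] = field(default_factory=list)
    # Macro calls nested inside this call's arguments.
    nested: List["MacroCall"] = field(default_factory=list)


def collect_var_occurrences(
    call: MacroCall,
    metadata_macros: Set[str],
    top_level_is_metadata: Optional[bool] = None,
) -> List[Tuple[str, bool]]:
    """Return (variable, is_metadata_occurrence) pairs for a call tree."""
    if top_level_is_metadata is None:
        # Only the outermost call decides; nested calls inherit its status.
        top_level_is_metadata = call.name in metadata_macros

    occurrences = [(var, top_level_is_metadata) for var in call.var_refs]
    for inner in call.nested:
        occurrences.extend(
            collect_var_occurrences(inner, metadata_macros, top_level_is_metadata)
        )
    return occurrences


# @macro2(@macro1(@foo)), where only macro2 is metadata-only (assumed setup).
metadata_call = MacroCall("macro2", nested=[MacroCall("macro1", var_refs=["foo"])])
# @macro1(@macro2(@foo)): the outermost call is not metadata-only.
regular_call = MacroCall("macro1", nested=[MacroCall("macro2", var_refs=["foo"])])
```

Under these assumptions, `foo` counts as a metadata occurrence in the first tree and as a non-metadata occurrence in the second, regardless of the status of the inner macro.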
Macro variable references are always treated as non-metadata today. This means that if, for example, a variable is referenced within a metadata-only macro, changing its value will result in a breaking change, which is inconsistent.
This PR alters this behavior, similar to the macro metadata-only status propagation:

`audits` property) can be treated as metadata-only

I intentionally say "can" instead of "will" above, because we need to factor in all references of a variable to decide whether it's a metadata-only reference. The rules implemented here are similar to those we apply for macros: a non-metadata occurrence overrules all metadata occurrences.
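The "a non-metadata occurrence overrules all metadata occurrences" rule can be illustrated with a small fold over variable occurrences. This is a sketch under assumptions, not the PR's actual code: `resolve_metadata_status` and the `(name, is_metadata)` pair format are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def resolve_metadata_status(
    occurrences: Iterable[Tuple[str, bool]],
) -> Dict[str, bool]:
    """Fold (variable, is_metadata_occurrence) pairs into one flag per variable.

    A variable is metadata-only only if every one of its occurrences is.
    """
    status: Dict[str, bool] = {}
    for name, is_metadata in occurrences:
        # One non-metadata occurrence flips the flag to False for good.
        status[name] = status.get(name, True) and is_metadata
    return status


# Mirrors the model from the discussion: foo is read by a regular macro and
# is also passed to a metadata-only macro; bar is a made-up second variable.
example = resolve_metadata_status([("foo", False), ("foo", True), ("bar", True)])
```

Here `example["foo"]` is `False`, because the regular reference wins, while `example["bar"]` stays `True`, since all of its occurrences are metadata ones.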
Additionally, this PR introduces trimming for blueprint variables. Certain blueprint variables, e.g. those used in model names, aren't required after loading, while others are because they may be referenced in the model's statements or in "runtime-rendered" properties (e.g., `merge_filter`).

The former category can be omitted from the model's `python_env`, thus reducing its snapshot's size, as long as a variable is only referenced in the meta block and in fields that are static after loading the model.

Both of these changes are quite breaking, so I'm planning to implement a migration script to at least warn about this. I'm also planning to increase the testing coverage. (EDIT: done)
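As a rough illustration of the blueprint-variable trimming described above, the sketch below keeps only the variables that are still referenced after loading. The function name, its arguments, and the sample values are hypothetical; how SQLMesh actually determines which variables are referenced at runtime is not shown here.

```python
from typing import Any, Dict, Set


def trim_blueprint_variables(
    blueprint_variables: Dict[str, Any],
    referenced_after_loading: Set[str],
) -> Dict[str, Any]:
    """Drop blueprint variables that are only needed while loading the model."""
    return {
        name: value
        for name, value in blueprint_variables.items()
        if name in referenced_after_loading
    }


# Assumed example: "customer" only appears in the model name (static after
# loading), while "lookback" is referenced in a runtime-rendered property.
blueprint = {"customer": "acme", "lookback": 7}
trimmed = trim_blueprint_variables(blueprint, referenced_after_loading={"lookback"})
```

With these inputs only `lookback` would remain in the environment stored with the snapshot, which is the size reduction the description refers to.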