JS: QL-side type/name resolution for TypeScript and JSDoc #19078
…nt declarations This test enforced the opinion that ambient declarations should have no impact on data flow, which is no longer the case. For now I'm just updating the test output.
Extraordinary... 👏
Co-authored-by: Napalys Klicius <napalys@github.com>
I still haven't looked closely at the three big commits, but here are some small comments.
)
or
exists(ImportTypeExpr imprt |
  node1 = imprt.getPathExpr() and // TODO: ImportTypeExpr does not seem to be resolved to a Module
Are you planning to leave this TODO?
Not at this time. Since its path string is not an Expr it requires a bit more refactoring.
javascript/ql/lib/semmle/javascript/internal/NameResolution.qll
predicate step(Node node1, Node node2) {
  commonStep(node1, node2)
  or
  specificStep(node1, node2)
}
I feel like you made commonStep and specificStep at a point where the code looked different.
Because both of those predicates are only ever used here, commonStep is not reused anywhere else.
Additionally the specificStep predicate has a QLDoc that mentions it being part of a configuration.
But it's not part of a configuration, it's part of a parameterized module that can be instantiated with a config.
I just thought it would add clarity to separate them into commonStep and specificStep. It also helps make clear that commonStep can be materialised once and used twice (once per instantiation of the module).
> commonStep is not reused anywhere else.
It is used in two different instantiations of the module. We could inline it in the module, but then again, putting it inside a parameterised module where it does not depend on any of the module parameters is also confusing in its own way.
> Additionally the specificStep predicate has a QLDoc that mentions it being part of a configuration.
I thought it would be clear that we're talking about the configuration that was passed to the module?
Ahh, now I see. Yeah, I was wrong.
The code for specificStep is constant, but it depends on predicates defined by the module-parameter.
And you can't move that code to commonStep, because materializing all of that would blow up (because specificStep follows store-read pairs, but commonStep does not).
Is that about right?
Because if it doesn't blow up without the specialization used by specificStep, couldn't you then just compute all possible steps outside this module?
> The code for specificStep is constant, but it depends on predicates defined by the module-parameter.
I'm not sure what this means? It depends on the module parameter, both directly via S::isRelevantVariable and indirectly via its dependency on getModuleExports, which also depends on the module parameter.
> And you can't move that code to commonStep, because materializing all of that would blow up
It's not meant as a performance thing at all; it's about separating types from values.
This example illustrates the need to keep them as two different graphs, because both graphs need to step through X without forgetting whether a value or a type is being tracked:
// lib.ts
class C1 {}
class C2 {}
const X = C1;
type X = C2;
export { X }
// use.ts
import { X } from "./lib"
var x1 = X // should refer to C1
var x2: X; // should refer to C2
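To make the two-graph idea concrete, here is a minimal TypeScript sketch of the same situation. It is not the QL implementation from this PR: the string node names, the `Edge` type, and the `track` function are all hypothetical, and the only point is to show that the shared step through X gives the right answer when each declaration space adds its own steps, and the wrong one when the spaces are merged.

```typescript
// Hypothetical model: nodes are plain strings, edges are [from, to] pairs.
type Edge = [string, string];

// Steps shared by both graphs: the export/import plumbing does not care
// which declaration space is being tracked.
const commonSteps: Edge[] = [["lib.X(export)", "use.X(import)"]];

// Steps that exist only in the value space.
const valueSteps: Edge[] = [
  ["C1", "lib.X(export)"], // const X = C1
  ["use.X(import)", "x1"], // var x1 = X
];

// Steps that exist only in the type space.
const typeSteps: Edge[] = [
  ["C2", "lib.X(export)"], // type X = C2
  ["use.X(import)", "x2"], // var x2: X
];

// Returns every node reachable from `start` via common plus specific steps.
function track(start: string, specific: Edge[]): Set<string> {
  const edges = [...commonSteps, ...specific];
  const seen = new Set<string>([start]);
  const work = [start];
  while (work.length > 0) {
    const node = work.pop()!;
    for (const [from, to] of edges) {
      if (from === node && !seen.has(to)) {
        seen.add(to);
        work.push(to);
      }
    }
  }
  return seen;
}

const valueReach = track("C1", valueSteps); // reaches x1 but not x2
const typeReach = track("C2", typeSteps); // reaches x2 but not x1
// Merging both spaces into one graph loses the distinction at X.
const mergedReach = track("C1", [...valueSteps, ...typeSteps]);
```

In the merged graph, C1 reaches x2 as well, which is exactly the confusion the separate instantiations are meant to avoid.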
Oh, OK, now I think I get it.
Could you add a comment, e.g. on FlowImpl, that the module is specialized to work on only types or only on values, and that is needed to keep the separation you just explained.
commonStep(...) includes steps for both values and types, which might have caused some of my confusion.
Maybe add to the doc for that predicate that it contains type-specific steps, but that those when applied to values.
Added a comment to FlowImpl and expanded on the comment for commonStep.
Node trackFunctionType(Function fun) {
  result = fun
  or
  exists(Node mid | mid = trackFunctionType(fun) |
    TypeFlow::step(mid, result)
    or
    UnderlyingTypes::underlyingTypeStep(mid, result)
  )
}
It seems you're using a specialized predicate (instead of TypeFlow::TrackNode) because you additionally need the steps from UnderlyingTypes::underlyingTypeStep.
Why can't those steps be part of TypeFlow::TrackNode? Or why can't functions use TypeFlow::TrackNode?
Good question. I've renamed the predicate and added a clarifying qldoc comment in 18f9133
Co-authored-by: Erik Krogh Kristensen <erik-krogh@github.com>
Only a few minor comments from me. I think this looks really solid!
/**
 * Holds if values/namespaces/types in `node1` can flow to values/namespaces/types in `node2`.
 */
private predicate commonStep(Node node1, Node node2) {
Minor nit: Could we rename node1 and node2 to, say, source and target (or src and tgt)? It would make it slightly less likely that a typo messed things up.
I'd rather not. In fact I'd say we should standardise on node1,node2 everywhere we define edges in a graph.
This is the convention used in the data flow library, and it generally works out better when you add in things like state1 and state2.
private string normalizeModuleName(string name) {
  result =
    name.regexpReplaceAll("^node:", "")
        .regexpReplaceAll("\\.[jt]sx?$", "")
Should this also account for things like .cjs and .mjs?
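For reference, the two replacements in the snippet above can be mirrored in TypeScript to see what they do and do not strip. This is only a sketch: the first function follows the regexes shown in the diff, while `normalizeModuleNameWithCjsMjs` and its `[cm]?` pattern are a hypothetical variant for illustration, not something the PR contains.

```typescript
// Mirrors the two regex replacements from the QL snippet under review.
function normalizeModuleName(name: string): string {
  return name.replace(/^node:/, "").replace(/\.[jt]sx?$/, "");
}

// Hypothetical variant (not part of the PR): the optional [cm] prefix
// also covers .cjs, .mjs, .cts and .mts.
function normalizeModuleNameWithCjsMjs(name: string): string {
  return name.replace(/^node:/, "").replace(/\.[cm]?[jt]sx?$/, "");
}

const builtin = normalizeModuleName("node:fs");
const tsFile = normalizeModuleName("./lib.ts");
const mjsUnchanged = normalizeModuleName("./lib.mjs");
const mjsStripped = normalizeModuleNameWithCjsMjs("./lib.mjs");
```

With the original pattern, `./lib.mjs` keeps its extension, which is the gap the question points at. The looser pattern would also accept spellings such as `.cjsx` that are not real extensions, so whether that trade-off is acceptable is a judgment call.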
Prepares for disabling TypeScript type extraction by recovering the relevant type information on the QL side.
Actually disabling type extraction will happen in a separate PR so it's easier to validate in isolation and roll back if needed.
The code is divided into three main components:
(Request & { x: T }) | null has Request as an underlying type.
One of the complexities of TypeScript is the fact that there are three "declaration spaces" in which variables can exist: value, type, and namespace. For example, for a declaration like class C {}, the value C refers to the class itself (i.e. its constructor), whereas the type C describes an instance of the class (not the class itself). In practice, it seems that values and namespaces can be merged without problems so I decided to simplify things by doing that.
Effects on call graph
main would lose us 40k call edges on the default benchmark suite.
.js files. The new solution has unified support for the two kinds of type annotations, which was not possible previously since we didn't want to run the TypeScript compiler on .js files.
KnownClass | UnresolvedClass will always propagate information about KnownClass even if UnresolvedClass could not be resolved.
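The two type examples mentioned in the description can be illustrated with a toy model of "underlying types". This is a sketch under stated assumptions, not the PR's QL code: the `TypeNode` shape, the `knownClasses` set, and `underlyingClasses` are all hypothetical names, and the sketch assumes that the lookup simply sees through unions and intersections and skips anything it cannot resolve.

```typescript
// Hypothetical, simplified type AST; the QL representation in the PR differs.
type TypeNode =
  | { kind: "named"; name: string }
  | { kind: "union"; members: TypeNode[] }
  | { kind: "intersection"; members: TypeNode[] }
  | { kind: "other" }; // object literals, null, and anything else

// Names that resolve to a known class in this toy model (an assumption).
const knownClasses = new Set<string>(["Request", "KnownClass"]);

// Collects resolved class names by looking through unions and intersections.
// Members that cannot be resolved contribute nothing and do not block others.
function underlyingClasses(type: TypeNode): string[] {
  switch (type.kind) {
    case "named":
      return knownClasses.has(type.name) ? [type.name] : [];
    case "union":
    case "intersection": {
      const result: string[] = [];
      for (const member of type.members) {
        for (const name of underlyingClasses(member)) {
          if (result.indexOf(name) === -1) result.push(name);
        }
      }
      return result;
    }
    default:
      return [];
  }
}

// (Request & { x: T }) | null
const requestType: TypeNode = {
  kind: "union",
  members: [
    {
      kind: "intersection",
      members: [{ kind: "named", name: "Request" }, { kind: "other" }],
    },
    { kind: "other" },
  ],
};

// KnownClass | UnresolvedClass
const partlyResolved: TypeNode = {
  kind: "union",
  members: [
    { kind: "named", name: "KnownClass" },
    { kind: "named", name: "UnresolvedClass" },
  ],
};

const requestUnderlying = underlyingClasses(requestType);
const partlyUnderlying = underlyingClasses(partlyResolved);
```

In this model the first type yields only Request and the second yields only KnownClass, matching the behaviour the description claims for the new resolution.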