I'm thrilled to announce that [@ast-grep/napi] now supports typed AST nodes, resolving a [long-standing issue](https://github.com/ast-grep/ast-grep/issues/48) from our feature requests.

In this blog post, we will walk through the problem and the [design](https://github.com/ast-grep/ast-grep/issues/1669) of the new feature. It should also be a useful resource for writing good TypeScript types in general.

## What's type safety? Why?

Writing AST manipulation code is hard. Even with plenty of helpful interactive tools, it is still hard to handle all edge cases.

@@ -25,74 +30,201 @@ Designing a good library in the modern JavaScript world is not only about provid
* Performant: fast to compile. Complex types can slow down the compiler.

It is really hard to provide a type system that is both [Sound and Complete](https://logan.tw/posts/2014/11/12/soundness-and-completeness-of-the-type-system/#:~:text=A%20type%2Dsystem%20is%20sound,any%20false%20positive%20%5B2%5D.). Providing a good typed API is similarly hard.
TS libs nowadays probably pay too much attention to correctness, IMHO.
Having a type to check your path parameter in your routing is cool, but what's the cost?

Designing a good TypeScript type is essentially a trade-off among these four aspects.

## TreeSitter's types

Let's come back to ast-grep's problem. ast-grep is based on Tree-Sitter.

Tree-Sitter's official API is untyped. It provides a uniform API to access the syntax tree across different languages. A node in Tree-Sitter has several common methods to access its node type, children, parent, and text content.

```typescript
declare class Node {
  kind(): string // get the node type
  field(name: string): Node // get a child node by field name
  parent(): Node
  children(): Node[]
  text(): string
}
```
The API is simple and easy to use, but it lacks type information.
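
To make the cost of the missing type information concrete, here is a minimal sketch. `UntypedNode` is a hypothetical stand-in written for this post, not the real Tree-Sitter binding; it only shows that a misspelled field name compiles and fails silently at runtime.

```typescript
// Hypothetical stand-in for an untyped node; not the real Tree-Sitter binding.
class UntypedNode {
  private nodeKind: string
  private nodeFields: Record<string, UntypedNode>

  constructor(nodeKind: string, nodeFields: Record<string, UntypedNode> = {}) {
    this.nodeKind = nodeKind
    this.nodeFields = nodeFields
  }

  kind(): string {
    return this.nodeKind
  }

  // Any string is accepted, so a typo is only discovered at runtime.
  field(fieldName: string): UntypedNode | null {
    return this.nodeFields[fieldName] ?? null
  }
}

const fnBody = new UntypedNode('statement_block')
const fnDecl = new UntypedNode('function_declaration', { body: fnBody })

const okField = fnDecl.field('body') // finds the child node
const typoField = fnDecl.field('bdoy') // compiles fine, but is null at runtime
```
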

In contrast, a specific language's syntax tree has a specific structure. For example, a function declaration in JavaScript has a `function` keyword, a name, a list of parameters, and a body. Other AST parser libraries encode this structure in their AST object types. For example, a `function_declaration` has fields like `parameters` and `body`.

Fortunately, tree-sitter provides static node types in JSON.
There are several challenges in generating TypeScript types from tree-sitter's static node types.

1. The JSON is hosted in each parser library's repo
   We need type generation (it is like F#'s type providers).
2. The JSON contains a lot of unnamed kinds
   You are writing a compiler plugin, not elementary school math homework.
3. The JSON has alias types
   For example, `declaration` is an alias of `function_declaration`, `class_declaration` and other declaration kinds.

## Design Type

The design principle of the new API is to progressively provide stricter type checking and better completion as the user gives more type information.

1. **Allow untyped AST access if no type information is provided**

   The existing untyped API is still available, and it remains the default behavior.
   The new feature should not break existing code.

2. **Allow users to type AST nodes and enjoy more type safety**

   Users can give types to AST nodes either manually or automatically.
   Both approaches should refine general untyped AST nodes into typed AST nodes and bring type checking and intelligent completion to the user.

## Define Type

## TreeSitter's `TypeMap`
The new typed API will consume TreeSitter's [static node types](https://tree-sitter.github.io/tree-sitter/using-parsers#static-node-types) like below:

```typescript
interface TypeMap {
  [kind: string]: {
    type: string
    named: boolean
    fields?: {
      [field: string]: {
        types: { type: string, named: boolean }[]
      }
    }
    subtypes?: { type: string, named: boolean }[]
  }
}
```
What is `TypeMap`? It is a type that contains all static node types: a map from each kind to the static type of that kind.

```typescript
type TypeScriptMap = {
  // AST node type definition
  function_declaration: {
    type: "function_declaration", // kind
    named: true, // is named
    fields: {
      body: {
        types: [ { type: "statement_block", named: true } ]
      },
      ...
    }
  },
  // node type alias
  declaration: {
    type: "declaration",
    subtypes: [
      { type: "class_declaration", named: true },
      { type: "function_declaration", named: true },
    ]
  },
  ...
}
```

The type information is encoded in a JSON object. A syntax node's static type contains the kind, whether it is named, and the fields of the node.
`fields` is a map from field name to the type of the field, which encodes the structure of the AST like other parser libraries do.

Tree-sitter also provides alias types, where a kind is an alias of a list of other kinds. For example, `declaration` is an alias of `function_declaration`, `class_declaration` and other kinds. Alias types are used to reduce the number of kinds in the static types.

We want to type both a node's kind and its fields.

## `SgNode<M, K>`

`SgNode<M, K>` is the main type in the new API. It is a generic type that represents a node of kind `K` in the type map `M`. `K` can also be a union of kinds, in which case the type stands for a node of any of those kinds.

```typescript
class SgNode<M extends TypeMap, K extends keyof M> {
  kind: K
  fields: M[K]['fields'] // for simplicity
}
```

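As a rough illustration of how a kind parameter ties a node to its entry in the map, here is a minimal sketch. `DemoMap` and `DemoSgNode` are hypothetical, hand-written names for this post; the real map is generated and the real `SgNode` has a much richer API.

```typescript
// A tiny hand-written type map; real maps are generated from node-types.json.
type DemoMap = {
  function_declaration: {
    type: 'function_declaration'
    named: true
    fields: { body: { types: [{ type: 'statement_block'; named: true }] } }
  }
  statement_block: { type: 'statement_block'; named: true; fields: {} }
}

// Simplified typed node: the kind is a literal type,
// and the allowed field names are looked up in the map.
interface DemoSgNode<M, K extends keyof M> {
  kind: K
  fieldNames: M[K] extends { fields: infer F } ? (keyof F)[] : never[]
}

const demoFn: DemoSgNode<DemoMap, 'function_declaration'> = {
  kind: 'function_declaration',
  fieldNames: ['body'], // 'bdoy' would be rejected by the compiler
}
```
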
### `ResolveType<M, T>`

TreeSitter's type aliases help reduce the size of the generated JSON file, but they are not useful to users, because an alias is never used directly as a node's kind, nor as `kind` in an ast-grep rule.

We need to use a TypeScript type alias to resolve the alias type to its concrete types.
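
One way such a resolution could be written is sketched below. `DemoAliasMap` and `DemoResolve` are hypothetical names, and this is only an illustration of the idea using a recursive conditional type, not necessarily how ast-grep defines `ResolveType`.

```typescript
type DemoAliasMap = {
  declaration: {
    type: 'declaration'
    named: true
    subtypes: [
      { type: 'class_declaration'; named: true },
      { type: 'function_declaration'; named: true },
    ]
  }
  class_declaration: { type: 'class_declaration'; named: true }
  function_declaration: { type: 'function_declaration'; named: true }
}

// If a kind has `subtypes`, recursively replace it with the kinds it stands for.
// Otherwise the kind is already concrete and is returned unchanged.
type DemoResolve<M, K> = K extends keyof M
  ? M[K] extends { subtypes: { type: infer T }[] }
    ? DemoResolve<M, T>
    : K
  : K

// Resolves to 'class_declaration' | 'function_declaration'.
type ConcreteDeclaration = DemoResolve<DemoAliasMap, 'declaration'>

// Only the concrete kinds are assignable; 'declaration' itself would not compile.
const concreteKinds: ConcreteDeclaration[] = ['class_declaration', 'function_declaration']
```
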
1. string literal completion with `LowPriorityString`

Having a collection of possible AST node kinds is awesome, but a big string literal union type is sometimes too clumsy to use.
Also, TreeSitter's static types contain a lot of unnamed kinds, which are not useful to users. Including them in the union type is too noisy. We need to let users opt in to specific kinds and fall back to a plain `string` type otherwise.

```typescript
type Kinds<M> = keyof M | LowPriorityString
type LowPriorityString = string & {}
```

The above type is a lenient string type that is compatible with any string. But it also uses a well-known trick that takes advantage of TypeScript's type priority, so that completion prefers the `keyof M` members over the `string & {}` type. To make it more self-explanatory, the `string & {}` type is aliased to `LowPriorityString`.

Problem? Open-ended unions are not well supported in TypeScript.
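
Here is a minimal sketch of the lenient string trick in use, assuming the lenient type is the union of the known keys and `string & {}`. All names below (`DemoKinds`, `DemoKindMap`, `describeKind`) are hypothetical and exist only for this illustration.

```typescript
type DemoLowPriorityString = string & {}
type DemoKinds<M> = keyof M | DemoLowPriorityString

type DemoKindMap = { function_declaration: unknown; class_declaration: unknown }

// Hypothetical helper: accepts known kinds (offered by editor completion)
// as well as any other string.
function describeKind(kind: DemoKinds<DemoKindMap>): string {
  return `kind: ${String(kind)}`
}

const knownKind = describeKind('function_declaration') // suggested by the editor
const otherKind = describeKind('+') // still accepted, falls back to string
```

Had the fallback been written as a plain `string`, TypeScript would collapse the whole union into `string` and the editor would no longer suggest the known kinds.
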
K extends keyof M ? SgNode<M, K> : never // this conditional type unpacks the string union into a Node union
```
It is like biome / rowan's API, where you can refine a node to a specific kind.

## Refine Type

Now let's talk about how to refine a general node to a specific node in ast-grep/napi.

### Refine Node, Manually

Most AST traversal methods in ast-grep can now take a new type parameter to refine the node to a specific kind.

This is like the `document.querySelector<T>` method in the DOM API. It returns a general `Element` type, but you can refine it to a specific type like `HTMLDivElement` by providing a generic argument.

For example, `sgNode.find<"KIND">()`.

This uses TypeScript's interesting overloading feature.

```typescript
interface NodeMethod<K> {
@@ -101,14 +233,37 @@ interface NodeMethod<K> {
}
```
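
The overloading idea can be sketched as follows. `DemoFind`, `DemoNode` and `DemoNodeMap` are hypothetical names, and the runtime body is a stub that simply echoes its argument as the kind; this is an illustration of the technique, not the real `NodeMethod` definition.

```typescript
type DemoNodeMap = { function_declaration: unknown; identifier: unknown }

interface DemoNode<K extends string = string> {
  kind: K
}

interface DemoFind {
  // No type argument: the caller gets a general node back.
  (kindName: string): DemoNode | null
  // Explicit type argument: the result is refined to that kind.
  <K extends keyof DemoNodeMap>(kindName: string): DemoNode<K> | null
}

// Stub implementation for the sketch: it echoes the argument as the node kind.
const demoFind = ((kindName: string) => ({ kind: kindName })) as unknown as DemoFind

const generalNode = demoFind('identifier') // DemoNode<string> | null
const refinedNode = demoFind<'function_declaration'>('function_declaration') // DemoNode<'function_declaration'> | null
```

As with `document.querySelector<T>`, the type argument is trusted rather than verified, so it is the caller's job to pass a kind that matches what is actually searched for.
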

If no type is provided, it returns a general node. If a type is provided, it returns a specific node.

Another way to refine a node is a runtime check via `sgNode.is("kind")`, which narrows the type once.

```typescript
if (sgNode.is("function_declaration")) {
  sgNode.kind // narrow to 'function_declaration'
}
```

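A runtime check like this can narrow the static type through a type predicate. The sketch below uses a hypothetical `DemoGuardNode` class to show the mechanism; it is an assumption about the general technique, not a copy of ast-grep's implementation.

```typescript
type DemoKind = 'function_declaration' | 'class_declaration'

class DemoGuardNode<K extends string = string> {
  kind: K

  constructor(kind: K) {
    this.kind = kind
  }

  // A type predicate on `this` lets a runtime check narrow the static type.
  is<T extends DemoKind>(kind: T): this is DemoGuardNode<T> {
    return (this.kind as string) === kind
  }
}

const unknownNode: DemoGuardNode = new DemoGuardNode<string>('function_declaration')

let narrowedKind: 'function_declaration' | null = null
if (unknownNode.is('function_declaration')) {
  // Inside this branch, `unknownNode.kind` has the literal type 'function_declaration'.
  narrowedKind = unknownNode.kind
}
```
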
### Refine Node, Automatically

The key feature of the new API is to automatically refine the node to a specific kind when the user gives more type information.

This is done by using the `field` method.

`sgNode.field("kind")` will automatically check the field name and its corresponding types in the static type, and refine the node to the specific kind.

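A minimal sketch of field-based refinement is shown below. It uses a hypothetical free function `demoField` and a hand-written `DemoFieldMap` instead of the real `field` method, and only illustrates how a field name can be checked against the map and the child kind looked up from it.

```typescript
type DemoFieldMap = {
  function_declaration: {
    fields: {
      name: { types: [{ type: 'identifier' }] }
      body: { types: [{ type: 'statement_block' }] }
    }
  }
  identifier: { fields: {} }
  statement_block: { fields: {} }
}

// Look up the kinds a field may hold, straight from the map.
type FieldKindOf<
  K extends keyof DemoFieldMap,
  F extends keyof DemoFieldMap[K]['fields'],
> = DemoFieldMap[K]['fields'][F] extends { types: { type: infer T }[] } ? T : never

interface DemoTypedNode<K> {
  kind: K
  children: Record<string, DemoTypedNode<string>>
}

// Only field names declared for kind K are accepted,
// and the returned node carries the kind recorded for that field.
function demoField<
  K extends keyof DemoFieldMap,
  F extends keyof DemoFieldMap[K]['fields'] & string,
>(node: DemoTypedNode<K>, fieldName: F): DemoTypedNode<FieldKindOf<K, F>> {
  return node.children[fieldName] as unknown as DemoTypedNode<FieldKindOf<K, F>>
}

const blockNode: DemoTypedNode<'statement_block'> = { kind: 'statement_block', children: {} }
const fnNode: DemoTypedNode<'function_declaration'> = {
  kind: 'function_declaration',
  children: { body: blockNode },
}

const bodyNode = demoField(fnNode, 'body') // DemoTypedNode<'statement_block'>
const bodyKind: 'statement_block' = bodyNode.kind
// demoField(fnNode, 'bdoy') would not compile: no such field on function_declaration
```
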
### Exhaustive Checking via `sgNode.kindToRefine`

Why do we need the `kindToRefine` property given that we already have a `kind()` method?

TypeScript cannot narrow a union by comparing the result of a method call such as `kind()`. Discriminated-union narrowing only works through property access.

Also, `kindToRefine` is a getter powered by napi under the hood. It is less efficient than a plain JavaScript object property access.
In fact, it calls a Rust function from JavaScript, which is as expensive as calling the `kind()` method.

To make users aware of this performance implication and to keep the API change backward compatible, we introduce the `kindToRefine` property.

It is mostly useful for a union type of nodes with specific kinds.
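
The following sketch shows why a property is needed for exhaustive checking. The node interfaces are hypothetical and heavily simplified; only the names `kindToRefine` and `kind()` come from the text above.

```typescript
interface DemoFnNode {
  kindToRefine: 'function_declaration'
  kind(): 'function_declaration'
  body: string
}

interface DemoClassNode {
  kindToRefine: 'class_declaration'
  kind(): 'class_declaration'
  heritage: string
}

type DemoDeclaration = DemoFnNode | DemoClassNode

function summarize(node: DemoDeclaration): string {
  // Comparing `node.kind()` would not narrow `node`; switching on a property does.
  switch (node.kindToRefine) {
    case 'function_declaration':
      return `function with body ${node.body}`
    case 'class_declaration':
      return `class extending ${node.heritage}`
    default: {
      // Exhaustiveness check: this assignment stops compiling if a kind is unhandled.
      const unreachable: never = node
      return unreachable
    }
  }
}

const sampleFn: DemoFnNode = {
  kindToRefine: 'function_declaration',
  kind: () => 'function_declaration',
  body: '{}',
}
const sampleClass: DemoClassNode = {
  kindToRefine: 'class_declaration',
  kind: () => 'class_declaration',
  heritage: 'Base',
}
```
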
For example, `+`/`-`/`*`/`/` is too noisy for a general AST library.

This is also the reason why we need to include `string` in `Kinds`.

### Opt-in refinement for better compile time performance

The new API is designed to provide a better type checking and completion experience. But it comes at a cost in performance: the more type information the user provides, the slower the compile time.

```typescript
import { parse, Lang } from '@ast-grep/napi'
import TS from '@ast-grep/napi/lang/TypeScript'

const untyped = parse(Lang.TypeScript, code) // untyped by default
const typed = parse<TS>(Lang.TypeScript, code) // opt in to typed nodes
```

### Typed Rule!

The last feature worth mentioning is the typed rule! You can even type the `kind` in rule JSON!

```typescript
sgNode.find({
  rule: {
@@ -138,15 +313,11 @@ sgNode.find({
})
```

## Ending

I'm thrilled to see the future of AST manipulation in TypeScript.
This feature enables users to switch freely between untyped AST and typed AST.

https://x.com/hd_nvim/status/1868453729940500924
There are very few devs that understands Rust deeply enough and compiler deeply enough that also care about TypeScript in web dev enough to build something for web devs in Rust