OJSNAD Doc
Chapter 1: Overview
Chapter Overview
In Node.js the module is a unit of code. Code should be divided up into modules and then composed
together in other modules. Packages expose modules, modules expose functionality. But in Node.js a file
can be a module as well, so libraries are also modules. In this chapter we'll learn how to create and load
modules. We'll also be taking a cursory look at the difference between language-native EcmaScript
Modules (ESM) and the CommonJS (CJS) module system that Node used (and still uses) prior to the
introduction of the EcmaScript Module system into JavaScript itself.
Chapter 2: Setting Up
Node can also be installed directly from the Node.js website. Again, on
macOS and Linux it necessitates the use of sudo for installing global libraries.
Whether Windows, macOS or Linux, in the following sections we'll present a
better way to install Node using a version manager.
It's strongly recommended that if Node is installed via an Operating System
Package Manager or directly via the website, that it be completely
uninstalled before proceeding to the following sections.
The current nvm version is v0.39.5 (as of November 2023), so the install
process will contain this version in the URL, if a greater version is out at time
of reading, replace v0.39.5 with the current nvm version. For this installation
process we assume that Bash, Sh, or Zsh is the shell being used, Fish is not
supported but see the nvm README for alternatives.
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.5/install.sh | bash
If using zsh (e.g., on newer macOS releases) the bash part of the command
can be replaced with zsh.
Alternatively, the file can be downloaded and saved, and then easily
executed like so:
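For example, assuming the install script was saved as install.sh in the current
directory, it could be piped to bash:
cat install.sh | bash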
Again bash can be replaced with zsh. To check that the installation was
successful execute the following in the terminal:
command -v nvm
It should output nvm. If this fails on Linux, close and reopen the terminal (or
SSH session) and try running the command again. On macOS see GitHub for
in-depth troubleshooting instructions.
Now that we have a version manager, let's install the Node version we'll be
using in this course:
nvm install 20
In this case, the command installed Node v20.9.0. It doesn't matter if the
right-most numbers are higher for this course, as long as the major number
(the first number) is 20.
We can verify that Node is installed, and which version, with the following
command:
node -v
We now have the right setup on our macOS or Linux machine to proceed
with the course.
On Windows, we'll use the nvs version manager. If a later release than v1.7.0
is available, download the MSI for that release. Once downloaded, run the
installer and follow the steps to install. After it's installed, open a cmd.exe
or powershell prompt and run the following to install the latest version 20
release:
nvs add 20
Then execute the following to select the newly installed Node version:
nvs use 20
Use node -v to confirm the installed version. In this case, the command
installed Node v20.9.0. It doesn't matter if the right-most numbers are higher
for this course, as long as the major number (the first number) is 20. If these
steps have been completed, congratulations! You now have the right setup
on your Windows machine to proceed with the course.
Chapter 3: Node Binary
Chapter Overview
The Node.js platform is almost entirely represented by the node binary
executable. In order to execute a JavaScript program we use: node app.js,
where app.js is the program we wish to run. However, before we start
running programs, let’s explore some of the command line flags offered by
the Node binary.
Beyond the Node command line flags there are additional flags for modifying
the JavaScript runtime engine: V8. To view these flags run node --v8-options.
Checking Syntax
It’s possible to parse a JavaScript application without running it in order to
just check the syntax.
node -c app.js
If the code parses successfully, there will be no output. If the code does not
parse and there is a syntax error, the error will be printed to the terminal.
Dynamic Evaluation
Node can directly evaluate code from the shell. This is useful for quickly
checking a code snippet or for creating very small cross-platform commands
that use JavaScript and Node core API’s.
There are two flags that can evaluate code. The -p or --print flag evaluates
an expression and prints the result, the -e or --eval flag evaluates without
printing the result of the expression.
The following will not print anything because the expression is evaluated but
not printed:
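node -e "1+1"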
The following will print 2 because console.log is used to explicitly write the
result of 1+1 to the terminal:
node -e "console.log(1+1)"
When used with the print flag, the same invocation will print 2 and then
print undefined because console.log returns undefined; so the result of the
expression is undefined:
node -p "console.log(1+1)"
Usually a module would be required, like so: require('fs'), however all
Node core modules can be accessed by their namespaces within the code
evaluation context.
For example, the following would print all the files with a .js extension in the
current working directory in which the command is run:
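One way to express this (a minimal sketch using the fs namespace available in
the evaluation context):
node -p "fs.readdirSync('.').filter((f) => /\.js$/.test(f))"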
In Chapter 7, we'll be covering the two module systems that Node uses,
CommonJS and ESM, but it's important to note here that the --require flag
can only preload a CommonJS module, not an ESM module. ESM modules
have a vaguely related flag, called --loader, a currently experimental flag
which should not be confused with the --require preloader flag. For more
information on the --loader flag see Node.js documentation.
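As a brief illustration of preloading, a CommonJS module can be loaded before
the main program with the -r (--require) flag. The file names preload.js and
app.js here are hypothetical:
node -r ./preload.js app.js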
The stack trace limit can be modified with the --stack-trace-limit flag.
This flag is part of the JavaScript runtime engine, V8, and can be found in the
output of the --v8-options flag.
Consider a program named app.js containing the following code:
function f (n = 99) {
if (n === 0) throw Error()
f(n - 1)
}
f()
When executed, the function f will be called 100 times. On the 100th time,
an Error is thrown and the stack for the error will be output to the console.
The stack trace output only shows the call to the f function, in order to see
the very first call to f the stack trace limit must be set to 101. This can be
achieved with the following:
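node --stack-trace-limit=101 app.js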
Chapter 4: Debugging and Diagnostics
Chapter Overview
In order to debug an application, the Node.js process must be started in
Inspect mode. Inspect puts the process into a debuggable state and exposes
a remote protocol, which can be connected to via a debugger such as Chrome
Devtools. In addition to debugging capabilities, Inspect Mode also grants the
ability to run other diagnostic checks on a Node.js process. In this chapter,
we'll explore how to debug and profile a Node.js process.
function f (n = 99) {
if (n === 0) throw Error()
f(n - 1)
}
f()
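To start app.js in Inspect mode, paused on the first line of execution, we can
use the --inspect-brk flag:
node --inspect-brk app.js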
When using the --inspect or --inspect-brk flags Node will output some
details to the terminal:
In order to begin debugging the process, the next step is to set a Chrome
browser tab's address bar to chrome://inspect.
From here all the usual Chrome Devtools functionality can be used to debug
the process. For more information on using Chrome Devtools, see Google
Developer's Documentation.
There is a range of other tools that can be used to debug a Node.js process
using the Chrome Devtools remote debugging protocol. To learn more,
read "Debugging Guide" by nodejs.org.
Sometimes a program will throw in far less obvious ways. In these scenarios,
the "Pause on exceptions" feature can be a useful tool for locating the source
of an exception.
The debugger statement can be used to explicitly pause on the line that the
statement appears when debugging.
function f (n = 99) {
if (n === 0) throw Error()
debugger
f(n - 1)
}
f()
This time, start app.js in Inspect mode, that is, with the --inspect flag
instead of the --inspect-brk flag. Once Chrome Devtools is connected to the
inspector, the "Sources" tab of Devtools will show that the application is
paused on line 3:
Using the debugger statement is particularly useful when the line we wish to
break at is buried somewhere in a dependency tree: in a function that exists
in a required module of a required module of a required module and so on.
When not debugging, these debugger statements are ignored, however due
to noise and potential performance impact it is not good practice to
leave debugger statements in code.
Chapter 5: Key JavaScript Concepts
Data Types
JavaScript is a loosely typed dynamic language. In JavaScript there are seven
primitive types. Everything else, including functions and arrays, is an object.
Null: null
Undefined: undefined
Number: 1, 1.5, -1e4, NaN
BigInt: 1n, 9007199254740993n
String: 'str', "str", `str ${var}`
Boolean: true, false
Symbol: Symbol('description'), Symbol.for('namespace')
Functions
Functions are first class citizens in JavaScript. A function is an object, and
therefore a value that can be used like any other value.
function factory () {
return function doSomething () {}
}
It's crucial to understand that this refers to the object on which the function
was called, not the object which the function was assigned to:
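A minimal sketch (the object and property names here are illustrative):
function fn () { console.log(this.id) }
const obj = { id: 999, fn: fn }
const obj2 = { id: 2, fn: fn }
obj2.fn() // prints 2
obj.fn() // prints 999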
Functions have a call method that can be used to set their this context:
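For instance (again with illustrative objects):
function fn () { console.log(this.id) }
const obj = { id: 999 }
const obj2 = { id: 2 }
fn.call(obj2) // prints 2
fn.call(obj) // prints 999
fn.call({ id: ':)' }) // prints :)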
In this case the fn function wasn't assigned to any of the objects, this was
set dynamically via the call function.
There are also fat arrow functions, also known as lambda functions:
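For example, a generic illustration:
const add = (a, b) => a + b
console.log(add(2, 3)) // prints 5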
When defined without curly braces, the expression following the fat arrow
(=>) is the return value of the function. Lambda functions do not have their
own this context, when this is referenced inside a function, it refers to
the this of the nearest parent non-lambda function.
function fn() {
return (offset) => {
console.log(this.id + offset)
}
}
const obj = { id: 999 }
const offsetter = fn.call(obj)
offsetter(1) // prints 1000 (999 + 1)
function normalFunction () { }
const fatArrowFunction = () => {}
console.log(typeof normalFunction.prototype) // prints 'object'
console.log(typeof fatArrowFunction.prototype) // prints
'undefined'
Prototypal Inheritance
(Constructor Functions)
Creating an object with a specific prototype object can also be achieved by
calling a function with the new keyword. In legacy code bases this is a very
common pattern, so it's worth understanding.
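The following sketch reconstructs the Wolf and Dog constructor functions and
the inherit utility that the code below and the explanation that follows rely on:
function Wolf (name) {
  this.name = name
}

function Dog (name) {
  Wolf.call(this, name + ' the dog')
}

function inherit (proto) {
  function ChainLink () {}
  ChainLink.prototype = proto
  return new ChainLink()
}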
Wolf.prototype.howl = function () {
console.log(this.name + ': awoooooooo')
}
Dog.prototype = inherit(Wolf.prototype)
Dog.prototype.woof = function () {
console.log(this.name + ': woof')
}
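An instance can then be created and used as described below:
const rufus = new Dog('Rufus')
rufus.woof() // prints "Rufus the dog: woof"
rufus.howl() // prints "Rufus the dog: awoooooooo"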
The Wolf and Dog functions have capitalized first letters. Using PascalCase
for functions that are intended to be called with new is convention and
recommended.
When new Dog('Rufus') is called a new object is created (rufus). That new
object is also the this object within the Dog constructor function.
The Dog constructor function passes this to Wolf.call.
Using the call method on a function allows the this object of the function
being called to be set via the first argument passed to call. So when this is
passed to Wolf.call, the newly created object (which is ultimately assigned
to rufus) is also referenced via the this object inside the Wolf constructor
function. All subsequent arguments passed to call become the function
arguments, so the name argument passed to Wolf is "Rufus the dog".
The Wolf constructor sets this.name to "Rufus the dog", which means that
ultimately rufus.name is set to "Rufus the dog".
In legacy code bases, creating a prototype chain between Dog and Wolf for
the purposes of inheritance may be performed many different ways. There
was no standard or native approach to this before EcmaScript 5. One
approach is to use Object.create:
Dog.prototype = Object.create(Wolf.prototype)
Dog.prototype.woof = function () {
console.log(this.name + ': woof')
}
Another approach common in Node code bases is the util module's inherits
function:
Dog.prototype.woof = function () {
console.log(this.name + ': woof')
}
util.inherits(Dog, Wolf)
In contemporary Node, util.inherits uses Object.setPrototypeOf internally, so
it is equivalent to calling:
Object.setPrototypeOf(Dog.prototype, Wolf.prototype)
The class syntax sugar does reduce boilerplate when creating a prototype
chain:
class Wolf {
constructor (name) {
this.name = name
}
howl () { console.log(this.name + ': awoooooooo') }
}
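A corresponding Dog class, sketched to match the constructor-function example
above and the explanation that follows:
class Dog extends Wolf {
  constructor (name) {
    super(name + ' the dog')
  }
  woof () { console.log(this.name + ': woof') }
}

const rufus = new Dog('Rufus')
rufus.woof() // prints "Rufus the dog: woof"
rufus.howl() // prints "Rufus the dog: awoooooooo"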
This will set up the same prototype chain as in the Functional Prototypal
Inheritance and the Function Constructors Prototypal Inheritance examples:
console.log(Object.getPrototypeOf(rufus) === Dog.prototype) // true
console.log(Object.getPrototypeOf(Dog.prototype) === Wolf.prototype) // true
The super keyword in the Dog class constructor method is a generic way to
call the parent class constructor while setting the this keyword to the
current instance. In the Constructor Function example Wolf.call(this,
name + ' the dog') is equivalent to super(name + ' the dog') here.
Any methods other than constructor that are defined in the class are
added to the prototype object of the function that the class syntax creates.
Consider the Wolf class again:
class Wolf {
constructor (name) {
this.name = name
}
howl () { console.log(this.name + ': awoooooooo') }
}
The class syntax based approach is the most recent addition to JavaScript
when it comes to creating prototype chains, but is already widely used.
Closure Scope
When a function is created, an invisible object is also created, this is known
as the closure scope. Parameters and variables created in the function are
stored on this invisible object.
When a function is inside another function, it can access both its own closure
scope, and the parent closure scope of the outer function:
function outerFn () {
var foo = true
function print() { console.log(foo) }
print() // prints true
foo = false
print() // prints false
}
outerFn()
The outer foo variable is accessed when the inner function is invoked; this is
why the second print call outputs false after foo is updated to false.
If there is a naming collision then the reference to the nearest closure scope
takes precedence:
function outerFn () {
var foo = true
function print(foo) { console.log(foo) }
print(1) // prints 1
foo = false
print(2) // prints 2
}
outerFn()
In this case the foo parameter of print overrides the foo variable in
the outerFn function.
Closure scope cannot be accessed outside of a function:
function outerFn () {
var foo = true
}
outerFn()
console.log(foo) // will throw a ReferenceError
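A common use of closure scope is to encapsulate state. The following sketch
reconstructs the example that the next paragraphs describe (the property
names are illustrative):
function init (type) {
  let id = 0
  return (name) => {
    id += 1
    return { id: type + id, type: type, name: name }
  }
}

const createUser = init('user')
const createBook = init('book')

const dave = createUser('Dave')
const annie = createUser('Annie')
const book = createBook('Some Book Title')

console.log(dave) // prints { id: 'user1', type: 'user', name: 'Dave' }
console.log(annie) // prints { id: 'user2', type: 'user', name: 'Annie' }
console.log(book) // prints { id: 'book1', type: 'book', name: 'Some Book Title' }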
The init function is called twice, and the resulting function is assigned
to createUser and createBook. These two functions have access to two
separate instances of the init functions closure scope.
The dave and annie objects are instantiated by calling createUser.
In the example all the state is returned from the returned function, but this
pattern can be used for much more than that. For instance, the init function
could provide validation on type, returning different functions depending on
what the type is.
The three dots (...) in the return statement of dog are called the spread
operator. The spread operator copies the properties from the object that
follows it into the object being created.
The wolf function returns an object with a howl function assigned to it. That
object is then spread (using …) into the object returned from
the dog function, so howl is copied into the object. The object returned from
the dog function also has a woof function assigned.
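A sketch of the wolf and dog functions being described:
function wolf (name) {
  const howl = () => {
    console.log(name + ': awoooooooo')
  }
  return { name, howl }
}

function dog (name) {
  name = name + ' the dog'
  const woof = () => {
    console.log(name + ': woof')
  }
  return {
    ...wolf(name),
    woof
  }
}

const rufus = dog('Rufus')
rufus.woof() // prints "Rufus the dog: woof"
rufus.howl() // prints "Rufus the dog: awoooooooo"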
The npm help command will print out a list of available commands:
A quick help output for a particular command can be viewed using the -h
flag with that command:
npm install -h
Initializing a Package
A package is a folder with a package.json file in it (and then some code). A
Node.js application or service is also a package, so this could equally be
titled "Initializing an App" or "Initializing a Service" or generically, "Initializing
a Node.js Project".
For this example a new folder called my-package is used, every command in
this section is executed with the my-package folder as the current working
directory.
Running npm init will start a CLI wizard that will ask some questions:
For our purposes we can hit return for every one of the questions.
A shorter way to accept the default value for every question is to use the -y
flag:
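npm init -y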
The default fields in a generated package.json are name, version,
description, main, scripts, keywords, author and license. Let's install our first
dependency by running npm install pino with the my-package folder as the
current working directory. Once the dependency is installed,
the package.json file will have the following content:
{
"name": "my-package",
"version": "1.0.0",
"description": "",
"main": "index.js",
"scripts": {
"test": "echo \"Error: no test specified\" && exit 1"
},
"keywords": [],
"author": "",
"license": "ISC",
"dependencies": {
"pino": "^8.14.1"
}
}
Running the npm install command has modified the package.json file by
adding a "dependencies" field:
"dependencies": {
"pino": "^8.14.1"
}
The "dependencies" field contains an object, the keys of the object contain
dependency namespaces, the values in the object contain the Semver range
version number for that dependency. We will explore the Semver format
later in this chapter.
Running npm install pino without specifying a version will install the latest
version of the package, so the version number may vary when following
these steps. If the installed version number doesn't match up, this is fine as
long as the major number (the first number) is 8. If a new major release
of pino is available, we can instead execute npm install pino@8 to ensure
we're using the same major version.
The node_modules folder contains the logger package, along with all the
packages in its dependency tree:
The npm install command uses a maximally flat strategy where all
packages in a dependency tree are placed at the top level of
the node_modules folder, unless there are two different versions of the same
package in the dependency tree, in which case the packages may be stored
in a nested node_modules folder.
If the node_modules folder is removed (or absent, for instance after a fresh
clone) and we run npm ls, it won't print out the same tree any more because
the dependency isn't installed, but it will warn that the dependency should be
installed:
Running npm install with no arguments (which installs everything listed
in package.json) and then npm ls will show that the logger has been installed
again:
The node_modules folder should not be checked into git,
the package.json should be the source of truth.
Development Dependencies
Running npm install without any flags will automatically save the
dependency to the package.json file's "dependencies" field. Not all
dependencies are required for production, some are tools to support the
development process. These types of dependencies are called development
dependencies.
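For example, the standard linter (used later in this chapter) can be installed
as a development dependency with the --save-dev flag:
npm install --save-dev standard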
npm ls --depth=999
Notice how the atomic-sleep sub-dependency occurs twice in the output.
The second occurrence has the word deduped next to it. The atomic-sleep
module is a dependency of both pino and its direct dependency sonic-boom,
but both pino and sonic-boom rely on the same version of atomic-sleep,
which allows npm to place a single atomic-sleep package in
the node_modules folder.
{
"name": "my-package",
"version": "1.0.0",
"description": "",
"main": "index.js",
"scripts": {
"test": "echo \"Error: no test specified\" && exit 1"
},
"keywords": [],
"author": "",
"license": "ISC",
"dependencies": {
"pino": "^8.14.1"
},
"devDependencies": {
"standard": "^17.0.0"
}
}
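To see how development dependencies affect installs, let's remove the
node_modules folder. A sketch using Node itself (any method of deleting the
folder works):
node -e "fs.rmSync('node_modules', {recursive: true, force: true})"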
Node is being used here to remove the node_modules folder because this
command is platform independent, but we can use any approach to remove
the folder as desired.
Now let's run npm install with the --omit=dev flag set:
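npm install --omit=dev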
npm ls --depth=999
The error message is something of a misdirect; the development
dependency is deliberately omitted in this scenario.
Understanding Semver
Let's look at the dependencies in the package.json file:
"dependencies": {
"pino": "^8.14.1"
},
"devDependencies": {
"standard": "^17.0.0"
}
We've installed two dependencies: pino at a SemVer range of ^8.14.1
and standard at a SemVer range of ^17.0.0. Our package version number is
the SemVer version 1.0.0. There is a distinction between the SemVer format
and a SemVer range.
The SemVer format is three numbers separated by dots: MAJOR.MINOR.PATCH.
The MAJOR number changes when the API or behavior changes in a
backwards-incompatible way, the MINOR number changes when functionality
is added in a backwards-compatible way, and the PATCH number changes for
backwards-compatible bug fixes.
This is the core of the SemVer format, but there are extensions which won't
be covered here, for more information on SemVer see SemVer's website.
A SemVer range allows for a flexible versioning strategy. There are many
ways to define a SemVer range.
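For example, the caret (^) ranges used above can be read as follows:
"pino": "^8.14.1" means any version from 8.14.1 up to (but not including) 9.0.0
"standard": "^17.0.0" means any version from 17.0.0 up to (but not including) 18.0.0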
"scripts": {
"test": "echo \"Error: no test specified\" && exit 1"
},
"scripts": {
"test": "echo \"Error: no test specified\" && exit 1",
"lint": "standard"
},
'use strict';
console.log('my-package started');
process.stdin.resume();
Let's make sure all dependencies are installed before we try out
the "lint" script, by running:
npm install
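Then the lint script can be executed with:
npm run lint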
The package.json already has a "test" field, let's run npm test:
The "test" field in package.json scripts is as follows:
Note that we did not have to use npm run test, the npm test command is
an alias for npm run test. This aliasing only applies to test and start.
Our npm run lint command cannot be executed using npm lint for
example.
Let's add one more script, a "start" script, edit the package.json scripts
field to match the following:
"scripts": {
"start": "node index.js",
"test": "echo \"Error: no test specified\" && exit 1",
"lint": "standard"
},
{
"name": "my-package",
"version": "1.0.0",
"main": "index.js",
"scripts": {
"start": "node index.js",
"test": "echo \"Error: no test specified\" && exit 1",
"lint": "standard"
},
"author": "",
"license": "ISC",
"keywords": [],
"description": "",
"dependencies": {
"pino": "^8.14.1"
},
"devDependencies": {
"standard": "^17.0.0"
}
}
'use strict'
console.log('my-package started')
process.stdin.resume()
On the command line, with the my-package folder as the current working
directory run the install command:
npm install
As long as Pino is installed, the module that the Pino package exports can be
loaded.
Let's replace the console.log statement in our index.js file with a logger
that we instantiate from the Pino module:
'use strict'
const pino = require('pino')
const logger = pino()
logger.info('my-package started')
process.stdin.resume()
Now the Pino module has been loaded using require. The require function
is passed a package's namespace, looks for a directory with that name in
the node_modules folder and returns the exported value from the main file of
that package.
Now if we run npm start we should see a JSON formatted log message:
Hit CTRL-C to exit the process.
To understand the full algorithm that require uses to load modules, see
Node.js Documentation, Folders as modules.
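Next, let's create our own local module. Create a file in the my-package folder
called format.js; a sketch of its contents follows (reconstructed from the
description below):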
'use strict'
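// upper converts any input to a string and upper-cases it (a sketch
// reconstructed from the description below)
const upper = (str) => {
  return String(str).toUpperCase()
}

module.exports = { upper }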
We created a function called upper which will convert any input to a string
and convert that string to an upper-cased string. Whatever is assigned
to module.exports will be the value that is returned when the module is
required. The require function returns the module.exports of the module
that it is loading. In this case, module.exports is assigned to an object, with
an upper key on it that references the upper function.
The format.js file can now be loaded into our index.js file as a local
module. Modify index.js to the following:
'use strict'
const pino = require('pino')
const format = require('./format')
const logger = pino()
logger.info(format.upper('my-package started'))
process.stdin.resume()
The format.js file is loaded into the index.js file by passing a path
into require. The extension (.js) is allowed but not necessary.
So require('./format') will return the module.exports value
in format.js, which is an object that has an upper method.
The format.upper method is called within the call to logger.info which
results in an upper-cased string "MY-PACKAGE STARTED" being passed
to logger.info.
Now we have both a package module (pino) and a local module (format.js)
loaded and used in the index.js file.
In its current form, if we require the index.js file it will behave exactly the
same way:
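node -e "require('./index.js')"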
When a file is the entry point of a program, it's the main module. We can
detect whether a particular file is the main module.
'use strict'
const format = require('./format')
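// A sketch of main-module detection using the standard require.main check;
// the code in the else branch is illustrative:
if (require.main === module) {
  // executed directly (e.g. npm start or node index.js): original behavior
  const pino = require('pino')
  const logger = pino()
  logger.info(format.upper('my-package started'))
  process.stdin.resume()
} else {
  // loaded as a module with require: export functionality instead
  module.exports = format
}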
But if it's executed with node, it will exhibit the original behavior:
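node index.js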
Converting a Local CJS File to a
Local ESM File
EcmaScript Modules (ESM) was introduced to the EcmaScript specification as
part of EcmaScript 2015 (formerly known as EcmaScript 6). One of the main
goals of the specification was for module includes to be statically analyzable,
which allows browsers to pre-parse out imports similar to collecting
any <script> tags as the web page loads.
Due to the complexity involved with retrofitting a static module system into
a dynamic language, it took about three years for major browsers to
implement it. It took even longer for ESM to be implemented in Node.js,
since interoperability with Node's existing CJS module system has been a
significant challenge - and there are still pain points as we will see.
A crucial difference between CJS and ESM is that CJS loads every module
synchronously and ESM loads every module asynchronously (again, this
shows the specification choices for the native JavaScript module system to
work well in browsers, acting like a script tag).
It's important to differentiate between ESM and what we'll call "faux-ESM".
Faux-ESM is ESM-like syntax that would typically be transpiled with Babel.
The syntax looks similar or even identical, but the behavior can vary
significantly. Faux-ESM in Node compiles to CommonJS, and in the browser
compiles to using a bundled synchronous loader. Either way faux-ESM loads
modules synchronously whereas native ESM loads modules asynchronously.
A Node application (or module) can contain both CJS and ESM files.
Let's convert our format.js file from CJS to ESM. First we'll need to rename it
so that it has an .mjs extension:
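Any way of renaming works; for example, using Node itself:
node -e "fs.renameSync('./format.js', './format.mjs')"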
In a future section, we'll look at converting a whole project to ESM, which
allows us to use .js extensions for ESM files (CJS files then must have
the .cjs extension). For now, we're just converting a single CJS file to an
ESM file.
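The converted format.mjs could look as follows (a sketch consistent with the
CJS version sketched earlier):
export const upper = (str) => {
  return String(str).toUpperCase()
}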
We no longer need the 'use strict' pragma since ESM modules essentially
execute in strict-mode anyway.
If we now try to execute npm start, we'll see the following failure:
This error occurs because the require function will not automatically resolve
a filename without an extension ('./format') to an .mjs extension. There is
no point fixing this, since attempting to require the ESM file will fail anyway:
Our project is now broken. This is deliberate. In the next section, we'll look at
an (imperfect) way to load an ESM file into a CJS file.
'use strict'
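// A sketch of the (imperfect) approach: loading the ESM file from CJS with
// dynamic import, which returns a promise and so forces asynchronous usage
const pino = require('pino')
const logger = pino()

import('./format.mjs').then((format) => {
  logger.info(format.upper('my-package started'))
  process.stdin.resume()
}).catch((err) => {
  console.error(err)
  process.exit(1)
})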
In the next section, we'll convert the entire project to an ESM package.
Converting a CJS Package to an
ESM Package
We can opt-in to ESM-by-default by adding a type field to
the package.json and setting it to "module". Our package.json should look
as follows:
{
"name": "my-package",
"version": "1.0.0",
"main": "index.js",
"type": "module",
"scripts": {
"start": "node index.js",
"test": "echo \"Error: no test specified\" && exit 1",
"lint": "standard"
},
"author": "",
"license": "ISC",
"keywords": [],
"description": "",
"dependencies": {
"pino": "^8.14.1"
},
"devDependencies": {
"standard": "^17.0.0"
}
}
We can also now import our module (within another ESM module) and use it:
Since ESM was primarily made with browsers in mind, there is no concept of
a filesystem or even namespaces in the original ESM specification. In fact,
the use of namespaces or file paths when using Node with ESM is due to the
Node.js implementation of ESM modules, and not actually part of the
specification. But the original ESM specification deals only with URLs, as a
result import.meta.url holds a file:// URL pointing to the file path of the
current module. On a side note, in browsers import maps can be used to map
namespaces and file paths to URLs.
The realpath function we use is from the core fs/promises module. This is
an asynchronous filesystem API that uses promises instead of callbacks. One
compelling feature of modern ESM is Top-Level Await (TLA). Since all ESM
modules load asynchronously it's possible to perform related asynchronous
operations as part of a module's initialization. TLA allows the use of
the await keyword in an ESM module's scope, at the top level, as well as
within async functions. We use TLA to await the promise returned by
each realpath call, and the promise returned by the dynamic import inside
the if statement.
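A sketch of what this can look like in the ESM index.js (reconstructed from
the description above; the exact file contents may differ):
import { realpath } from 'fs/promises'
import { fileURLToPath } from 'url'
// assumes format.mjs was renamed back to format.js once the package became ESM
import * as format from './format.js'

// compare the real path of this module with the real path of the entry point
const isMain = process.argv[1] &&
  await realpath(fileURLToPath(import.meta.url)) ===
  await realpath(process.argv[1])

if (isMain) {
  // dynamically import pino only when run as the entry point
  const { default: pino } = await import('pino')
  const logger = pino()
  logger.info(format.upper('my-package started'))
  process.stdin.resume()
}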
Let's create a file in my-package and call it resolve-demo.cjs, and place the
following code into it:
'use strict'
console.log()
console.group('# package resolution')
console.log(`require('pino')`, '\t', ' =>',
require.resolve('pino'))
console.log(`require('standard')`, '\t', ' =>',
require.resolve('standard'))
console.groupEnd('')
console.log()
If we execute resolve-demo.cjs with node we'll see the resolved path for
each of the require examples:
console.log(
`import 'pino'`,
'=>',
pathToFileURL(require.resolve('pino')).toString()
)
For example, the tap module sets an exports field that points to a .js file by
default, but a .mjs file when imported. See GitHub, tapjs/node-tap. To
demonstrate how using createRequire is insufficient, let's install tap into my-
package:
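npm install tap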
console.log(
`import 'pino'`,
'=>',
pathToFileURL(require.resolve('pino')).toString()
)
console.log(
`import 'tap'`,
'=>',
pathToFileURL(require.resolve('tap')).toString()
)
If we execute the updated file we should see something like the following:
console.log(
`import 'pino'`,
'=>',
await resolve('pino', import.meta.url)
)
console.log(
`import 'tap'`,
'=>',
await resolve('tap', import.meta.url)
)
If we run this file with Node, we should see something like the following:
Callbacks
A callback is a function that will be called at some future point, once a task
has been completed. Until the fairly recent introduction of async/await, which
will be discussed shortly, callback functions were the only way to manage
asynchronous flow.
The fs module (file system operations) will be discussed at length in Chapter
13 but for purposes of illustration, let's take a look at an
example readFile call:
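// a sketch reconstructed from the description that follows
const { readFile } = require('fs')
readFile(__filename, (err, contents) => {
  if (err) {
    console.error(err)
    return
  }
  console.log(contents.toString())
})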
If this is placed into a file and executed the program will read its own source
code and print it out. To understand why it loads itself, it's important to know
that __filename in Node.js holds the path of the file currently being executed.
This is the first argument passed to readFile. The readFile function
schedules a task, which is to read the given file. When the file has been read,
the readFile function will call the function provided as the second
argument.
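A sketch of the kind of example being described:
const { readFile } = require('fs')
const [ bigFile, mediumFile, smallFile ] = Array.from(Array(3)).fill(__filename)

const print = (err, contents) => {
  if (err) {
    console.error(err)
    return
  }
  console.log(contents.toString())
}

readFile(bigFile, print)
readFile(mediumFile, print)
readFile(smallFile, print)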
On line two smallFile, mediumFile, and bigFile are mocked (i.e. it's
pretend) and they're actually all the same file. The actual file they point to
doesn't matter, it only matters that we understand they represent different
file sizes for the purposes of understanding.
If the files were genuinely different sizes, the above would print out the
contents of smallFile first and bigFile last even though
the readFile operation for bigFile was called first. This is one way to
achieve parallel execution in Node.js.
What if we wanted to use serial execution? Let's say we want bigFile to print
first, then mediumFile, even though they take longer to load than smallFile.
Well, now the callbacks have to be placed inside each other:
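// a sketch, using the same print helper and mocked file names as above
readFile(bigFile, (err, contents) => {
  print(err, contents)
  readFile(mediumFile, (err, contents) => {
    print(err, contents)
    readFile(smallFile, print)
  })
})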
Serial execution with callbacks is achieved by waiting for the callback to be
called before starting the next asynchronous operation.
What if we want all of the contents of each file to be concatenated together
and logged once all files are loaded?
The following example pushes the contents of each file to an array and then
logs the array when all files are loaded:
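// A sketch reconstructed from the surrounding description: a files array of
// arbitrary length is read serially, with index and count tracking progress
const { readFile } = require('fs')
const files = Array.from(Array(3)).fill(__filename)
const data = []
const print = (err, contents) => {
  if (err) {
    console.error(err)
    return
  }
  console.log(contents.toString())
}
let count = files.length
let index = 0
const read = (file) => {
  readFile(file, (err, contents) => {
    index += 1
    if (err) print(err)
    else data.push(contents)
    if (index === count) print(null, Buffer.concat(data))
    else read(files[index])
  })
}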
read(files[index])
An alternative is to use the fastseries package, which runs an array of
callback-accepting functions in series; each file read is mapped to such a
function (the readers array). The cb function takes two arguments, the first is the error object
or null (depending on whether there was an error). The second is the result
of the asynchronous operation - which is called contents here. The first
parameter of the mapped function (readers) will be whatever the last result
was. Since we don't use that parameter, we assigned the parameter to an
underscore (_) to signal it's not of interest for this case. The final parameter
passed to series is print, this will be called when all the readers have
been processed by fastseries. The second argument of print is
called data here, fastseries will pass an array of all the results to print.
This example using fastseries is not totally equivalent to the prior example
using the index and count variables, because the error handling is different.
In the fastseries example if an error occurs, it's passed to the cb function
and fastseries will call print with the error and then end. However in the
prior example, we call print with the err but continue to read any other files
in the array. To get exactly the same behavior we would have to change
the readers array to the following:
Promises
A promise is an object that represents an asynchronous operation. It's either
pending or settled, and if it is settled it's either resolved or rejected. Being
able to treat an asynchronous operation as an object is a useful abstraction.
For instance, instead of passing a function that should be called when an
asynchronous operation completes into another function (e.g., a callback), a
promise that represents the asynchronous operation can be returned from a
function instead.
myAsyncOperation(functionThatHandlesTheResult)
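// with a promise-based API, the operation itself becomes a value (a sketch):
const promise = myAsyncOperation()
promise.then(functionThatHandlesTheResult)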
function myAsyncOperation () {
return new Promise((resolve, reject) => {
doSomethingAsynchronous((err, value) => {
if (err) reject(err)
else resolve(value)
})
})
}
In Node there is a nicer way to do this with the promisify function from
the util module:
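const { promisify } = require('util')
// a sketch: promisify converts a function that takes an error-first callback
// as its last argument into one that returns a promise
const myAsyncOperationPromisified = promisify(doSomethingAsynchronous)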
Generally, the best way to handle promises is with async/await, which will be
discussed later in this chapter. But the methods to handle promise success
or failure are then and catch:
Note that then and catch always return a promise, so these calls can be
chained. First then is called on promise and catch is called on the result
of then (which is a promise).
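For example (a sketch; promisify wraps the callback-based readFile, as
described below):
const { promisify } = require('util')
const { readFile } = require('fs')
const readFilePromise = promisify(readFile)
const promise = readFilePromise(__filename)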
promise
  .then((contents) => {
    console.log(contents.toString())
  })
  .catch((err) => {
    console.error(err)
  })
This will result in the file printing itself. Here we have the
same readFile operation as in the last section, but the promisify function
is used to convert a callback-based API to a promise-based one. When it
comes to the fs module we don't actually have to do this, the fs module
exports a promises object with promise-based versions. Let's rewrite the
above in a more condensed form:
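const { readFile } = require('fs').promises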
readFile(__filename)
.then((contents) => {
console.log(contents.toString())
})
.catch(console.error)
If a value is returned from then, the then method will return a promise that
resolves to that value:
readFile(__filename)
.then((contents) => {
return contents.toString()
})
.then((stringifiedContents) => {
console.log(stringifiedContents)
})
.catch(console.error)
In this case, the first then handler returns a promise that resolves to the
stringified version of contents. So when the second then is called on the
result of the first then the handler of the second then is called with the
stringified contents. Even though an intermediate promise is created by the
first then we still only need the one catch handler as rejections are
propagated.
If a promise is returned from a then handler, the then method will return
that promise; this allows for an easy serial execution pattern:
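A sketch (assuming the promise-based readFile, the mocked file names from
earlier, and a print helper that logs its argument):
const { readFile } = require('fs').promises
const [ bigFile, mediumFile, smallFile ] = Array.from(Array(3)).fill(__filename)
const print = (contents) => {
  console.log(contents.toString())
}

readFile(bigFile)
  .then((contents) => {
    print(contents)
    return readFile(mediumFile)
  })
  .then((contents) => {
    print(contents)
    return readFile(smallFile)
  })
  .then(print)
  .catch(console.error)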
Once bigFile has been read, the first then handler returns a promise for
reading mediumFile. The second then handler receives the contents
of mediumFile and returns a promise for reading smallFile. The
third then handler prints the contents of smallFile. The catch handler will
handle errors from any of the intermediate promises.
Let's consider the same scenario of the files array that we dealt with in the
previous section. Here's how the same behavior could be achieved with
promises:
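A sketch reconstructed so that the kick-off call below resolves with the
collected data array:
const { readFile } = require('fs').promises
const files = Array.from(Array(3)).fill(__filename)
const print = (contents) => {
  console.log(contents.toString())
}
const data = []
const count = files.length
let index = 0
const read = (file) => {
  return readFile(file).then((contents) => {
    index += 1
    data.push(contents)
    if (index < count) return read(files[index])
    return data
  })
}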
read(files[index])
.then((data) => {
print(Buffer.concat(data))
})
.catch(console.error)
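With promises, the reads can also happen in parallel: a promise is created for
each file up front and the array of promises is passed to Promise.all, which
resolves to an array of all the results in order (a sketch):
// print adjusted to accept an array of buffers
const print = (data) => {
  console.log(Buffer.concat(data).toString())
}
const readers = files.map((file) => readFile(file))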
Promise.all(readers)
.then(print)
.catch(console.error)
However if one of the promises was to fail, Promise.all will reject, and any
successfully resolved promises are ignored. If we want more tolerance of
individual errors, Promise.allSettled can be used:
Promise.allSettled(readers)
.then(print)
.catch(console.error)
Finally, if each file should be printed as soon as it loads, regardless of order,
the promises can simply be handled independently:
readFile(bigFile).then(print).catch(console.error)
readFile(mediumFile).then(print).catch(console.error)
readFile(smallFile).then(print).catch(console.error)
Next, we'll find even more effective ways of working with promises using
async/await.
Async/Await
The keywords async and await allow for an approach that looks stylistically
similar to synchronous code. The async keyword is used before a function to
declare an async function:
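For example, a sketch that reads and prints the current file:
const { readFile } = require('fs').promises

async function run () {
  const contents = await readFile(__filename)
  console.log(contents.toString())
}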
run().catch(console.error)
To start the async function we call it like any other function. An async
function always returns a promise, so we call the catch method to ensure
that any rejections within the async function are handled. For instance,
if readFile had an error, the awaited promise would reject, this would make
the run function reject and we'd handle it in the catch handler.
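Serial execution of the three file reads then becomes straightforward to
express (a sketch, assuming the mocked file names and a print helper as
before):
const { readFile } = require('fs').promises
const [ bigFile, mediumFile, smallFile ] = Array.from(Array(3)).fill(__filename)
const print = (contents) => {
  console.log(contents.toString())
}

async function run () {
  print(await readFile(bigFile))
  print(await readFile(mediumFile))
  print(await readFile(smallFile))
}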
run().catch(console.error)
Concatenating files after they've been loaded is also trivial with async/await:
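// a sketch, assuming the same readFile, file names and print helper as above
async function run () {
  const data = [
    await readFile(bigFile),
    await readFile(mediumFile),
    await readFile(smallFile)
  ]
  print(Buffer.concat(data))
}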
run().catch(console.error)
Notice that we did not need to use index or count variables to track
asynchronous execution of operations. We were also able to populate
the data array declaratively instead of pushing state into it. The async/await
syntax allows for declarative asynchronous implementations.
What about the scenario with a files array of unknown length? The
following is an async/await approach to this:
const { readFile } = require('fs').promises
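// a sketch reconstructed from the description: a files array of unknown
// length processed with an await inside a loop
const files = Array.from(Array(3)).fill(__filename)
const print = (contents) => {
  console.log(contents.toString())
}

async function run () {
  const data = []
  for (const file of files) {
    data.push(await readFile(file))
  }
  print(Buffer.concat(data))
}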
run().catch(console.error)
Here we use an await inside a loop. For scenarios where operations *must*
be sequentially called this is fitting. However for scenarios where the output
only has to be ordered, but the order in which asynchronous operations
resolves is immaterial we can again use Promise.all but this time await the
promise that Promise.all returns:
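A sketch of that approach (same files array and print helper as above):
async function run () {
  const data = await Promise.all(files.map((file) => readFile(file)))
  print(Buffer.concat(data))
}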
run().catch(console.error)
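As before, Promise.all rejects if any single promise rejects. For error
tolerance, Promise.allSettled can be awaited instead, with rejected results
logged and fulfilled values collected (a sketch reconstructed around the
fragment below):
async function run () {
  const results = await Promise.allSettled(files.map((file) => readFile(file)))
  const data = results
    .filter(({ status }) => status === 'fulfilled')
    .map(({ value }) => value)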
results
.filter(({status}) => status === 'rejected')
.forEach(({reason}) => console.error(reason))
print(Buffer.concat(data))
}
run().catch(console.error)
The async/await syntax is highly specialized for serial control flow. The
trade-off is that parallel execution in async functions
using Promise.all, Promise.allSettled, Promise.any or Promise.race can
become difficult or unintuitive to reason about.
To get the exact same parallel operation behavior as in the initial callback
example within an async function so that the files are printed as soon as
they are loaded we have to create the promises, use a then handler and
then await the promises later on:
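// a sketch of the setup for the fragment below, assuming the promise-based
// readFile and the mocked file names used earlier; big, medium and small
// hold the pending promises
async function run () {
  const big = readFile(bigFile)
  const medium = readFile(mediumFile)
  const small = readFile(smallFile)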
big.then(print)
medium.then(print)
small.then(print)
await small
await medium
await big
}
run().catch(console.error)
This will ensure the contents are printed out chronologically, according to the
time it took each of them to load. If the complexity for parallel execution
grows it may be better to use a callback based approach and wrap it at a
higher level into a promise so that it can be used in an async/await function:
run().catch(console.error)
Canceling Asynchronous
Operations
Sometimes it turns out that an asynchronous operation doesn't need to
occur after it has already started. One solution is to not start the operation
until it's definitely needed, but this would generally be the slowest
implementation. Another approach is to start the operation, and then cancel
it if conditions change. A standardized approach to canceling asynchronous
operations that can work with fire-and-forget, callback-based and promise-
based APIs and in an async/await context would certainly be welcome. This is
why Node core has embraced the AbortController with AbortSignal Web
APIs.
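As a baseline, recall how a callback-based timeout is canceled (a sketch):
const timeout = setTimeout(() => {
  console.log('will not be logged')
}, 1000)

setImmediate(() => { clearTimeout(timeout) })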
This code will output nothing, because the timeout is cleared before its
callback can be called. How can we achieve the same thing with a promise-
based timeout? Let's consider the following code (we're using ESM here to
take advantage of Top-Level Await):
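// a sketch of the setup for the fragment below: the setTimeout exported from
// timers/promises returns a promise, and the second argument is the value
// the promise resolves to
import { setTimeout } from 'timers/promises'

const timeout = setTimeout(1000, 'will be logged')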
setImmediate(() => {
clearTimeout(timeout) // do not do this, it won't work
})
console.log(await timeout)
This code outputs "will be logged" after one second. Instead of using the
global setTimeout function, we're using the setTimeout function exported
from the core timers/promises module. This exported setTimeout function
doesn't need a callback, instead it returns a promise that resolves after the
specified delay. Optionally, the promise resolves to the value of the second
argument. This means that the timeout constant is a promise, which is then
passed to clearTimeout. Since it's a promise and not a timeout
identifier, clearTimeout silently ignores it, so the asynchronous timeout
operation never gets canceled. Below the clearTimeout we log the resolved
promise of the value by passing await timeout to console.log. This is a
good example of when an asynchronous operation has a non-generic
cancelation API that cannot be easily applied to a promisified API that
performs the same asynchronous operation. Other cases could be when a
function returns an instance with a cancel method, or an abort method, or
a destroy method with many other possibilities for method names that could
be used to stop an on-going asynchronous operation. Again this won't work
when returning a simple native promise. This is where accepting
an AbortSignal can provide a conventional escape-hatch for canceling a
promisified asynchronous operation.
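For example, the timers/promises setTimeout accepts an AbortSignal via its
options argument. A sketch of the setup for the fragment below:
import { setTimeout } from 'timers/promises'

const ac = new AbortController()
const { signal } = ac
const timeout = setTimeout(1000, 'will NOT be logged', { signal })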
setImmediate(() => {
ac.abort()
})
try {
console.log(await timeout)
} catch (err) {
// ignore abort errors:
if (err.code !== 'ABORT_ERR') throw err
}
This now behaves as the typical timeout example, nothing is logged out
because the timer is canceled before it can complete.
The AbortController constructor is a global, so we instantiate it and assign
it to the ac constant. An AbortController instance has
an AbortSignal instance on its signal property. We pass this via the
options argument to timers/promises setTimeout, internally the API will
listen for an abort event on the signal instance and then cancel the
operation if it is triggered. We trigger the abort event on the signal instance
by calling the abort method on the AbortController instance, this causes
the asynchronous operation to be canceled and the promise is fulfilled by
rejecting with an AbortError. An AbortError has a code property with the
value 'ABORT_ERR', so we wrap the await timeout in a try/catch and
rethrow any errors that are not AbortError objects, effectively ignoring
the AbortError.
Emitting Events
To emit an event call the emit method:
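A minimal sketch:
const { EventEmitter } = require('events')
const ee = new EventEmitter()
ee.emit('close')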
The first argument passed to emit is the event namespace. In order to listen
to an event this namespace has to be known. The subsequent arguments will
be passed to the listener.
ee.addListener('close', () => {
  console.log('close event fired!')
})
Ordering is important: in the following, the event listener will not fire
because the event is emitted before the listener is registered:
ee.emit('close')
ee.on('close', () => { console.log('close event fired!') })
Listeners are also called in the order that they are registered:
const { EventEmitter } = require('events')
const ee = new EventEmitter()
ee.on('my-event', () => { console.log('1st') })
ee.on('my-event', () => { console.log('2nd') })
ee.emit('my-event')
The prependListener method can be used to inject listeners into the top
position:
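For example:
ee.on('my-event', () => { console.log('2nd') })
ee.prependListener('my-event', () => { console.log('1st') })
ee.emit('my-event')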
Removing Listeners
The removeListener method can be used to remove a previously registered
listener.
The removeListener method takes two arguments, the event name and the
listener function.
In the following example, the listener1 function will be called twice, but
the listener2 function will be called five times:
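// a sketch of the setup for the fragment below
const { EventEmitter } = require('events')
const ee = new EventEmitter()

const listener1 = () => { console.log('listener 1') }
const listener2 = () => { console.log('listener 2') }

ee.on('my-event', listener1)
ee.on('my-event', listener2)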
setInterval(() => {
ee.emit('my-event')
}, 200)
setTimeout(() => {
ee.removeListener('my-event', listener1)
}, 500)
setTimeout(() => {
ee.removeListener('my-event', listener2)
}, 1100)
The following will trigger two 'my-event' listeners twice, but will trigger
the 'another-event' listener five times:
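// a sketch of the setup for the fragment below
const { EventEmitter } = require('events')
const ee = new EventEmitter()

const listener1 = () => { console.log('listener 1') }
const listener2 = () => { console.log('listener 2') }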
ee.on('my-event', listener1)
ee.on('my-event', listener2)
ee.on('another-event', () => { console.log('another event') })
setInterval(() => {
ee.emit('my-event')
ee.emit('another-event')
}, 200)
setTimeout(() => {
ee.removeAllListeners('my-event')
}, 500)
setTimeout(() => {
ee.removeAllListeners()
}, 1100)
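The 'error' event is a special case on event emitters. Emitting it without a
registered 'error' listener will throw (a sketch):
const { EventEmitter } = require('events')
const ee = new EventEmitter()

process.stdin.resume() // keep the process alive

ee.emit('error', new Error('oh oh'))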
This will cause the process to crash and output an error stack trace:
If a listener is registered for the error event the process will no longer crash:
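// a sketch
ee.on('error', (err) => {
  console.log('got error:', err.message)
})

ee.emit('error', new Error('oh oh'))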
Execution will pause on the line starting await once, until the registered
event fires. If it never fires, execution will never proceed past that point. This
makes events.once useful in async/await or ESM Top-Level Await scenarios
(we're using ESM for Top-Level Await here), but we need an escape-hatch for
scenarios where an event might not fire. For example the following code will
never output pinged!:
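// a sketch (ESM, using Top-Level Await as described above)
import { once, EventEmitter } from 'events'

const uneventful = new EventEmitter()

await once(uneventful, 'ping')
console.log('pinged!')

An AbortSignal can be used as the escape-hatch. A sketch of the setup for the
fragment below, with an AbortController aborting after 500 milliseconds:
import { once, EventEmitter } from 'events'
import { setTimeout } from 'timers/promises'

const uneventful = new EventEmitter()

const ac = new AbortController()
const { signal } = ac

setTimeout(500).then(() => ac.abort())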
try {
await once(uneventful, 'ping', { signal })
console.log('pinged!')
} catch (err) {
// ignore abort errors:
if (err.code !== 'ABORT_ERR') throw err
console.log('canceled')
}
This code will now output canceled every time. Since uneventful never
emits a ping event, after 500 milliseconds ac.abort is called, and this causes the
signal instance passed to events.once to emit an abort event which
triggers events.once to reject the returned promise with an AbortError. We
check for the AbortError, rethrowing if the error isn't related to
the AbortController. If the error is an AbortError we log out canceled.
We can make this a little bit more realistic by making the ping event
sometimes fire after more than 500 milliseconds, and sometimes in less than
500 milliseconds:
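// a sketch of the setup for the fragment below: sometimesLaggy emits ping
// after a random delay of up to two seconds, while ac.abort is called after
// 500 milliseconds; the abort signal also cancels the first timeout
import { once, EventEmitter } from 'events'
import { setTimeout } from 'timers/promises'

const sometimesLaggy = new EventEmitter()

const ac = new AbortController()
const { signal } = ac

setTimeout(2000 * Math.random(), null, { signal }).then(
  () => sometimesLaggy.emit('ping'),
  () => {} // ignore the AbortError when this timeout is canceled
)

setTimeout(500).then(() => ac.abort())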
try {
await once(sometimesLaggy, 'ping', { signal })
console.log('pinged!')
} catch (err) {
// ignore abort errors:
if (err.code !== 'ABORT_ERR') throw err
console.log('canceled')
}
About three out of four times this code will log out canceled, one out of four
times it will log out pinged!. Also note an interesting usage
of AbortController here: ac.abort is used to cancel both
the events.once promise and the first timers/promises
setTimeout promise. The options object must be the third argument of
the timers/promises setTimeout function; the second argument can be
used to specify the resolved value of the timeout promise. In our case we set
the resolved value to null by passing null as the second argument
to timers/promises setTimeout.
Kinds of Errors
Very broadly speaking errors can be divided into two main groups:
1. Operational errors
2. Developer errors
Throwing
Typically, an input error is dealt with by using the throw keyword:
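For example, a sketch of the doTask function used through the rest of this
section (it halves an amount that must be a number):
function doTask (amount) {
  if (typeof amount !== 'number') throw new Error('amount must be a number')
  return amount / 2
}

doTask('here is some invalid input')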
While it's recommended to always throw objects instantiated from Error (or
instantiated from a constructor that inherits from Error), it is possible to
throw any value:
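// a sketch: throwing a plain string rather than an Error object
function doTask (amount) {
  if (typeof amount !== 'number') throw new Error('amount must be a number')
  if (amount <= 0) throw 'amount must be greater than zero'
  return amount / 2
}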
doTask(-1)
In this case there is no stack trace because an Error object was not thrown.
As noted in the output the --trace-uncaught flag can be used to track the
exception however this is not ideal. It's highly recommended to only throw
objects that derive from the native Error constructor, either directly or via
inheritance.
There are six other native error constructors that inherit from the
base Error constructor, these are:
EvalError
SyntaxError
RangeError
ReferenceError
TypeError
URIError
These error constructors exist mostly for native JavaScript API's and
functionality. For instance, a ReferenceError will be automatically thrown
by the JavaScript engine when attempting to refer to a non-existent
reference:
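For example, evaluating a reference that doesn't exist (an illustrative name):
node -p "thisIsNotDefined"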
Like any object, an error object can have its instance verified:
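const err = new SyntaxError()
console.log(err instanceof SyntaxError) // true
console.log(err instanceof Error) // true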
Notice that, given err is an object created with new SyntaxError(), it is
both an instanceof SyntaxError and an instanceof Error,
because SyntaxError - and all other native errors, inherit from Error.
Native error objects also have a name property which contains the name of
the error that created it:
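// continuing the example above
console.log(err.name) // prints 'SyntaxError'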
For the most part, there are only two of these error constructors that are likely
to be thrown in library or application code, RangeError and TypeError. Let's
update the code from the previous section to use these two error
constructors:
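A sketch:
function doTask (amount) {
  if (typeof amount !== 'number') throw new TypeError('amount must be a number')
  if (amount <= 0) throw new RangeError('amount must be greater than zero')
  return amount / 2
}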
For more information about native errors see MDN web docs - "Error".
Custom Errors
The native errors are a limited and rudimentary set of errors that can never
cater to all possible application errors. There are different ways to
communicate various error cases but we will explore two: subclassing native
error constructors and using a code property. These aren't mutually exclusive.
In our first iteration we'll create an error and add a code property:
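// a sketch: an even-only constraint reported via an error with a code property
function doTask (amount) {
  if (typeof amount !== 'number') throw new TypeError('amount must be a number')
  if (amount <= 0) throw new RangeError('amount must be greater than zero')
  if (amount % 2) {
    const err = Error('amount must be even')
    err.code = 'ERR_MUST_BE_EVEN'
    throw err
  }
  return amount / 2
}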
doTask(3)
In the next section, we'll see how to intercept and identify errors but when
this error occurs it can be identified by the code value that was added and
then handled accordingly. Node core APIs use the approach of creating a
native error (either Error or one of the six constructors that inherit
from Error) and adding a code property. For a list of possible error codes
see "Node.js Error Codes".
We can also inherit from Error ourselves to create a custom error instance
for a particular use case. Let's create an OddError constructor:
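// a sketch; the exact message and name handling are illustrative
class OddError extends Error {
  constructor (varName = '') {
    super(varName + ' must be even')
  }
  get name () {
    return 'OddError'
  }
}

// doTask updated to throw it for odd amounts
function doTask (amount) {
  if (typeof amount !== 'number') throw new TypeError('amount must be a number')
  if (amount <= 0) throw new RangeError('amount must be greater than zero')
  if (amount % 2) throw new OddError('amount')
  return amount / 2
}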
doTask(3)
When executed with the updated error this results in the following:
Try/Catch
When an error is thrown in a normal synchronous function it can be handled
with a try/catch block.
Using the same code from the previous section, we'll wrap
the doTask(3) function call with a try/catch block:
try {
const result = doTask(3)
console.log('result', result)
} catch (err) {
console.error('Error caught: ', err)
}
try {
const result = doTask(4)
console.log('result', result)
} catch (err) {
console.error('Error caught: ', err)
}
When the invocation is doTask(4), doTask does not throw an error and so
program execution proceeds to the next line, console.log('result',
result), which outputs result 2. When the input is invalid, for
instance doTask(3) the doTask function will throw and so program execution
does not proceed to the next line but instead jumps to the catch block.
Rather than just logging the error, we can determine what kind of error has
occurred and handle it accordingly:
try {
const result = doTask(4)
console.log('result', result)
} catch (err) {
if (err instanceof TypeError) {
console.error('wrong type')
} else if (err instanceof RangeError) {
console.error('out of range')
} else if (err instanceof OddError) {
console.error('cannot be odd')
} else {
console.error('Unknown error', err)
}
}
Let's take the above code but change the input for the doTask call in the
following three ways:
doTask(3)
doTask('here is some invalid input')
doTask(-1)
If we execute the code after each change, each error case will lead to a
different outcome:
The first case causes an instance of our custom OddError constructor to be
thrown, this is detected by checking whether the caught error (err) is an
instance of OddError and then the message cannot be odd is logged. The
second scenario leads to an instance of TypeError being thrown, which is
determined by checking if err is an instance of TypeError, in which
case wrong type is output. In the third variation an instance
of RangeError is thrown; the caught error is determined to be an instance
of RangeError and then out of range is printed to the terminal.
However, relying on instance checks alone can be misleading. Consider what
happens if the result of doTask(4), which is a number, is erroneously called
as a function:
try {
const result = doTask(4)
result()
console.log('result', result)
} catch (err) {
if (err instanceof TypeError) {
console.error('wrong type')
} else if (err instanceof RangeError) {
console.error('out of range')
} else if (err.code === 'ERR_MUST_BE_EVEN') {
console.error('cannot be odd')
} else {
console.error('Unknown error', err)
}
}
Let's write a small utility function for adding a code to an error object:
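function codify (err, code) {
  err.code = code
  return err
}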
Now we'll pass the TypeError and RangeError objects to codify with context
specific error codes:
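// a sketch; the code strings match those checked in the catch block below
function doTask (amount) {
  if (typeof amount !== 'number') {
    throw codify(new TypeError('amount must be a number'), 'ERR_AMOUNT_MUST_BE_NUMBER')
  }
  if (amount <= 0) {
    throw codify(new RangeError('amount must be greater than zero'), 'ERRO_AMOUNT_MUST_EXCEED_ZERO')
  }
  if (amount % 2) throw codify(new OddError('amount'), 'ERR_MUST_BE_EVEN')
  return amount / 2
}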
Finally we can update the catch block to check for the code property instead
of using an instance check:
try {
const result = doTask(4)
result()
console.log('result', result)
} catch (err) {
if (err.code === 'ERR_AMOUNT_MUST_BE_NUMBER') {
console.error('wrong type')
} else if (err.code === 'ERRO_AMOUNT_MUST_EXCEED_ZERO') {
console.error('out of range')
} else if (err.code === 'ERR_MUST_BE_EVEN') {
console.error('cannot be odd')
} else {
console.error('Unknown error', err)
}
}
Now erroneously calling result as a function will cause the error checks to
reach the final else branch in the catch block:
It's important to realize that try/catch cannot catch errors that are thrown
in a callback function that is called at some later point. Consider the
following:
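// a sketch: the try/catch wraps the setTimeout call, not the callback
try {
  setTimeout(() => {
    const result = doTask(3)
    console.log('result', result)
  }, 100)
} catch (err) {
  if (err.code === 'ERR_AMOUNT_MUST_BE_NUMBER') {
    console.error('wrong type')
  } else if (err.code === 'ERRO_AMOUNT_MUST_EXCEED_ZERO') {
    console.error('out of range')
  } else if (err.code === 'ERR_MUST_BE_EVEN') {
    console.error('cannot be odd')
  } else {
    console.error('Unknown error', err)
  }
}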
The doTask(3) call will throw an OddError error, but this will not be handled
in the catch block because the function passed to setTimeout is called a
hundred milliseconds later. By this time the try/catch block has already been
executed, so the error is not handled. To handle it, the try/catch must be
moved inside the callback function passed to setTimeout:
setTimeout(() => {
try {
const result = doTask(3)
console.log('result', result)
} catch (err) {
if (err.code === 'ERR_AMOUNT_MUST_BE_NUMBER') {
console.error('wrong type')
} else if (err.code === 'ERRO_AMOUNT_MUST_EXCEED_ZERO') {
console.error('out of range')
} else if (err.code === 'ERR_MUST_BE_EVEN') {
console.error('cannot be odd')
} else {
console.error('Unknown error', err)
}
}
}, 100)
Rejections
In Chapter 8, we explored asynchronous syntax and patterns focusing on
callback patterns, Promise abstractions and async/await syntax. So far we
have dealt with errors that occur in synchronous code, meaning that
a throw occurs in a normal synchronous function (one that isn't async/await,
promise-based or callback-based). A throw in a synchronous context is
known as an exception. When a promise rejects, it's representing an
asynchronous error. One way to think about exceptions and rejections is that
exceptions are synchronous errors and rejections are asynchronous errors.
Let's imagine that doTask has some asynchronous work to do, so we can use
a callback based API or we can use a promise-based API
(even async/await is promise-based).
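A sketch of the promise-based doTask (reusing the codify helper and OddError
class from earlier):
function doTask (amount) {
  return new Promise((resolve, reject) => {
    if (typeof amount !== 'number') {
      reject(codify(new TypeError('amount must be a number'), 'ERR_AMOUNT_MUST_BE_NUMBER'))
      return
    }
    if (amount <= 0) {
      reject(codify(new RangeError('amount must be greater than zero'), 'ERRO_AMOUNT_MUST_EXCEED_ZERO'))
      return
    }
    if (amount % 2) {
      reject(codify(new OddError('amount'), 'ERR_MUST_BE_EVEN'))
      return
    }
    resolve(amount / 2)
  })
}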
doTask(3)
The promise is created using the Promise constructor, see MDN web docs
- "Constructor Syntax" for full details. The function passed to Promise is
called the tether function, it takes two
arguments, resolve and reject which are also functions. We
call resolve when the operation is a success, or reject when it is a failure.
In this conversion, we're passing an error into reject for each of our error
cases so that the returned promise will reject when doTask is passed invalid
input.
doTask(3)
.then((result) => {
console.log('result', result)
})
.catch((err) => {
if (err.code === 'ERR_AMOUNT_MUST_BE_NUMBER') {
console.error('wrong type')
} else if (err.code === 'ERRO_AMOUNT_MUST_EXCEED_ZERO') {
console.error('out of range')
} else if (err.code === 'ERR_MUST_BE_EVEN') {
console.error('cannot be odd')
} else {
console.error('Unknown error', err)
}
})
Let's modify the then handler so that a throw occurs inside the handler
function:
doTask(4)
.then((result) => {
throw Error('spanner in the works')
})
.catch((err) => {
if (err instanceof TypeError) {
console.error('wrong type')
} else if (err instanceof RangeError) {
console.error('out of range')
} else if (err.code === 'ERR_MUST_BE_EVEN') {
console.error('cannot be odd')
} else {
console.error('Unknown error', err)
}
})
Async Try/Catch
The async/await syntax supports try/catch of rejections. In other words, we
can use try/catch on asynchronous promise-based APIs instead of
using then and catch handlers as in the previous section. Let's create an async
function named run and reintroduce the same try/catch pattern that was
used when calling the synchronous form of doTask:
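// a sketch mirroring the earlier synchronous try/catch
async function run () {
  try {
    const result = await doTask(3)
    console.log('result', result)
  } catch (err) {
    if (err.code === 'ERR_AMOUNT_MUST_BE_NUMBER') {
      console.error('wrong type')
    } else if (err.code === 'ERRO_AMOUNT_MUST_EXCEED_ZERO') {
      console.error('out of range')
    } else if (err.code === 'ERR_MUST_BE_EVEN') {
      console.error('cannot be odd')
    } else {
      console.error('Unknown error', err)
    }
  }
}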
run()
The only difference, other than wrapping the try/catch in an async function,
is that we await doTask(3) so that the async function can handle the
promise automatically. Since 3 is an odd number, the promise returned
from doTask will call reject with our custom OddError and the catch block
will identify the code property and then output cannot be odd:
Propagation
Error propagation is where, instead of handling the error, we make it the
responsibility of the caller instead. We have a doTask function that may
throw, and a run function which calls doTask and handles the error. When
using async/await functions if we want to propagate an error we simply
rethrow it.
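A sketch of an async run function that propagates errors by rethrowing them:
async function run () {
  try {
    const result = await doTask('not a valid input')
    console.log('result', result)
  } catch (err) {
    if (err.code === 'ERR_AMOUNT_MUST_BE_NUMBER') {
      throw Error('wrong type')
    } else if (err.code === 'ERRO_AMOUNT_MUST_EXCEED_ZERO') {
      throw Error('out of range')
    } else if (err.code === 'ERR_MUST_BE_EVEN') {
      throw Error('cannot be odd')
    } else {
      throw err
    }
  }
}

run().catch((err) => { console.error('Error caught', err) })

For contrast, here is the same propagation pattern with a synchronous doTask
and a synchronous run: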
function run () {
try {
const result = doTask('not a valid input')
console.log('result', result)
} catch (err) {
if (err.code === 'ERR_AMOUNT_MUST_BE_NUMBER') {
throw Error('wrong type')
} else if (err.code === 'ERRO_AMOUNT_MUST_EXCEED_ZERO') {
throw Error('out of range')
} else if (err.code === 'ERR_MUST_BE_EVEN') {
throw Error('cannot be odd')
} else {
throw err
}
}
}
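// a sketch: with a synchronous run, a normal try/catch outside of run
// handles the propagated error
try {
  run()
} catch (err) {
  console.error('Error caught', err)
}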
In addition to removing the async keyword remove the await keyword from
within the try block of the run function because we're now back to dealing
with synchronous execution. The doTask function returns a number again,
instead of a promise. The run function is also now synchronous, since
the async keyword was removed it no longer returns a promise. This means
we can't use a catch handler, but we can use try/catch as normal. The net
effect is that now a normal exception is thrown and handled in
the catch block outside of run.
Finally for the sake of exhaustive exploration of error propagation, we'll look
at the same example using callback-based syntax. In Chapter 8, we explore
error-first callbacks, convert doTask to pass errors as the first argument of a
callback:
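A sketch of the callback-based conversion, again using plain errors with code properties in place of the custom error classes:

function doTask (amount, cb) {
  if (typeof amount !== 'number') {
    cb(Object.assign(new TypeError('amount must be a number'), { code: 'ERR_AMOUNT_MUST_BE_NUMBER' }))
    return
  }
  if (amount <= 0) {
    cb(Object.assign(new RangeError('amount must exceed zero'), { code: 'ERR_AMOUNT_MUST_EXCEED_ZERO' }))
    return
  }
  if (amount % 2) {
    cb(Object.assign(Error('amount must be even'), { code: 'ERR_MUST_BE_EVEN' }))
    return
  }
  cb(null, amount / 2)
}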
Similarly the run function has to be adapted to take a callback (cb) so that
errors can propagate via that callback function. When calling doTask we
need to now supply a callback function and check whether the
first err argument of the callback is truthy to generate the equivalent of a
catch block:
function run (cb) {
  doTask('not a valid input', (err, result) => {
    if (err) {
      cb(err) // propagate via the callback, the equivalent of rethrowing
      return
    }
    console.log('result', result)
  })
}
run((err) => {
  if (err) console.error('Error caught', err)
})
Finally, at the end of the above code we call run and pass it a callback
function, which checks whether the first argument (err) is truthy and if it is
the error is logged in the same way as in the other two forms:
When the Buffer constructor was first introduced into Node.js the JavaScript
language did not have a native binary type. As the language evolved
the ArrayBuffer and a variety of Typed Arrays were introduced to provide
different "views" of a buffer. For example, an ArrayBuffer instance be
accessed with a Float64Array where each set of 8 bytes is interpreted as a
64-bit floating point number, or an Int32Array where each 4 bytes
represents a 32bit, two's complement signed integer or a Uint8Array where
each byte represents an unsigned integer between 0-255. For more info and
a full list of possible typed arrays see "JavaScript Typed Arrays" by MDN web
docs.
This means there are additional APIs available beyond the
Buffer methods. For more information, see "Uint8Array" by MDN web docs.
And for a full list of the Buffer APIs which sit on top of the Uint8Array API
see Node.js Documentation.
Allocating Buffers
Usually a constructor would be called with the new keyword, however
with Buffer this is deprecated and advised against. Do not instantiate
buffers using new.
The correct way to allocate a buffer of a certain amount of bytes is to
use Buffer.alloc:
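For instance, the following allocates a zero-filled buffer of 10 bytes (the size is illustrative):

const buffer = Buffer.alloc(10)
console.log(buffer) // <Buffer 00 00 00 00 00 00 00 00 00 00>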
One of the reasons that new Buffer is deprecated is because it used to have
the Buffer.allocUnsafe behavior and now has the Buffer.alloc behavior
which means using new Buffer will have a different outcome on older Node
versions. The other reason is that new Buffer also accepts strings.
The key take-away from this section is: if we need to safely create a buffer,
use Buffer.alloc.
Even though there is one character in the string, it has a length of 2. This is
to do with how Unicode symbols work, but explaining the reasons for this in
depth are far out of scope for this subject. However, for a full deep dive into
reasons for a single character string having a length of 2 see the following
article "JavaScript Has a Unicode Problem" by Mathias Bynes.
It can also result in different buffer sizes: with UTF16LE encoding, the
character A is two bytes, whereas 'A'.length would be 1.
The supported byte-to-text encodings are hex and base64. Supplying one of
these encodings allows us to represent the data in a string, this can be useful
for sending data across the wire in a safe format.
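For example:

const buf = Buffer.from('hello world')
console.log(buf.toString('hex'))    // 68656c6c6f20776f726c64
console.log(buf.toString('base64')) // aGVsbG8gd29ybGQ=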
Calling decoder.write will output a character only when all of the bytes
representing that character have been written to the decoder:
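A minimal sketch using the string_decoder module (the euro sign is used here purely as an example of a multibyte character):

const { StringDecoder } = require('string_decoder')
const decoder = new StringDecoder('utf8')
// '€' is three bytes in UTF-8: 0xE2 0x82 0xAC
console.log(decoder.write(Buffer.from([0xE2]))) // '' (incomplete character)
console.log(decoder.write(Buffer.from([0x82]))) // '' (still incomplete)
console.log(decoder.write(Buffer.from([0xAC]))) // '€' (all bytes received)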
Stream Types
The Node core stream module exposes six constructors for creating streams:
Stream
Readable
Writable
Duplex
Transform
PassThrough
The Stream constructor is the default export of the stream module and
inherits from the EventEmitter constructor from the events module.
The Stream constructor is rarely used directly, but is inherited from by the
other constructors.
The only thing the Stream constructor implements is the pipe method, which
we will cover later in this section.
The main events emitted by various Stream implementations that one may
commonly encounter in application-level code are:
data
end
finish
close
error
The data and end events will be discussed on the "Readable Streams" page
later in this section; the finish event is emitted by Writable streams when there
is nothing left to write.
The close and error events are common to all streams. The error event
may be emitted when a stream encounters an error, the close event may be
emitted if a stream is destroyed which may happen if an underlying resource
is unexpectedly closed. It's noteworthy that there are four events that could
signify the end of a stream. On the "Determining End-of-Stream" page
further in this section, we'll discuss a utility function that makes it easier to
detect when a stream has ended.
Stream Modes
There are two stream modes:
Binary streams
Object streams
In object mode streams can read or write JavaScript objects and all primitives
(strings, numbers) except null, so the name is a slight misnomer. In Node
core, most if not all object mode streams deal with strings. On the next
pages the differences between these two modes will be covered as we
explore the different stream types.
Readable Streams
The Readable constructor creates readable streams. A readable stream
could be used to read a file, read data from an incoming HTTP request, or
read user input from a command prompt to name a few examples.
The Readable constructor inherits from the Stream constructor which inherits
from the EventEmitter constructor, so readable streams are event emitters.
As data becomes available, a readable stream emits a data event.
'use strict'
const fs = require('fs')
const readable = fs.createReadStream(__filename)
readable.on('data', (data) => { console.log(' got data', data) })
readable.on('end', () => { console.log(' finished reading') })
Readable streams are usually connected to an I/O layer via a C-binding, but
we can create a contrived readable stream ourselves using
the Readable constructor:
'use strict'
const { Readable } = require('stream')
const createReadStream = () => {
const data = ['some', 'data', 'to', 'read']
return new Readable({
read () {
if (data.length === 0) this.push(null)
else this.push(data.shift())
}
})
}
const readable = createReadStream()
readable.on('data', (data) => { console.log('got data', data) })
readable.on('end', () => { console.log('finished reading') })
When this is executed four data events are emitted, because our
implementation pushes each item in the stream. The read method we supply
to the options object passed to the Readable constructor takes
a size argument which is used in other implementations, such as reading a
file, to determine how many bytes to read. As we discussed, this would
typically be the value set by the highWaterMark option which defaults to
16kb.
Notice how we pushed strings to our readable stream but when we pick them
up in the data event they are buffers. Readable streams emit buffers by
default, which makes sense since most use-cases for readable streams deal
with binary data.
'use strict'
const { Readable } = require('stream')
const createReadStream = () => {
const data = ['some', 'data', 'to', 'read']
return new Readable({
encoding: 'utf8',
read () {
if (data.length === 0) this.push(null)
else this.push(data.shift())
}
})
}
const readable = createReadStream()
readable.on('data', (data) => { console.log('got data', data) })
readable.on('end', () => { console.log('finished reading') })
If we were to run this example code again with this one line changed, we
would see the following:
Now when each data event is emitted it receives a string instead of a buffer.
However because the default stream mode is objectMode: false, the string
is pushed to the readable stream, converted to a buffer and then decoded to
a string using UTF8.
'use strict'
const { Readable } = require('stream')
const createReadStream = () => {
const data = ['some', 'data', 'to', 'read']
return new Readable({
objectMode: true,
read () {
if (data.length === 0) this.push(null)
else this.push(data.pop())
}
})
}
const readable = createReadStream()
readable.on('data', (data) => { console.log('got data', data) })
readable.on('end', () => { console.log('finished reading') })
However this time the string is being sent from the readable stream without
converting to a buffer first.
Our code example can be condensed further using the Readable.from utility
method which creates streams from iterable data structures, like arrays:
'use strict'
const { Readable } = require('stream')
const readable = Readable.from(['some', 'data', 'to', 'read'])
readable.on('data', (data) => { console.log('got data', data) })
readable.on('end', () => { console.log('finished reading') })
This will result in the same output, the data events will receive the data as
strings.
Writable Streams
The Writable constructor creates writable streams. A writable stream could
be used to write a file, write data to an HTTP response, or write to the
terminal. The Writable constructor inherits from the Stream constructor
which inherits from the EventEmitter constructor, so writable streams are
event emitters.
'use strict'
const fs = require('fs')
const writable = fs.createWriteStream('./out')
writable.on('finish', () => { console.log('finished writing') })
writable.write('A\n')
writable.write('B\n')
writable.write('C\n')
writable.end('nothing more to write')
The write method can be called multiple times, the end method will also
write a final payload to the stream before ending it. When the stream is
ended, the finish event is emitted. Our example code will take the string
inputs, convert them to Buffer instances and then write them to the out file.
Once it writes the final line it will output finished writing:
As with the read stream example, let's not focus on the fs module at this
point, the characteristics of writable streams are universal.
Also similar to readable streams, writable streams are mostly useful for I/O,
which means integrating a writable stream with a native C-binding, but we
can likewise create a contrived write stream example:
'use strict'
const { Writable } = require('stream')
const createWriteStream = (data) => {
return new Writable({
write (chunk, enc, next) {
data.push(chunk)
next()
}
})
}
const data = []
const writable = createWriteStream(data)
writable.on('finish', () => { console.log('finished writing',
data) })
writable.write('A\n')
writable.write('B\n')
writable.write('C\n')
writable.end('nothing more to write')
In our implementation we add each chunk to the data array that we pass
into our createWriteStream function.
'use strict'
const { Writable } = require('stream')
const createWriteStream = (data) => {
return new Writable({
decodeStrings: false,
write (chunk, enc, next) {
data.push(chunk)
next()
}
})
}
const data = []
const writable = createWriteStream(data)
writable.on('finish', () => { console.log('finished writing',
data) })
writable.write('A\n')
writable.write(1)
writable.end('nothing more to write')
The above code would result in an error, causing the process to crash
because we're attempting to write a JavaScript value that isn't a string to a
binary stream:
Stream errors can be handled to avoid crashing the process, because
streams are event emitters and the same special case for the error event
applies. We'll explore that more on the "Determining End-of-Stream" page
later in this section.
If we want to support strings and any other JavaScript value, we can instead
set objectMode to true to create an object-mode writable stream:
'use strict'
const { Writable } = require('stream')
const createWriteStream = (data) => {
return new Writable({
objectMode: true,
write (chunk, enc, next) {
data.push(chunk)
next()
}
})
}
const data = []
const writable = createWriteStream(data)
writable.on('finish', () => { console.log('finished writing',
data) })
writable.write('A\n')
writable.write(1)
writable.end('nothing more to write')
Readable-Writable Streams
In addition to the Readable and Writable stream constructors there are
three more core stream constructors that have both readable and writable
interfaces:
Duplex
Transform
PassThrough
We will explore consuming all three, but will only cover creating the most
commonly user-implemented of them: the Transform stream.
'use strict'
const net = require('net')
net.createServer((socket) => {
const interval = setInterval(() => {
socket.write('beat')
}, 1000)
socket.on('data', (data) => {
socket.write(data.toString().toUpperCase())
})
socket.on('end', () => { clearInterval(interval) })
}).listen(3000)
In order to interact with our server, we'll also create a small client. The client
socket is also a Duplex stream:
'use strict'
const net = require('net')
const socket = net.connect(3000)
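// The rest of the client, as described below (the log label is illustrative):
socket.on('data', (data) => {
  console.log('got data:', data.toString())
})
socket.write('hello')
setTimeout(() => {
  socket.write('all done')
  setTimeout(() => { socket.end() }, 250)
}, 3250)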
The net.connect method returns a Duplex stream which represents the TCP
client socket.
We listen for data events and log out the incoming data buffers, converting
them to strings for display purposes. On the writable side,
the socket.write method is called with a string, after three and a quarter
seconds another payload is written, and another quarter second later the
stream is ended by calling socket.end.
If we start both of the code examples as separate processes we can view the
interaction:
The purpose of this example is not to understand the net module in its
entirety but to understand that it exposes a common API abstraction,
a Duplex stream and to see how interaction with a Duplex stream works.
'use strict'
const { createGzip } = require('zlib')
const transform = createGzip()
transform.on('data', (data) => {
console.log('got gzip data', data.toString('base64'))
})
transform.write('first')
setTimeout(() => {
transform.end('second')
}, 500)
Transform streams establish a causal relationship between what is written to
the stream and what is read from it: in the example above, compressed data is
emitted on the readable side as a result of the uncompressed data written to
the writable side. The way that Transform streams create this causal
relationship is through how a transform stream is created. Instead of
supplying read and write options functions, a transform option is passed to
the Transform constructor:
'use strict'
const { Transform } = require('stream')
const { scrypt } = require('crypto')
const createTransformStream = () => {
return new Transform({
decodeStrings: false,
encoding: 'hex',
transform (chunk, enc, next) {
scrypt(chunk, 'a-salt', 32, (err, key) => {
if (err) {
next(err)
return
}
next(null, key)
})
}
})
}
const transform = createTransformStream()
transform.on('data', (data) => {
console.log('got data:', data)
})
transform.write('A\n')
transform.write('B\n')
transform.write('C\n')
transform.end('nothing more to write')
The transform option function has the same signature as the write option
function passed to Writable streams. It accepts chunk, enc and
the next function. However, in the transform option function
the next function can be passed a second argument which should be the
result of applying some kind of transform operation to the incoming chunk.
The crypto.scrypt callback is called once a key is derived from the inputs,
or may be called if there was an error. In the event of an error we pass the
error object to the next callback. In that scenario this would cause our
transform stream to emit an error event. In the success case we
call next(null, key). Passing the first argument as null indicates that
there was no error, and the second argument is emitted as a data event
from the readable side of the stream. Once we've instantiated our stream
and assigned it to the transform constant, we write some payloads to the
stream and then log out the hex strings we receive in the data event
listener. The data is received as hex because we set the encoding option
(part of the Readable stream options) to dictate that emitted data would be
decoded to hex format. This produces the following result:
The PassThrough constructor inherits from the Transform constructor. It's
essentially a transform stream where no transform is applied. For those
familiar with Functional Programming this has similar applicability to
the identity function ((val) => val), that is, it's a useful placeholder
when a transform stream is expected but no transform is desired. See Lab
12.2 "Create a Transform Stream" to see an example of PassThrough being
used.
Determining End-of-Stream
As we discussed earlier, there are at least four ways for a stream to
potentially become inoperative:
close event
error event
finish event
end event
We often need to know when a stream has closed so that resources can be
deallocated, otherwise memory leaks become likely.
'use strict'
const net = require('net')
const { finished } = require('stream')
net.createServer((socket) => {
const interval = setInterval(() => {
socket.write('beat')
}, 1000)
socket.on('data', (data) => {
socket.write(data.toString().toUpperCase())
})
finished(socket, (err) => {
if (err) {
console.error('there was a socket error', err)
}
clearInterval(interval)
})
}).listen(3000)
Piping Streams
We can now put everything we've learned together and discover how to use
a terse yet powerful abstraction: piping. Piping has been available in
command line shells for decades, for instance here's a common Bash
command:
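cat some-file | grep find-something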
The pipe operator instructs the console to read the stream of output coming
from the left-hand command (cat some-file) and write that data to the
right-hand command (grep find-something). The concept is the same in
Node, but the pipe method is used.
Let's adapt the TCP client from the "Readable-Writable Streams" page
to use the pipe method. Here is the client from earlier:
'use strict'
const net = require('net')
const socket = net.connect(3000)
socket.write('hello')
setTimeout(() => {
socket.write('all done')
setTimeout(() => {
socket.end()
}, 250)
}, 3250)
'use strict'
const net = require('net')
const socket = net.connect(3000)
socket.pipe(process.stdout)
socket.write('hello')
setTimeout(() => {
socket.write('all done')
setTimeout(() => {
socket.end()
}, 250)
}, 3250)
Starting the example server from earlier and running the modified client
results in the following:
Since pipe returns the stream passed to it, it is possible to chain pipe calls
together: streamA.pipe(streamB).pipe(streamC). This is a commonly
observed practice, but it's also bad practice to create pipelines this way. If a
stream in the middle fails or closes for any reason, the other streams in the
pipeline will not automatically close. This can create severe memory leaks
and other bugs. The correct way to pipe multiple streams is to use
the stream.pipeline utility function.
Let's combine the Transform stream we created on the "Readable-Writable
Streams" pages and the TCP server as we modified it on the "Determining
End-of-Stream" pages in order to create a pipeline of streams:
'use strict'
const net = require('net')
const { Transform, pipeline } = require('stream')
const { scrypt } = require('crypto')
const createTransformStream = () => {
return new Transform({
decodeStrings: false,
encoding: 'hex',
transform (chunk, enc, next) {
scrypt(chunk, 'a-salt', 32, (err, key) => {
if (err) {
next(err)
return
}
next(null, key)
})
}
})
}
net.createServer((socket) => {
const transform = createTransformStream()
const interval = setInterval(() => {
socket.write('beat')
}, 1000)
pipeline(socket, transform, socket, (err) => {
if (err) {
console.error('there was a socket error', err)
}
clearInterval(interval)
})
}).listen(3000)
If we start both the modified TCP server and modified TCP client this will lead
to the following result:
The first 64 characters are the hex representation of a key derived from
the 'hello' string that the client Node process wrote to the client
TCP socket Duplex stream. This was emitted as a data event on the
TCP socket Duplex stream in the server Node process. It was then
automatically written to our transform stream instance, which derived a key
using crypto.scrypt within the transform option passed to
the Transform constructor in our createTransformStream function. The
result was then passed as the second argument of the next callback. This
then resulted in a data event being emitted from the transform stream with
the hex string of the derived key. That data was then written back to the
server-side socket stream. Back in the client Node process, this incoming
data was emitted as a data event by the client-side socket stream and
automatically written to the process.stdout writable stream by the client
Node process. The next 12 characters are the three beats written at one
second intervals in the server. The final 64 characters are the hex
representation of the derived key of the 'all done' string written to the
client side socket. From there that payload goes through the exact same
process as the first 'hello' payload.
The pipeline command will call pipe on every stream passed to it, and
allows a function to be passed as the final argument. Note how we removed
the finished utility method. This is because the final function passed to
the pipeline function will be called if any of the streams in the pipeline
close or fail for any reason.
Streams are a very large subject, this section has cut a pathway to becoming
both productive and safe with streams. See Node.js Documentation to get
even deeper on streams.
Before locating a relative file path, we often need to know where the
particular file being executed is located. For this there are two variables that
are always present in every module: __filename and __dirname.
The __filename variable holds the absolute path to the currently executing
file, and the __dirname variable holds the absolute path to the directory that
the currently executing file is in.
'use strict'
console.log('current filename', __filename)
console.log('current dirname', __dirname)
'use strict'
const { join } = require('path')
console.log('out file:', join(__dirname, 'out.txt'))
Apart from path.isAbsolute which as the name suggests will return true if
a given path is absolute, the available path methods can be broadly divided
into path builders and path deconstructors.
path.relative
Given two absolute paths, calculates the relative path between them.
path.resolve
Accepts multiple string arguments representing paths. Conceptually
each path represents navigation to that path.
The path.resolve function returns a string of the path that would
result from navigating to each of the directories in order using the
command line cd command. For instance path.resolve('/foo',
'bar', 'baz') would return '/foo/bar/baz', which is akin to
executing cd /foo then cd bar then cd baz on the command line, and
then finding out what the current working directory is.
path.normalize
Resolves .. and . dot in paths and strips extra slashes, for
instance path.normalize('/foo/../bar//baz') would
return '/bar/baz'.
path.format
Builds a string from an object. The object shape
that path.format accepts, corresponds to the object returned
from path.parse which we'll explore next.
The path deconstructors
are path.parse, path.extname, path.dirname and path.basename. Let's
explore these with a code example:
'use strict'
const { parse, basename, dirname, extname } = require('path')
console.log('filename parsed:', parse(__filename))
console.log('filename basename:', basename(__filename))
console.log('filename dirname:', dirname(__filename))
console.log('filename extname:', extname(__filename))
On Windows the output would be similar except the root property of the
parsed object would contain the drive letter, e.g. 'C:\\' and both
the dir property and the result of the dirname method would return paths
with a drive letter and backslashes instead of forward slashes.
The parse method returns an object with root, dir, base, ext,
and name properties. The root and name values can only be ascertained with
the path module by using the parse method.
The base, dir and ext properties can be individually calculated with
the path.basename, path.dirname and path.extname methods respectively.
This section has provided an overview with focus on common usage. Refer to
the Node core path Documentation to learn more.
The higher level methods for reading and writing are provided in four
abstraction types:
Synchronous
Callback based
Promise based
Stream based
All the names of synchronous methods in the fs module end with Sync. For
instance, fs.readFileSync. Synchronous methods will block anything else
from happening in the process until they have resolved. These are
convenient for loading data when a program starts, but should mostly be
avoided after that. If a synchronous method stops anything else from
happening, it means the process can't handle or make requests or do any
kind of I/O until the synchronous operation has completed.
'use strict'
const { readFileSync } = require('fs')
const contents = readFileSync(__filename)
console.log(contents)
The above code will synchronously read its own contents into a buffer and
then print the buffer:
'use strict'
const { readFileSync } = require('fs')
const contents = readFileSync(__filename, {encoding: 'utf8'})
console.log(contents)
'use strict'
const { join } = require('path')
const { readFileSync, writeFileSync } = require('fs')
const contents = readFileSync(__filename, {encoding: 'utf8'})
writeFileSync(join(__dirname, 'out.txt'), contents.toUpperCase())
In this example, instead of logging the contents out, we've upper cased the
contents and written it to an out.txt file in the same directory:
An options object can be added, with a flag option set to 'a' to open a file
in append mode:
'use strict'
const { join } = require('path')
const { readFileSync, writeFileSync } = require('fs')
const contents = readFileSync(__filename, {encoding: 'utf8'})
writeFileSync(join(__dirname, 'out.txt'), contents.toUpperCase(),
{
flag: 'a'
})
If we run that same code again the out.txt file will have the altered code
added to it:
For a full list of supported flags, see the File System Flags section of the Node.js
Documentation.
In the case of the *Sync APIs, control flow is very simple because execution
is sequential: the chronological ordering maps directly to the order of
instructions in the file. However, Node works best when I/O is managed in
the background until it is ready to be processed. For this, there's the callback
and promise based filesystem APIs. The asynchronous control flow was
discussed at length in Chapter 8, the choice on which abstraction to use
depends heavily on project context. So let's explore both, starting with
callback-based reading and writing.
'use strict'
const { readFile } = require('fs')
readFile(__filename, {encoding: 'utf8'}, (err, contents) => {
if (err) {
console.error(err)
return
}
console.log(contents)
})
When the process is executed this achieves the same objective, it will print
the file contents to the terminal:
However, the actual behavior of the I/O operation and the JavaScript engine
is different. In the readFileSync case execution is paused until the file has
been read, whereas in this example execution is free to continue while the
read operation is performed. Once the read operation is completed, then the
callback function that we passed as the third argument to readFile is called
with the result. This allows for the process to perform other tasks (accepting
an HTTP request for instance).
'use strict'
const { join } = require('path')
const { readFile, writeFile } = require('fs')
readFile(__filename, {encoding: 'utf8'}, (err, contents) => {
if (err) {
console.error(err)
return
}
const out = join(__dirname, 'out.txt')
writeFile(out, contents.toUpperCase(), (err) => {
if (err) { console.error(err) }
})
})
The fs/promises API provides most of the same asynchronous methods that
are available on fs, but the methods return promises instead of accepting
callbacks.
Let's look at the same reading and writing example using fs/promises and
using async/await to resolve the promises:
'use strict'
const { join } = require('path')
const { readFile, writeFile } = require('fs/promises')
async function run () {
const contents = await readFile(__filename, {encoding: 'utf8'})
const out = join(__dirname, 'out.txt')
await writeFile(out, contents.toUpperCase())
}
run().catch(console.error)
File Streams
Recall from the previous section that the fs module has four API types:
Synchronous
Callback-based
Promise-based
Stream-based
The fs module
has fs.createReadStream and fs.createWriteStream methods which allow
us to read and write files in chunks. Streams are ideal when handling very
large files that can be processed incrementally.
'use strict'
const { pipeline } = require('stream')
const { join } = require('path')
const { createReadStream, createWriteStream } = require('fs')
pipeline(
createReadStream(__filename),
createWriteStream(join(__dirname, 'out.txt')),
(err) => {
if (err) {
console.error(err)
return
}
console.log('finished writing')
}
)
This pattern is excellent if dealing with a large file because the memory
usage will stay constant as the file is read in small chunks and written in
small chunks.
'use strict'
const { pipeline } = require('stream')
const { join } = require('path')
const { createReadStream, createWriteStream } = require('fs')
const { Transform } = require('stream')
const createUppercaseStream = () => {
return new Transform({
transform (chunk, enc, next) {
const uppercased = chunk.toString().toUpperCase()
next(null, uppercased)
}
})
}
pipeline(
createReadStream(__filename),
createUppercaseStream(),
createWriteStream(join(__dirname, 'out.txt')),
(err) => {
if (err) {
console.error(err)
return
}
console.log('finished writing')
}
)
Our pipeline now reads chunks from the file read stream, sends them
through our transform stream where they are upper-cased and then sent on
to the write stream to achieve the same result of upper-casing the content
and writing it to out.txt:
If necessary, review Chapter 12 again to fully understand this example.
Reading Directories
Directories are a special type of file, which hold a catalog of files. Similar to
files the fs module provides multiple ways to read a directory:
Synchronous
Callback-based
Promise-based
An async iterable that inherits from fs.Dir
While it will be explored here, going into depth on the last bullet point is
beyond the scope of this chapter, but see Class fs.Dir of the Node.js
Documentation for more information.
The pros and cons of each API approach are the same as for reading and writing
files. Synchronous execution is recommended against when asynchronous
operations are relied upon (such as when serving HTTP requests). Callback or
promise-based are best for most cases. The stream-like API would be best for
extremely large directories.
example.js
file-a
file-b
file-c
The example.js file would be the file that executes our code. Let's look at
synchronous, callback-based and promise-based at the same time:
'use strict'
const { readdirSync, readdir } = require('fs')
const { readdir: readdirProm } = require('fs/promises')
try {
console.log('sync', readdirSync(__dirname))
} catch (err) {
console.error(err)
}
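// Hedged completion of the callback-based and promise-based reads
// (the 'callback' and 'promise' labels are illustrative):
readdir(__dirname, (err, files) => {
  if (err) {
    console.error(err)
    return
  }
  console.log('callback', files)
})
async function run () {
  const files = await readdirProm(__dirname)
  console.log('promise', files)
}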
run().catch((err) => {
console.error(err)
})
The second section used the readdir callback method, once the directory
has been read the second argument (our callback function) will be called
with the second argument being an array of files in the provided directory (in
each example we've used __dirname, the current directory). In the case of
an error the first argument of our callback function will be an error object, so
we check for it and handle it, returning early from the function. In the
success case, the files are logged out with console.log.
This course does not attempt to cover HTTP, for that see the sibling
course, Node.js Services Development (LFW212). However, for the final part
of this section we'll examine a more advanced case: streaming directory
contents over HTTP in JSON format:
'use strict'
const { createServer } = require('http')
const { Readable, Transform, pipeline } = require('stream')
const { opendir } = require('fs')
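// A sketch along these lines (port, route handling and JSON formatting are illustrative):
const createEntryStream = () => {
  let syntax = '[\n'
  return new Transform({
    writableObjectMode: true,
    readableObjectMode: false,
    transform (entry, enc, next) {
      next(null, `${syntax} "${entry.name}"`)
      syntax = ',\n'
    },
    final (cb) {
      this.push('\n]\n')
      cb()
    }
  })
}
createServer((req, res) => {
  if (req.url !== '/') {
    res.statusCode = 404
    res.end('Not Found')
    return
  }
  opendir(__dirname, (err, dir) => {
    if (err) {
      res.statusCode = 500
      res.end('Server Error')
      return
    }
    const dirStream = Readable.from(dir)
    const entryStream = createEntryStream()
    res.setHeader('Content-Type', 'application/json')
    pipeline(dirStream, entryStream, res, (err) => {
      if (err) console.error(err)
    })
  })
}).listen(3000)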
File Metadata
Metadata about files can be obtained with the following methods:
Let's start by reading the current working directory and finding out whether
each entry is a directory or not.
'use strict'
const { readdirSync, statSync } = require('fs')
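// Sketch: label each entry as a directory or a file (output format is illustrative)
const files = readdirSync('.')
for (const name of files) {
  const stat = statSync(name)
  const typeLabel = stat.isDirectory() ? 'dir: ' : 'file: '
  console.log(typeLabel, name)
}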
Since '.' is passed to readdirSync, the directory that will be read will be
whatever directory we're currently in.
Given a directory structure with the following:
example.js
a-dir
a-file
Where example.js is the file with our code in, if we run node example.js in
that folder, we'll see something like the following:
Let's extend our example with time stats. There are four stats available for
files:
Access time
Change time
Modified time
Birth time
The difference between change time and modified time, is modified time
only applies to writes (although it can be manipulated by fs.utime),
whereas change time applies to writes and any status changes such as
changing permissions or ownership.
With default options, the time stats are offered in two formats, one is
a Date object and the other is milliseconds since the epoch. We'll use the
Date objects and convert them to locale strings.
Let's update our code to output the four different time stats for each file:
'use strict'
const { readdirSync, statSync } = require('fs')
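// Sketch: print the four time stats for each entry as locale strings
const files = readdirSync('.')
for (const name of files) {
  const stat = statSync(name)
  const typeLabel = stat.isDirectory() ? 'dir: ' : 'file: '
  const { atime, birthtime, ctime, mtime } = stat
  console.log(typeLabel, name)
  console.log('atime:', atime.toLocaleString())
  console.log('ctime:', ctime.toLocaleString())
  console.log('mtime:', mtime.toLocaleString())
  console.log('birthtime:', birthtime.toLocaleString())
  console.log()
}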
Let's start by watching the current directory and logging file names
and events:
'use strict'
const { watch } = require('fs')
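// Sketch: log the raw event name and file name for every change in the current directory
watch('.', (evt, filename) => {
  console.log(evt, filename)
})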
The following screenshot shows the above code running in the top terminal,
and file manipulation commands in the bottom section.
The output in the top section is output in real time for each command in the
bottom section. Let's analyze the commands in the bottom section to the
output in the top section:
It may be obvious at this point that the supplied event isn't very useful.
The fs.watch API is part of the low-level functionality of the fs module, it's
repeating the events generated by the underlying operating system. So we
can either use a library like chokidar as discussed at the beginning of this
section or we can query and store information about files to determine that
operations are occurring.
'use strict'
const { join, resolve } = require('path')
const { watch, readdirSync, statSync } = require('fs')
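// Sketch: track directory contents and file stats to infer richer event names
// (the heuristics and event names here are illustrative)
const cwd = resolve('.')
const files = new Set(readdirSync('.'))
watch('.', (evt, filename) => {
  try {
    const { ctimeMs, mtimeMs } = statSync(join(cwd, filename))
    if (files.has(filename) === false) {
      evt = 'created'
      files.add(filename)
    } else if (ctimeMs === mtimeMs) {
      evt = 'content-updated'
    } else {
      evt = 'status-updated'
    }
  } catch (err) {
    if (err.code === 'ENOENT') {
      files.delete(filename)
      evt = 'deleted'
    } else {
      console.error(err)
    }
  } finally {
    console.log(evt, filename)
  }
})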
If we execute our code, and then add a new file and delete it, it will output
more suitable event names:
process.stdin
A Readable stream for process input.
process.stdout
A Writable stream for process output.
process.stderr
A Writable stream for process error output.
Streams were covered in detail earlier on, for any terms that seem
unfamiliar, refer back to Chapter 12.
node -p "crypto.randomBytes(100).toString('hex')"
Since bytes are randomly generated, this will produce different output every
time, but it will always be 200 alphanumeric characters:
Let's start with an example.js file that simply prints that it was initialized
and then exits:
'use strict'
console.log('initialized')
If we attempt to use the command line to pipe the output from the random
byte command into our process, nothing will happen beyond the process
printing that it was initialized:
Let's extend our code so that we
connect process.stdin to process.stdout:
'use strict'
console.log('initialized')
process.stdin.pipe(process.stdout)
This will cause the input that we're piping from the random bytes command
into our process to be written out from our process:
Since we're dealing with streams, we can take the uppercase stream from
the previous chapter and pipe from process.stdin through the uppercase
stream and out to process.stdout:
'use strict'
console.log('initialized')
const { Transform } = require('stream')
const createUppercaseStream = () => {
return new Transform({
transform (chunk, enc, next) {
const uppercased = chunk.toString().toUpperCase()
next(null, uppercased)
}
})
}
const uppercase = createUppercaseStream()
process.stdin.pipe(uppercase).pipe(process.stdout)
It may have been noted that we did not use the pipeline function, but
instead used the pipe method.
The process.stdin, process.stdout and process.stderr streams are
unique in that they never finish, error or close. That is to say, if one of these
streams were to end it would either cause the process to crash or it would
end because the process exited. We could use the stream.finished method
to check that the uppercase stream doesn't close, but in our case we didn't
add error handling to the uppercase stream because any problems that
occur in this scenario should cause the process to crash.
If we execute our code without piping to it, the printed message will indicate
that the process is directly connected to the terminal, and we will be able to
type input into our process which will be transformed and sent back to us:
We can see from this that using the greater than character (>) on the
command line sends output to a given file, in our case out.txt.
While it's beyond the scope of Node, it's worth knowing that if we wanted to
redirect the STDERR output to another file on the command line 2> can be
used:
This command wrote STDOUT to out.txt and STDERR to err.txt. On both
Windows and POSIX systems (Linux, macOS) the number 2 is a common file
handle representing STDERR, which is why the syntax is 2>. In Node,
the process.stderr.fd is 2 and process.stdout.fd is 1 because they are
file write streams. It's actually possible to recreate them with the fs module:
'use strict'
const fs = require('fs')
const myStdout = fs.createWriteStream(null, {fd: 1})
const myStderr = fs.createWriteStream(null, {fd: 2})
myStdout.write('stdout stream')
myStderr.write('stderr stream')
Exiting
When a process has nothing left to do, it exits by itself. For instance, let's
look at this code:
'use strict'
setInterval(() => {
console.log('this interval is keeping the process open')
}, 500)
If we run the above code the log line will continue to print every 500ms, we
can use Ctrl and C to exit:
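A process can also be exited explicitly by calling process.exit. A sketch of adding a timeout that does so (this mirrors the exit-code examples that follow):

'use strict'
setInterval(() => {
  console.log('this interval is keeping the process open')
}, 500)
setTimeout(() => {
  console.log('exit after this')
  process.exit()
}, 1750)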
This will cause the process to exit after the function passed
to setInterval has been called three times:
When exiting a process, an exit status code can also be set. Status codes
are a large subject, and can mean different things on different platforms. The
only exit code that has a uniform meaning across platforms is 0. An exit code
of 0 means the process executed successfully. On Linux and macOS (or more
specifically, Bash, Zsh, Sh, and other *nix shells) we can verify this with the
command echo $? which prints a special variable called $?. On a
Windows cmd.exe terminal we can use echo %ErrorLevel% instead or in
PowerShell the command is $LastExitCode. In the following examples, we'll
be using echo $? but substitute with the relevant command as appropriate.
If we run our code again and look up the exit code we'll see that is 0:
We can pass a different exit code to process.exit. Any non-zero code
indicates failure, and to indicate general failure we can use an exit code of 1
(technically this means "Incorrect function" on Windows but there's a
common understanding that 1 means general failure).
'use strict'
setInterval(() => {
console.log('this interval is keeping the process open')
}, 500)
setTimeout(() => {
console.log('exit after this')
process.exit(1)
}, 1750)
Now, if we check the exit code after running the process it should be 1:
The exit code can also be set independently by
assigning process.exitCode:
'use strict'
setInterval(() => {
console.log('this interval is keeping the process open')
process.exitCode = 1
}, 500)
setTimeout(() => {
console.log('exit after this')
process.exit()
}, 1750)
Process Info
Naturally the process object also contains information about the process,
we'll look at a few here:
The current working directory of the process
The platform on which the process is running
The Process ID
The environment variables that apply to the process
There are other more advanced things to explore, but see the Node.js
Documentation for a comprehensive overview.
Let's look at the first three bullet points in one code example:
'use strict'
console.log('Current Directory', process.cwd())
console.log('Process Platform', process.platform)
console.log('Process ID', process.pid)
The current working directory is whatever folder the process was executed
in. The process.chdir command can also change the current working
directory, in which case process.cwd() would output the new directory.
The process platform indicates the host Operating System. Depending on the
system it can be one of:
As we'll see in a future section the os module also has a platform function
(rather than property) which will return the same values for the same
systems as exist on process.platform.
Process Stats
The process object has methods which allow us to query resource usage.
We're going to look at
the process.uptime(), process.cpuUsage and process.memoryUsage functions.
'use strict'
console.log('Process Uptime', process.uptime())
setTimeout(() => {
console.log('Process Uptime', process.uptime())
}, 1000)
Process uptime is the amount of seconds (with 9 decimal places) that the
process has been executing for. This is not to be confused with host machine
uptime, which we'll see in a future section can be determined using the os
module.
The process.cpuUsage function returns an object with two
properties: user and system. The user property represents time that the
Node process spent using the CPU. The system property represents time that
the kernel spent using the CPU due to activity triggered by the process. Both
properties contain microsecond (one millionth of a second) measurements:
'use strict'
const outputStats = () => {
const uptime = process.uptime()
const { user, system } = process.cpuUsage()
console.log(uptime, user, system, (user + system)/1000000)
}
outputStats()
setTimeout(() => {
outputStats()
const now = Date.now()
// make the CPU do some work:
while (Date.now() - now < 5000) {}
outputStats()
}, 500)
On the second line, we can observe that uptime jumps in the first column
from 0.026 to 0.536 because the setTimeout is 500 milliseconds (or 0.5
seconds). The extra 10 milliseconds will be additional execution time of
outputting stats and setting up the timeout. However, on the same line the
CPU usage only increases by 0.006 seconds. This is because the process was
idling during that time, whereas the third line records that the process was
doing a lot of work. Just over 5 seconds, as intended.
One other observation we can make here is on the first line the total CPU
usage is greater than the uptime. This is because Node may use more than
one CPU core, which can multiply the CPU time used by however many cores
are used during that period.
'use strict'
const stats = [process.memoryUsage()]
let iterations = 5
while (iterations--) {
const arr = []
let i = 10000
// make the CPU do some work:
while (i--) {
arr.push({[Math.random()]: Math.random()})
}
stats.push(process.memoryUsage())
}
console.table(stats)
System Info
The os module can be used to get information about the Operating System.
Let's look at a couple of APIs we can use to find out useful information:
'use strict'
const os = require('os')
console.log('Hostname', os.hostname())
console.log('Home dir', os.homedir())
console.log('Temp dir', os.tmpdir())
This will display the hostname of the operating system, the logged-in user's
home directory and the location of the Operating System temporary
directory. The temporary folder is routinely cleared by the Operating System
so it's a great place to store throwaway files without the need to remove
them later.
There are two ways to identify the Operating System with the os module:
'use strict'
const os = require('os')
console.log('platform', os.platform())
console.log('type', os.type())
On macOS this outputs:
If executed on Windows the first line would be platform win32 and the
second line would be type Windows_NT. On Linux the first line would
be platform linux and the second line would be type Linux. However
there are many more lesser known systems with a uname command
that os.type() would output, too many to list here. See some examples
on Wikipedia.
System Stats
Operating System stats can also be gathered, let's look at:
Uptime
Free memory
Total memory
The os.uptime function returns the amount of time the system has been
running in seconds. The os.freemem and os.totalmem functions return
available system memory and total system memory in bytes:
'use strict'
const os = require('os')
setInterval(() => {
console.log('system uptime', os.uptime())
console.log('freemem', os.freemem())
console.log('totalmem', os.totalmem())
console.log()
}, 1000)
If we execute this code for five seconds and then press Ctrl + C we'll see
something like the following:
In this section we're going to zoom in on the exec and spawn methods
(including their synchronous forms). However, before we do that, let's briefly
cover the other listed methods.
fork Method
The fork method is a specialization of the spawn method. By default, it will
spawn a new Node process of the currently executing JavaScript file
(although a different JavaScript file to execute can be supplied). It also sets
up Interprocess Communication (IPC) with the subprocess by default.
See fork Documentation to learn more.
'use strict'
const { execSync } = require('child_process')
const output = execSync(
`node -e "console.log('subprocess stdio output')"`
)
console.log(output.toString())
This should result in the following outcome:
The execSync method returns a buffer containing the output (from STDOUT)
of the command.
'use strict'
const { execSync } = require('child_process')
const cmd = process.platform === 'win32' ? 'dir' : 'ls'
const output = execSync(cmd)
console.log(output.toString())
'use strict'
const { execSync } = require('child_process')
const output = execSync(
`${process.execPath} -e "console.error('subprocess stdio output')"`
)
console.log(output.toString())
If the subprocess exits with a non-zero exit code, the execSync function will
throw:
'use strict'
const { execSync } = require('child_process')
try {
execSync(`"${process.execPath}" -e "process.exit(1)"`)
} catch (err) {
console.error('CAUGHT ERROR:', err)
}
The error object that we log out in the catch block has some additional
properties. We can see that status is 1, this is because our subprocess
invoked process.exit(1). In a non-zero exit code scenario,
the stderr property of the error object can be very useful. The output array
indices correspond to the standard I/O file descriptors. Recall from the
previous chapter that the file descriptor of STDERR is 2. Ergo
the err.stderr property will contain the same buffer as err.output[2],
so err.stderr or err.output[2] can be used to discover any error
messages written to STDERR by the subprocess. In our case, the STDERR
buffer is empty.
'use strict'
const { execSync } = require('child_process')
try {
execSync(`"${process.execPath}" -e "throw Error('kaboom')"`)
} catch (err) {
console.error('CAUGHT ERROR:', err)
}
The exec method takes a shell command as a string and executes it the
same way as execSync. Unlike execSync the asynchronous exec function
splits the STDOUT and STDERR output by passing them as separate
arguments to the callback function:
'use strict'
const { exec } = require('child_process')
exec(
`"${process.execPath}" -e
"console.log('A');console.error('B')"`,
(err, stdout, stderr) => {
console.log('err', err)
console.log('subprocess stdout: ', stdout.toString())
console.log('subprocess stderr: ', stderr.toString())
}
)
'use strict'
const { exec } = require('child_process')
exec(
`"${process.execPath}" -e "console.log('A'); throw
Error('B')"`,
(err, stdout, stderr) => {
console.log('err', err)
console.log('subprocess stdout: ', stdout.toString())
console.log('subprocess stderr: ', stderr.toString())
}
)
'use strict'
const { spawnSync } = require('child_process')
const result = spawnSync(
process.execPath,
['-e', `console.log('subprocess stdio output')`]
)
console.log(result)
In this example process.execPath (i.e., the full path to the node binary) is
the first argument passed to spawnSync and the second argument is an
array. The first element in the array is the first flag: -e. There's a space
between the -e flag and the content that the flag instructs the node binary to
execute. Therefore that content has to be the second argument of the array.
Also notice the outer double quotes are removed. Executing this code results
in the following:
While the execSync function returns a buffer containing the child process
output, the spawnSync function returns an object containing information
about the process that was spawned. We assigned this to
the result constant and logged it out. This object contains the same
properties that are attached to the error object when execSync throws.
The result.stdout property (and result.output[1]) contains a buffer of
our processes STDOUT output, which should be 'subprocess stdio
output'. Let's find out by updating the console.log(result) line to:
console.log(result.stdout.toString())
Executing the updated code should verify that the result object contains
the expected STDOUT output:
'use strict'
const { spawnSync } = require('child_process')
const result = spawnSync(process.execPath, [`-e`,
`process.exit(1)`])
console.log(result)
Just as there are differences between execSync and spawnSync there are
differences between exec and spawn.
While exec accepts a callback, spawn does not. Both exec and spawn return
a ChildProcess instance however, which
has stdin, stdout and stderr streams and inherits
from EventEmitter allowing for exit code to be obtained after a close event
is emitted. See ChildProcess constructor Documentation for more details.
'use strict'
const { spawn } = require('child_process')
const sp = spawn(
process.execPath,
[`-e`, `console.log('subprocess stdio output')`]
)
sp.stdout.pipe(process.stdout)
const sp = spawn(
process.execPath,
[`-e`, `process.exit(1)`]
)
Running this altered example code will produce the following outcome:
There is no second line of output in our main process in this case as our code
change removed any output to STDOUT.
The exec command doesn't have to take a callback, and it also returns
a ChildProcess instance:
'use strict'
const { exec } = require('child_process')
const sp = exec(
`"${process.execPath}" -e "console.log('subprocess stdio
output')"`
)
console.log('pid is', sp.pid)
sp.stdout.pipe(process.stdout)
This leads to the exact same outcome as the equivalent spawn example:
We'll explore two options that can be passed which control the environment
of the child process: cwd and env.
We'll use spawn for our example but these options are universally the same
for all the child creation methods.
By default, the child process inherits the environment variables of the parent
process:
'use strict'
const { spawn } = require('child_process')
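// Sketch of the example described below: print the child's environment and exit
const sp = spawn(process.execPath, ['-p', 'process.env'])
sp.stdout.pipe(process.stdout)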
This example code creates a child process that executes node with the -
p flag so that it immediately prints process.env and exits.
The stdout stream of the child process is piped to the stdout of the parent
process. So when executed this code will output the environment variables
of the child process:
If we pass an options object with an env property the parent environment
variables will be overwritten:
'use strict'
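const { spawn } = require('child_process')
// Sketch: pass an env option so the child only receives what we specify
// (the variable value here is illustrative)
const sp = spawn(process.execPath, ['-p', 'process.env'], {
  env: { SUBPROCESS_SPECIFIC: 'ham' }
})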
sp.stdout.pipe(process.stdout)
We've modified the code so that an env object is passed via the options
object, which contains a single environment variable
named SUBPROCESS_SPECIFIC. When executed, the parent process will
output the child process' environment variables object, and they'll only
contain any system-defined default child-process environment variables and
what we passed via the env option:
'use strict'
const { IS_CHILD } = process.env
if (IS_CHILD) {
console.log('Subprocess cwd:', process.cwd())
console.log('env', process.env)
} else {
const { parse } = require('path')
const { root } = parse(process.cwd())
const { spawn } = require('child_process')
const sp = spawn(process.execPath, [__filename], {
cwd: root,
env: {IS_CHILD: '1'}
})
sp.stdout.pipe(process.stdout)
}
In this example, we're executing the same file twice. Once as a parent
process and then once as a child process. We spawn the child process by
passing __filename, inside the arguments array passed to spawn. This
means the child process will run node with the path to the current file.
In the child process, IS_CHILD will be truthy so the if branch will print out
the child processes' current working directory and environment variables.
Since the parent process pipes the sp.stdout stream to
the process.stdout stream executing this code will print out the current
working directory and environment variables of the child process, that we set
via the configuration options:
The cwd and env options can be set for any of the child process methods
discussed in the prior section, but there are other options that can be set as
well. To learn more
see spawn options, spawnSync options, exec options and execSync options in
the Node.js Documentation.
Child STDIO
So far we've covered that the asynchronous child creation methods
(exec and spawn) return a ChildProcess instance which
has stdin, stdout and stderr streams representing the I/O of the
subprocess.
'use strict'
const { spawn } = require('child_process')
const sp = spawn(
process.execPath,
[
'-e',
`console.error('err output');
process.stdin.pipe(process.stdout)`
],
{ stdio: ['pipe', 'pipe', 'pipe'] }
)
sp.stdout.pipe(process.stdout)
sp.stderr.pipe(process.stdout)
sp.stdin.write('this input will become output\n')
sp.stdin.end()
The process we are spawning is the node binary with the -e flag set to
evaluate code which outputs 'err output' (plus a newline) to STDERR
using console.error and then pipes the child process STDIN to its STDOUT.
In the parent process we pipe from the child process' STDOUT to the parent
process' STDOUT. We also pipe from the child process' STDERR to the parent
process' STDOUT. Note this is not a mistake, we are deliberately piping from
child STDERR to parent STDOUT. The subprocess STDIN stream (sp.stdin) is
a writable stream since it's for input. We write some input to it and then
call sp.stdin.end() which ends the input stream, allowing the child process
to exit which in turn allows the parent process to exit.
'use strict'
const { spawn } = require('child_process')
const sp = spawn(
process.execPath,
[
'-e',
`console.error('err output');
process.stdin.pipe(process.stdout)`
],
{ stdio: ['pipe', 'inherit', 'pipe'] }
)
sp.stderr.pipe(process.stdout)
sp.stdin.write('this input will become output\n')
sp.stdin.end()
The stdio option can also be passed a stream directly. In our example, we're
still piping the child process STDERR to the parent process STDOUT.
Since process.stdout is a stream, we can
set stdio[2] to process.stdout to achieve the same effect:
'use strict'
const { spawn } = require('child_process')
const sp = spawn(
process.execPath,
[
'-e',
`console.error('err output');
process.stdin.pipe(process.stdout)`
],
{ stdio: ['pipe', 'inherit', process.stdout] }
)
In our case we passed the process.stdout stream via stdio but any
writable stream could be passed in this situation, for instance a file stream, a
network socket or an HTTP response.
Let's imagine we want to filter out the STDERR output of the child process
instead of writing it to the parent process.stdout stream we can
change stdio[2] to 'ignore'. As the name implies this will ignore output
from the STDERR of the child process:
'use strict'
const { spawn } = require('child_process')
const sp = spawn(
process.execPath,
[
'-e',
`console.error('err output');
process.stdin.pipe(process.stdout)`
],
{ stdio: ['pipe', 'inherit', 'ignore'] }
)
To send input to a child process created with spawn or exec we can call
the write method of the stdin stream on the returned ChildProcess instance. For
the spawnSync and execSync functions an input option can be used to achieve
the same:
'use strict'
const { spawnSync } = require('child_process')
spawnSync(
process.execPath,
[
'-e',
`console.error('err output');
process.stdin.pipe(process.stdout)`
],
{
input: 'this input will become output\n',
stdio: ['pipe', 'inherit', 'ignore']
}
)
This will create the same output as the previous example because we've also
set stdio[2] to 'ignore', thus STDERR output is ignored.
For the input option to work
for spawnSync and execSync the stdio[0] option has to be pipe, otherwise
the input option is ignored.
Assertions
An assertion checks a value for a given condition and throws if that condition
is not met. Assertions are the fundamental building block of unit and
integration testing. The core assert module exports a function that will
throw an AssertionError when the value passed to it is falsy (meaning that
the value can be coerced to false with !!val):
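For example:

const assert = require('assert')
assert(false) // throws an AssertionError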
If the value passed to assert is truthy then it will not throw. This is the key
behavior of any assertion, if the condition is not met the assertion will throw
an error. The error thrown is an instance of AssertionError (to learn more
see Class: assert.AssertionError).
Since the Node core assert module does not output anything for success
cases there is no assert.pass method as it would be behaviorally the same
as doing nothing.
There are third party libraries that provide alternative APIs and more
assertions, which we will explore briefly at the end of this section. However
this set of assertions (not the API itself but the actual assertion functionality
provided) tends to provide everything we need to write good tests. In fact,
the more esoteric the assertion the less useful it is long term. This is because
assertions provide a common language of expectations among developers.
So inventing or using more complex assertion abstractions that combine
basic level assertions reduces the communicability of test code among a
team of developers.
Generally when we check a value, we also want to check its type. Let's
imagine we're testing a function named add that takes two numbers and
adds them together. We can check that add(2, 2) is 4 with:
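A sketch of that check, where the require path is a stand-in for wherever the hypothetical add function is actually defined:

'use strict'
const assert = require('assert')
const add = require('./get-add-from-somewhere.js')

const result = add(2, 2)
assert(typeof result === 'number')
assert(result === 4)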
In this case, if add doesn't return a number the typeof check will throw an AssertionError, and if it returns a number other than 4 the equality check will throw.
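The two checks can be reduced to a single assertion with assert.strictEqual; a sketch, under the same assumption about where add is loaded from:

'use strict'
const assert = require('assert')
const add = require('./get-add-from-somewhere.js')

assert.strictEqual(add(2, 2), 4)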
Since assert.strictEqual checks both value and type using the triple equals operator (===), if add does not return 4 as a number an AssertionError will be thrown.
The assert module also exposes a strict object, where the methods that would otherwise be coercive apply strict equality instead (for instance assert.strict.equal behaves like assert.strictEqual), so the above code could also be written as:
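For instance (a sketch, under the same assumptions as above):

'use strict'
const assert = require('assert')
const add = require('./get-add-from-somewhere.js')

assert.strict.equal(add(2, 2), 4)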
It's worth noting that assert.equal and other non-strict (i.e. coercive)
assertion methods are deprecated, which means they may one day be
removed from Node core. Therefore if using the Node core assert module,
best practice would be always to use assert.strict rather than assert, or
at least always use the strict methods (e.g. assert.strictEqual).
Let's take a look at an equivalent example using the fluid API provided by
the expect library.
const expect = require('expect')
const add = require('./get-add-from-somewhere.js')
expect(add(2, 2)).toStrictEqual(4)
With the expect assertion library, the value that we are asserting against is
passed to the expect function, which returns an object with assertion
methods that we can call to validate that value. In this case, we
call toStrictEqual to apply a strict equality check. For a coercive equality
check we could use expect(add(2, 2)).toBe(4).
The difference between assert.deepEqual and assert.deepStrictEqual (and assert.strict.deepEqual) is that with assert.deepEqual the equality checks of primitive values (in this case the id property value and the name.first and name.second strings) are coercive, which means the following will also pass:
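A sketch of the kind of comparison being described (the object shape and the specific values are illustrative assumptions, not taken from the original):

'use strict'
const assert = require('assert')

const obj = {
  id: 1,
  name: { first: 'some', second: 'body' }
}

// passes: deepEqual coerces the string '1' to the number 1
// when comparing the id values
assert.deepEqual(obj, {
  id: '1',
  name: { first: 'some', second: 'body' }
})

// throws: deepStrictEqual (and assert.strict.deepEqual) does not coerce,
// so the string '1' is not equal to the number 1
assert.deepStrictEqual(obj, {
  id: '1',
  name: { first: 'some', second: 'body' }
})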
The error handling assertions (throws, ifError, rejects) are useful for
asserting that error situations occur for synchronous, callback-based and
promise-based APIs.
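A sketch of the synchronous case, assuming for illustration that add throws an Error with the message 'inputs must be numbers' when given non-number inputs (as the add.js implementation later in this chapter does):

'use strict'
const assert = require('assert')
const add = require('./get-add-from-somewhere.js')

// the add invocation is wrapped in a function so assert.throws
// can call it and observe whether it throws
assert.throws(() => add('5', '5'), Error('inputs must be numbers'))
assert.doesNotThrow(() => add(5, 5))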
Notice that the invocation of add is wrapped inside another function. This is
because the assert.throws and assert.doesNotThrow methods have to be
passed a function, which they can then wrap and call to see if a throw occurs
or not. When executed the above code will pass, which is to say, no output
will occur and the process will exit.
For callback-based APIs, assert.ifError will only pass if the value
passed to it is either null or undefined. Typically the err param is passed to
it, to ensure no errors occurred:
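A sketch for the callback case, assuming a hypothetical callback-based req function like the one implemented later in this section (the http://example.com URL is an illustrative stand-in for any non-error URL):

'use strict'
const assert = require('assert')
const req = require('./req') // hypothetical path

req('http://example.com', (err) => {
  // passes only when err is null or undefined
  assert.ifError(err)
})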
Finally for this section, let's consider asserting error or success states on a
promise-based API:
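A sketch for the promise case, again assuming a hypothetical promise-based req function like the one implemented later in this section:

'use strict'
const assert = require('assert')
const req = require('./req-prom') // hypothetical path

async function run () {
  // resolves if the passed promise rejects with a matching error
  await assert.rejects(req('http://error.com'), Error('network error'))
  // resolves if the passed promise fulfills without rejecting
  await assert.doesNotReject(req('http://example.com'))
}

run()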
Notice that in all three cases we didn't actually check output. In the next
section, we'll use different test runners, with their own assertion APIs to fully
test the APIs we defined here.
Test Harnesses
While assertions on their own are a powerful tool, if one of the asserted
values fails to meet a condition an AssertionError is thrown, which causes
the process to crash. This means the results of any assertions after that point
are unknown, but any additional assertion failures might be important
information.
Gathering groups of assertions together, keeping failures isolated so that one failing assertion doesn't prevent the rest from being reported, is what test harnesses do. Broadly speaking we can group test harnesses into two categories: pure libraries vs framework environments.
Pure Library
Pure library test harnesses provide a module, which is loaded into a file and
then used to group tests together. As we will see, pure libraries can be
executed directly with Node like any other code. This has the benefit of
easier debuggability and a shallower learning curve. We'll be looking at tap.
Framework Environment
A framework environment such as jest provides its own test executable that sets up the environment test files run in, injecting functions like test and expect as implicit globals rather than exposing them as modules to load; we'll see this in practice when we convert our tests to jest later in this section. First, we need some code to test: the following three files, add.js, req.js and req-prom.js, implement a synchronous API, a callback-based faux network request and a promise-based equivalent respectively.
'use strict'
module.exports = (a, b) => {
  if (typeof a !== 'number' || typeof b !== 'number') {
    throw Error('inputs must be numbers')
  }
  return a + b
}
'use strict'
module.exports = (url, cb) => {
  setTimeout(() => {
    if (url === 'http://error.com') cb(Error('network error'))
    else cb(null, Buffer.from('some data'))
  }, 300)
}
'use strict'
const { setTimeout: timeout } = require('timers/promises')
module.exports = async (url) => {
  await timeout(300)
  if (url === 'http://error.com') throw Error('network error')
  return Buffer.from('some data')
}
In the folder with these files, if we run npm init -y, we'll be able to quickly
generate a package.json file which we'll need for installing test libraries:
We'll write tests for these three files with the tap library and later on we'll
convert over to the jest library for comparison.
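The test file itself (test/add.test.js) is not reproduced here; based on the description that follows and the tap API, a sketch of it might look like this:

'use strict'
const { test } = require('tap')
const add = require('../add')

test('throw when inputs are not numbers', async ({ throws }) => {
  throws(() => add('5', '5'), Error('inputs must be numbers'))
  throws(() => add(5, '5'), Error('inputs must be numbers'))
  throws(() => add('5', 5), Error('inputs must be numbers'))
  throws(() => add({}, null), Error('inputs must be numbers'))
})

test('adds two numbers', async ({ equal }) => {
  equal(add(5, 5), 10)
  equal(add(-5, 5), 0)
})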
On the first line the tap testing library is required, on the second we load
the add.js file from the directory above the test folder. We deconstruct
the test function from the tap library—this test function provides the ability
to describe and group a set of assertions together. We call the test function
twice, so we have two groups of assertions: one for testing input validation
and the other for testing expected output. The first argument passed
to test is a string describing that group of assertions, the second argument
is an async function. We use an async function because it returns a promise
and the test function will use the promise returned from the async function
to determine when the test has finished for that group of assertions. So when
the returned promise resolves, the test is done. Since we don't do anything
asynchronous, the promise essentially resolves at the end of the function,
which is perfect for our purposes here.
See Node Tap's Documentation to learn more about the tap library's assertions and to see where they differ from Node's assert module functions.
We've run some tests for a synchronous API, so now let's test a callback-
based API. In a new file, test/req.test.js let's write the following:
'use strict'
const { test } = require('tap')
const req = require('../req')

test('handles network errors', ({ strictSame, end }) => {
  req('http://error.com', (err) => {
    strictSame(err, Error('network error'))
    end()
  })
})
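The description below also refers to a second test group that checks the faux response data; it isn't shown above, but a sketch of it might be (the http://example.com URL is an illustrative assumption, any non-error URL would do):

test('responds with data', ({ error, ok, strictSame, end }) => {
  req('http://example.com', (err, data) => {
    error(err)
    ok(Buffer.isBuffer(data))
    strictSame(data, Buffer.from('some data'))
    end()
  })
})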
Again, we use the test function from tap to group assertions for different
scenarios. Here we're testing our faux network error scenario and then in the
second test group we're testing faux output. This time we don't use
an async function. Since we're using callbacks, it's much easier to call a final
callback to signify to the test function that we have finished testing.
In tap this comes in the form of the end function which is supplied via the
same assertions object passed to each function.
We can see that in both cases the end function is called within the callback
function supplied to the req function. If we don't call end when appropriate
the test will fail with a timeout error, but if we tried to use an async function
(without creating a promise that is in some way tied to the callback
mechanism) the returned promise would resolve before the callbacks
complete and so assertions would be attempting to run after that test group
has finished.
'use strict'
const { test } = require('tap')
const req = require('../req-prom')
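Only the top of test/req-prom.test.js is shown above; a sketch of the test groups consistent with the description that follows (again assuming http://example.com as the non-error URL) could be:

test('handles network errors', async ({ rejects }) => {
  await rejects(req('http://error.com'), Error('network error'))
})

test('responds with data', async ({ ok, strictSame }) => {
  const data = await req('http://example.com')
  ok(Buffer.isBuffer(data))
  strictSame(data, Buffer.from('some data'))
})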
We're using async functions again because we're dealing with promises. The rejects assertion returns a promise (the resolution of which is dependent on the promise passed to it), so we are sure to await that promise. This makes sure that the async function passed to test does not resolve (thus ending the test) before the promise passed to rejects has rejected.
In the second test group we await the result of calling req and then apply
the same assertions to the result as we do in the callback-based tests.
There's no need for an error equivalent here, because if the promise
unexpectedly rejects, that will propagate to the async function passed to
the test function and the test harness will register that as an assertion
failure.
We can now run all tests again with the tap executable:
jest Framework: test/add.test.js
To round this section off we will convert the tests to use jest.
'use strict'
const add = require('../add')

test('throw when inputs are not numbers', async () => {
  expect(() => add('5', '5')).toThrowError(
    Error('inputs must be numbers')
  )
  expect(() => add(5, '5')).toThrowError(
    Error('inputs must be numbers')
  )
  expect(() => add('5', 5)).toThrowError(
    Error('inputs must be numbers')
  )
  expect(() => add({}, null)).toThrowError(
    Error('inputs must be numbers')
  )
})

test('adds two numbers', async () => {
  expect(add(5, 5)).toStrictEqual(10)
  expect(add(-5, 5)).toStrictEqual(0)
})
Notice that we still have a test function but it is not loaded from any
module. This function is made available implicitly by jest at execution time.
The same applies to expect, which we discussed as a module in the previous
section. However here it is injected as an implicitly available function, just
like the test function. This means that, unlike tap, we cannot run our tests
directly with node:
Instead we always have to use the jest executable to run tests:
The ability to run individual tests with node directly can help with
debuggability because there is nothing in between the developer and the
code. By default jest does not output code coverage but can be passed
the --coverage flag to do so.
'use strict'
const req = require('../req')
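Only the top of the converted callback-based test file is shown above; the rest of the conversion might look like the following sketch, using jest's done callback to signal completion of callback-based tests:

test('handles network errors', (done) => {
  req('http://error.com', (err) => {
    expect(err).toStrictEqual(Error('network error'))
    done()
  })
})

test('responds with data', (done) => {
  req('http://example.com', (err, data) => {
    expect(err).toBeNull()
    expect(Buffer.isBuffer(data)).toBeTruthy()
    expect(data).toStrictEqual(Buffer.from('some data'))
    done()
  })
})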
'use strict'
const req = require('../req-prom')
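Similarly, a sketch of the rest of the converted promise-based test file, using jest's rejects modifier for the error case:

test('handles network errors', async () => {
  await expect(req('http://error.com'))
    .rejects.toStrictEqual(Error('network error'))
})

test('responds with data', async () => {
  const data = await req('http://example.com')
  expect(Buffer.isBuffer(data)).toBeTruthy()
  expect(data).toStrictEqual(Buffer.from('some data'))
})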
Now that all tests are converted we can run jest without any file names and all the files in the test folder will be executed with jest:
Configuring package.json
A final key piece when writing tests for a module, application or service is
making absolutely certain that the test field of the package.json file for
that project runs the correct command.
{
  "name": "my-project",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "license": "ISC"
}
In the middle of the above JSON, we can see a "scripts" field. This contains
a JSON object, which contains a "test" field. By default the "test" field is
set up to generate an exit code of 1, to indicate failure. This signals that not having tests, or not configuring the "test" field to a command that will run tests, is itself treated as a test failure.
Running the npm test command in the same folder as the package.json will
execute the shell command in the "test" field.
If npm test was executed against this package.json the following output
would occur:
In the last section our tests were converted to jest so let's modify
the "test" field of package.json like so:
{
  "name": "my-project",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "jest --coverage"
  },
  "keywords": [],
  "author": "",
  "license": "ISC"
}
If we were to convert our tests back to tap, the package.json test field
could then be:
{
  "name": "my-project",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "tap"
  },
  "keywords": [],
  "author": "",
  "license": "ISC"
}
Once the tests are converted back to their tap versions, running npm test with this package.json should produce output similar to the following: