
Engineering xxx (xxxx) xxx

Contents lists available at ScienceDirect

Engineering
journal homepage: www.elsevier.com/locate/eng

Research
AI Energizes Process Manufacturing—Perspective

Machine Learning in Chemical Engineering: Strengths, Weaknesses, Opportunities, and Threats

Maarten R. Dobbelaere a, Pieter P. Plehiers a, Ruben Van de Vijver a, Christian V. Stevens b, Kevin M. Van Geem a,*

a Laboratory for Chemical Technology, Department of Materials, Textiles and Chemical Engineering, Ghent University, Ghent 9052, Belgium
b SynBioC Research Group, Department of Green Chemistry and Technology, Faculty of Bioscience Engineering, Ghent University, Ghent 9000, Belgium

* Corresponding author. E-mail address: Kevin.VanGeem@UGent.be (K.M. Van Geem).

Article info
Article history: Received 16 October 2020; Revised 16 January 2021; Accepted 22 March 2021; Available online 29 July 2021
Keywords: Artificial intelligence; Machine learning; Reaction engineering; Process engineering

Abstract
Chemical engineers rely on models for design, research, and daily decision-making, often with potentially large financial and safety implications. Previous efforts a few decades ago to combine artificial intelligence and chemical engineering for modeling were unable to fulfill the expectations. In the last five years, the increasing availability of data and computational resources has led to a resurgence in machine learning-based research. Many recent efforts have facilitated the roll-out of machine learning techniques in the research field by developing large databases, benchmarks, and representations for chemical applications and new machine learning frameworks. Machine learning has significant advantages over traditional modeling techniques, including flexibility, accuracy, and execution speed. These strengths also come with weaknesses, such as the lack of interpretability of these black-box models. The greatest opportunities involve using machine learning in time-limited applications such as real-time optimization and planning that require high accuracy and that can build on models with a self-learning ability to recognize patterns, learn from data, and become more intelligent over time. The greatest threat in artificial intelligence research today is inappropriate use because most chemical engineers have had limited training in computer science and data analysis. Nevertheless, machine learning will definitely become a trustworthy element in the modeling toolbox of chemical engineers.

© 2021 THE AUTHORS. Published by Elsevier LTD on behalf of Chinese Academy of Engineering and Higher Education Press Limited Company. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

https://doi.org/10.1016/j.eng.2021.03.019

1. Introduction

In 130 years of chemical engineering, mathematical modeling has been invaluable to engineers for understanding and designing chemical processes. Octave Levenspiel even stated that modeling stands out as the primary development in chemical engineering [1]. Today, in a fast-moving world, there are more challenges than ever. The ability to predict the outcomes of certain events is necessary, regardless of whether such events are related to the discovery and synthesis of active pharmaceutical ingredients for new diseases or to improvements in process efficiencies to meet stricter environmental legislation. These events range from the reaction rate of a surface reaction or the selectivity of a reaction in a reactor, to the control of the heat supply to that reactor. Predictions can be made using theoretical models, which have been constructed for centuries. The Navier–Stokes equations [2,3], which describe viscous fluid behavior, are one example of such a theoretical model. However, many of these models cannot be solved analytically for realistic systems and require a considerable amount of computational power to solve numerically. This drawback has ensured that most engineers first use simple models to describe reality. An important historical—yet still relevant—example is Prandtl's boundary layer model [4]. In computational chemistry, scientists and engineers are willing to give up some accuracy in favor of time. This willingness explains the popularity of density functional theory, in comparison with higher-level-of-theory models. However, in many situations, higher accuracy is desired.

Decades of modeling, simulations, and experiments have provided the chemical engineering community with a massive amount of data, which adds the option of making predictions from experience as an extra modeling toolkit. Machine learning models are statistical and mathematical models that can "learn" from experience and discover patterns in data without the need for explicit, rule-based programming. As a field of study, machine learning is a subset of artificial intelligence (AI). AI is the ability of machines to perform tasks that are generally linked to the behavior of intelligent beings, such as humans. As shown in Fig. 1, this field is not particularly new. The term "artificial intelligence" was coined at Dartmouth College, USA in 1956, at a summer workshop for mathematicians who aimed at developing more cognizant machines. From that point on, it took more than a decade before the first attempts were made to apply AI in chemical engineering [5]. In the 1980s, greater efforts were made in the field with the use of rule-based expert systems, which are considered to be the simplest forms of AI. By that time, the field of machine learning had started to grow, but in the chemical engineering community, with some exceptions, a lag of about 10 years was experienced in the growth of machine learning. A sudden rise in publications on AI applications in chemical engineering in the 1990s can be observed, with the adoption of clustering algorithms, genetic algorithms, and—most successfully—artificial neural networks (ANNs). Nevertheless, the trend did not persist. Venkatasubramanian [6] names the lack of powerful computing and the difficult task of creating the algorithms as possible causes for this loss in interest.

The past decade marked a breakthrough in deep learning, a subset of machine learning that constructs ANNs to mimic the human brain. As mentioned above, ANNs gained popularity among chemical engineers in the 1990s; however, the difference of the deep learning era is that deep learning provides the computational means to train neural networks with multiple layers—the so-called deep neural networks. These new developments triggered chemical engineers, as reflected by an exponential rise in publications on the topic. In the past, AI techniques could never become a standard tool in chemical engineering; thus, it can be asked whether this is finally the moment. In this perspective article, we will first give an overview of the three major links in machine learning today, applied to chemical engineering. In what follows, the growing potential of machine learning in chemical engineering will be critically discussed; we will examine the pros and cons and list possible reasons for why machine learning in chemical engineering will remain "hot" or end up as a "not."

2. Machine learning ABCs

2.1. The "A" in machine learning ABCs: Data

A machine learning approach consists of three important links, as illustrated in Fig. 2: data, representations, and models. The first link in a machine learning approach is the data that is used to train the model. As will be discussed later, the data used also proves to be the weakest link in the machine learning process. Virtually any dataset containing results from experiments, first-principles calculations, or complex simulation models can be used to train a model. However, because it is expensive to gather large amounts of accurate data, it is customary to make use of "big data" approaches—using large databases from various existing sources. Due to the cost of real experiments, these large quantities of data are usually obtained via fast simulations or text mining from patents and published work. The increased digitalization of research provides the scientific community with a plethora of open-source and commercial databases. Examples of commonly used sources of chemical information are Reaxys [7], SciFinder [8], and ChemSpace [9] for reaction chemistry and properties; GDB-17 [10] for small drug-like molecules; and National Institute of Standards and Technology (NIST) [11] and International Union of Pure and Applied Chemistry (IUPAC) [12] databases for molecular properties such as solubility. In addition, several benchmarking datasets have been created to enable comparison between different machine learning models. Examples of these benchmarks are QM9 and Alchemy, for quantum chemical properties [13]; and ESOL [14] and FreeSolv [15], for solubilities. Before using any dataset for machine learning-based modeling, several steps should be undertaken to ensure that the used data is of high enough quality. The general aspect of ensuring data quality—from its generation to its storage—is known as data curation. More details about the necessity and consequences of data curation are discussed further on.

Fig. 1. Timeline of artificial intelligence, machine learning, and deep learning. The evolution of publications about AI in chemical engineering shows that a rise in publications is followed by a phase of disinterest. Currently, AI in chemical engineering is once again in a "hot" phase, and it is unclear whether or not the curve will soon flatten out.


Fig. 2. The three major links in machine learning for chemical engineering; every part has an impact on the eventual prediction performance and should be handled carefully.

Several differences concerning data usage exist between machine learning—and, more specifically, deep learning methods—and traditional modeling. First, ANNs learn from data and train themselves, although doing so requires large amounts of data. Therefore, training datasets generally contain tens to hundreds of thousands of data points. Second, the dataset is split into three instead of two sets: a training, validation, and test set. Both the training and validation sets are used in the training phase, while only the data in the training set is used for fitting. The validation set is an independent dataset that provides an unbiased evaluation of the model fit during the training phase. The test set evaluates the final model fit with unseen data and is generally the main indicator of the model quality.
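A minimal sketch of such a three-way split in Python (scikit-learn assumed; the random arrays stand in for any featurized dataset, and the 80/10/10 ratio is an arbitrary illustrative choice):

```python
# Split a dataset into training, validation, and test sets by calling
# scikit-learn's train_test_split twice.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 8))                  # placeholder features
y = X @ rng.normal(size=8)                        # placeholder target

# First carve off the test set, then split the remainder.
X_tmp, X_test, y_tmp, y_test = train_test_split(X, y, test_size=0.1, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_tmp, y_tmp, test_size=1/9, random_state=0)

print(len(X_train), len(X_val), len(X_test))      # 8000 1000 1000
```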
2.2. The "B" in machine learning ABCs: Representation

A second important link in a machine learning method is how the data is represented in the model. Even when the data is already in numerical format, the selection of the variables or features that will make up the model input can have a significant impact on the model performance. This process is known as feature selection and has been the topic of several studies [16–19]. Limiting the number of selected features may reduce the computational cost of both training and executing the model, while improving the overall accuracy. This feature-selection process is of lesser importance in so-called deep learning methods, which are assumed to internally select those features that are considered to be important [20]. Then, an input layer that consists of basic process parameters (e.g., pressure, temperature, residence time), feed characterizations (e.g., distillation curves, feed compositions), or catalyst properties (e.g., surface area, calcination time) is often sufficient [21–27]. However, the task of representing the data becomes far more challenging in the case of non-numerical data, such as molecules and reactions.

Chemical engineering tasks often involve molecules and/or chemical reactions. Creating suitable numerical representations of these data types is a developing field in itself. In computer applications, the molecular constitution is typically represented by a line-based identifier, such as the simplified molecular-input line-entry system (SMILES) [28] or the (IUPAC) international chemical identifiers (InChIs) [29], or as three-dimensional (3D) coordinates. Recently, self-referencing embedded strings (SELFIES) [30] have been developed as a molecular string representation designed for machine learning applications. The molecular information is translated into a feature vector or tensor that is used as input for a deep neural network or another machine learning model. The first way to represent a molecule is by using a (set of) well-chosen molecular descriptor(s), such as the molecular weight, dipole moment, or dielectric constant [31–33]. Another way to generate a molecular feature vector is by starting from the 3D geometry. Coulomb matrices [34], bags of bonds [35], and histograms of distances, angles, and dihedrals [36] are a few examples of geometry-based representations. However, 3D coordinates or calculated properties are generally unavailable in many applications. In such cases, the representation can be created starting from a molecular graph, resulting in so-called topology-based representations.
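Of the geometry-based representations just mentioned, the Coulomb matrix [34] is the simplest to write down; a sketch follows (the water geometry is an assumed example input, and the original definition works in atomic units):

```python
# Coulomb matrix: 0.5 * Z_i^2.4 on the diagonal and
# Z_i * Z_j / |R_i - R_j| off the diagonal.
import numpy as np

def coulomb_matrix(Z, R):
    Z = np.asarray(Z, dtype=float)
    R = np.asarray(R, dtype=float)
    dist = np.linalg.norm(R[:, None, :] - R[None, :, :], axis=-1)
    with np.errstate(divide="ignore"):
        M = np.outer(Z, Z) / dist                 # off-diagonal terms
    np.fill_diagonal(M, 0.5 * Z ** 2.4)           # diagonal terms
    return M

Z = [8, 1, 1]                                     # water: O, H, H
R = [[0.000, 0.000, 0.117],
     [0.000, 0.757, -0.469],
     [0.000, -0.757, -0.469]]
print(coulomb_matrix(Z, R).round(2))
```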
In topology-based representations, only a line-based identifier is available. Encoders exist that directly translate the line-based identifier into a representation with techniques from natural language processing [37–41], but usually the line-based identifier is transformed into a feature vector in a similar fashion to geometry-based representations [42–60]. This is done by adding simple atom and bond features to the molecular graph and then transmitting the information iteratively between atoms and bonds. Circular fingerprints [42–46] based on the Morgan algorithm [61], such as the extended-connectivity fingerprint [62], were among the first molecular representations for machine learning applications. These fingerprints are so-called fixed molecular representations because they do not change during the training of the machine learning model. They remain popular in drug design for rapidly predicting the physical, chemical, and biological properties of candidate drugs [63]. Because a fixed representation vector represents a molecule by the same vector in every prediction task, this type of input layer seems to conflict with the definition of a deep neural network, which is assumed to learn the important features [64]. There is a growing tendency to focus on learning how to represent a molecule [47,52] instead of on human-engineering the feature vector, as it is assumed that better capturing of the features will lead to higher accuracy, with less data and at a lower computational cost [53,58].
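The fixed-fingerprint route just described takes only a few lines in practice (a sketch assuming the open-source RDKit package; the molecule and settings are arbitrary):

```python
# Extended-connectivity (Morgan) fingerprint of ethanol as a bit vector.
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem

mol = Chem.MolFromSmiles("CCO")                   # ethanol as a SMILES string
fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius=2, nBits=2048)
x = np.array(fp)                                  # fixed 2048-bit input vector
print(x.shape, int(x.sum()))
```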

Learned molecular representations are created as part of the prediction model. Starting from several initial molecular features—such as the heavy atoms, bond types, and ring features—a molecular representation is created that is updated during training. This choice also indicates that a molecule has different representations depending on the prediction task. An extensive variety of learned topology-based representations [47–58] can be described using the message-passing neural network framework reviewed by Gilmer et al. [59]. The weighted transfer of atom and bond information throughout the molecular graph is characteristic of message-passing neural networks. Many different representations exist, ranging in complexity, but it is important to note that a single representation that works for all kinds of molecular properties has not (yet) been developed [65]. For a more detailed overview of the state of the art in representing molecules, readers are referred to the review by David et al. [60].

Chemical reactions are more complex data types than molecules. Similar to line-based molecular identifiers, reactions can be identified by reaction SMILES [66] and reaction InChI (RInChI) [67], whereas SMIRKS [66] identify reaction mechanisms. As for molecules, chemical reactions should also be vectorized in order to be useful in machine learning models. The most straightforward method is to start from the molecular descriptors (e.g., fingerprints) of the reagents and sum [68], subtract [50,69], or concatenate [70–72] them. Another approach is to learn a reaction representation based on the atoms and bonds that actively take part in the reaction [73]. Reactions can also be kept as text (typically InChI) and, with a neural machine translation, the organic reaction product is then considered to be a translation of the reactants [58,74–78].
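The descriptor-based route can be sketched directly: the reaction vector below is the difference between summed product and summed reactant fingerprints, in the spirit of the subtraction approach [50,69] (RDKit assumed; the esterification SMILES are an illustrative example):

```python
# Difference fingerprint of a reaction: fp(products) - fp(reactants).
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem

def fingerprint(smiles, n_bits=2048):
    mol = Chem.MolFromSmiles(smiles)
    return np.array(AllChem.GetMorganFingerprintAsBitVect(mol, radius=2, nBits=n_bits), dtype=int)

reactants = ["CC(=O)O", "CCO"]                    # acetic acid + ethanol
products = ["CC(=O)OCC", "O"]                     # ethyl acetate + water

rxn_vector = sum(fingerprint(s) for s in products) - sum(fingerprint(s) for s in reactants)
print(rxn_vector.shape)                           # one vector per reaction
```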
2.3. The "C" in machine learning ABCs: Model

The final prerequisite for a machine learning method is a modeling strategy. There is a wide variety of machine learning models to choose from. Models can be categorized in different ways, either by purpose (classification or regression) or by learning methodology (unsupervised, supervised, active, or transfer learning). Generally speaking, the term "machine learning" can be applied to any method in which correlations within datasets are implicitly modeled [79,80]. Therefore, many techniques that are currently referred to as machine learning methods were in use long before they were termed machine learning. Two such examples are Gaussian mixture modeling and principal component analysis (PCA), which originated in, respectively, the late 1800s [81] and the early 1900s [82,83]. Both examples are now regarded as unsupervised machine learning algorithms. Other similar unsupervised clustering methods are t-distributed stochastic neighbor embedding (t-SNE) [84] and density-based spatial clustering of applications with noise (DBSCAN) [85]. Fig. 3 shows the difference between unsupervised and supervised learning techniques, with a non-exhaustive list of useful algorithms for a specific task.

In unsupervised learning, the algorithm does not need any "solutions" or labels to learn; it will discover patterns by itself. Unsupervised learning techniques have been used for various purposes in chemical engineering. Palkovits and Palkovits [86] used the k-means algorithm [87] for clustering catalysts based on their features and t-SNE for the visualization of high-dimensional catalyst representations. Not only used for catalysis, t-SNE is the preferred method for visualizing high-dimensional data; it has also been used in the context of fault diagnosis in chemical processes [88,89] and for predicting reaction conditions [69,90]. PCA is another algorithm for reducing dimensionality and has been used multiple times by chemical engineers for determining the features that account for the most variance in the training set [91–97]. In addition, PCA is used for outlier detection [93,98]. Other algorithms used to detect anomalies include DBSCAN and long short-term memory (LSTM) networks [99,100]. Interested readers are referred to Géron's book [101] for a further introduction to machine learning algorithms.
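A minimal sketch of such an unsupervised workflow, reducing a feature table with PCA and then clustering with k-means (scikit-learn assumed; the random data stands in for, e.g., a table of catalyst descriptors):

```python
# PCA for dimensionality reduction followed by k-means clustering.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=m, size=(100, 20)) for m in (0.0, 3.0, 6.0)])

X_2d = PCA(n_components=2).fit_transform(X)       # directions of largest variance
labels = KMeans(n_clusters=3, n_init=10, random_state=1).fit_predict(X_2d)
print(np.bincount(labels))                        # cluster sizes, no labels needed
```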
When the dataset is labeled—that is, when the correct classification of each data point is known—supervised classification methods such as decision trees (and, by extension, random forests) can be used [102,103]. Support vector machines are another possible supervised classification method [104]. Although support vector machines are commonly used for classification purposes, extensions have been made to allow regression via support vector machines as well. Regression problems require supervised or active learning methods, although, in principle, any supervised learning method can be incorporated into an active learning approach. ANNs, and all their possible variations [105–113], are the method that is most commonly associated with machine learning. Depending on the application, one might choose feed-forward ANNs (for feature-based classification or regression), convolutional neural networks (for image processing), or recurrent neural networks (for anomaly detection). A chemical engineer might encounter convolutional neural networks used for representing molecules (see Section 2.2) [42–60] and ANNs [32,33,47,91,114–117], support vector machines [32], or kernel ridge regression [36,118] for predicting the properties of the representations. ANNs have been applied as a black-box modeling tool for numerous applications in catalysis [23], chemical process control [119], and chemical process optimization [120]. A popular algorithm for classifying data points when the labels are known is k-nearest neighbors, which has been used, for example, for chemical process monitoring [121,122] and clustering of catalysts [86,123,124].

Fig. 3. Overview of unsupervised and supervised machine learning algorithms; a non-exhaustive list of useful algorithms is included. GMM: Gaussian mixture modeling; LSTM: long short-term memory; t-SNE: t-distributed stochastic neighbor embedding.
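The supervised methods named above share a common fit/predict interface, which makes them easy to compare on one labeled dataset; a sketch with scikit-learn (its built-in toy data stands in for real process measurements):

```python
# Four of the supervised classifiers named above on the same data.
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for model in (DecisionTreeClassifier(random_state=0),
              RandomForestClassifier(random_state=0),
              SVC(),
              KNeighborsClassifier()):
    score = model.fit(X_train, y_train).score(X_test, y_test)
    print(f"{type(model).__name__}: {score:.3f}")
```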

3. Strengths

In this and the following sections, we give a detailed overview of the strengths, weaknesses, opportunities, and threats in the use of machine learning for chemical engineers. Fig. 4 summarizes what is described in the next sections.

Machine learning techniques have gained popularity in chemistry and chemical engineering for revealing patterns in data that human scientists are unable to discover. In contrast to physical models, which rely explicitly on physical equations (resulting from discovered patterns), machine learning models are not specifically programmed to solve a certain problem. For classification problems, this implies that not a single explicitly defined decision function must be programmed. For regression problems, this implies that no detailed model equations must be derived or parametrized [80]. These advantages allow efficient upscaling to large systems and datasets without the need for extensive computational resources.

Fig. 4. Strengths, weaknesses, opportunities, and threats in using machine learning as a modeling tool in chemical engineering.

An example is the current boom in predicting quantum chemical properties using machine learning [32,33,35–37,39,40,47,49,50,52,55,65,68,71,73,115]. The usual ab initio methods often require hours or days to calculate the properties of a single molecule. Well-trained machine learning models can make accurate predictions in a fraction of a second. Of course, other fast techniques that can predict accurately have already been developed, but they are limited in application range compared with machine learning models [125]. The inability to extrapolate is the major weakness of machine learning, but the application range can be extended quite easily by simply adding new data points. Active learning [126,127] makes it possible to expand the range with a minimal amount of new data, which is ideal for cases in which labeling is expensive (i.e., finding the true values of data points), such as quantum chemical calculations [116] or chemical experiments [72,128,129]. Furthermore, existing machine learning models, such as ChemProp [47] and SchNet [130,131], are ready to use and do not require experience. Machine learning in general has become very accessible with packages such as scikit-learn [132] and TensorFlow [133], and frameworks like Keras [134] (now part of TensorFlow [133]) or PyTorch [135], which restrict the training of a deep learning model to just a few lines of code. Such packages and frameworks give scientists the opportunity to shift their focus to the physical meaning of their research instead of spending precious time on developing high-level computer models.
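The "few lines of code" point can be made literal; a sketch of a small feed-forward network in Keras (layer sizes and the synthetic data are arbitrary placeholders):

```python
# A deep feed-forward regression model in a handful of Keras lines.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 6)).astype("float32")  # e.g., process conditions
y = (X ** 2).sum(axis=1, keepdims=True)           # stand-in target property

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
```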
4. Weaknesses

One of the main weaknesses of machine learning approaches is their black-box nature. Given a certain input, the approaches provide an output. This situation is illustrated by Fig. 5. Based on the statistical performance of the model on a test dataset, certain statements can be made about the accuracy and reliability of the generated output. Detailed analysis of the model hyperparameters (e.g., the number of nodes in an ANN) can be tedious, but can provide some insight into the correlations that have been learned by the model. However, extracting physically meaningful explanations for certain behaviors is infeasible. Hence, regardless of their speed and accuracy, machine learning models are a poor modeling choice for explanatory studies.

This lack of interpretability contributes to the difficulty of designing a proper machine learning model. As in any model, a machine learning model can overfit or underfit the data, with the proper model being situated somewhere in between. The risk of overfitting is typically much greater than the risk of underfitting for machine learning models, and depends on the quality and quantity of the training data, and on the complexity of the model. Overfitting is an intrinsic property of the model structure and does not depend on the actual values of the hyperparameters—it can be compared to fitting a (noisy) linear dataset with a polynomial of very high order. In deep learning, overfitting usually manifests itself in the form of overtraining, which arises when the model is shown the same data too many times. This results in the model memorizing noise instead of capturing general patterns. Overtraining can be identified by comparing the model performance on the training data with its performance on the validation and test datasets. If the training performance is much better than the validation performance, the model may be overtrained. Finding the right number of training epochs is often a difficult exercise. In order to avoid overfitting, a machine learning model requires a stopping criterion, as in other optimization problems. In traditional modeling, where models typically involve at least some form of simplification with respect to reality, this stopping criterion is typically based on the change in performance on the training dataset, as achieving high accuracy on the training data is the main challenge due to the simplifications. Achieving accuracy on the training dataset is typically not the issue for machine learning models; rather, the challenge mainly lies in achieving high accuracy on data the model was not directly trained on. Therefore, the stopping criterion should be based on the performance of the model on "unseen" data—the so-called validation dataset. For rigorously testing the optimized model, a completely independent dataset—the test dataset—is required, as is also common practice in traditional modeling approaches.
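In the deep learning frameworks mentioned earlier, this validation-based criterion is usually available as early stopping; a sketch in Keras (data and settings are placeholders):

```python
# Stop training when the validation loss stops improving, and restore
# the weights of the best epoch instead of training to a fixed count.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(3)
X = rng.normal(size=(2000, 10)).astype("float32")
y = (X.sum(axis=1, keepdims=True) + rng.normal(scale=0.1, size=(2000, 1))).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=10,
                                        restore_best_weights=True)
model.fit(X, y, validation_split=0.2, epochs=500, callbacks=[stop], verbose=0)
```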

Fig. 5. Unraveling the results from black-box models. A poor result is typically related to the training set used. When testing outside of the application range, a warning signal
should be raised. Good results require validation to understand what the model learns.

A final—but often critical—weakness in machine learning approaches is the data itself that was used. If there are too many systematic errors in the dataset, the network will make systematic errors itself, in what is known as the "garbage in–garbage out" (GIGO) principle [136]. Some forms or sources of error can be identified relatively easily, while others—once made—are much harder to find. As in every statistical method, outliers may be present. A model trained on a small dataset is more affected by outliers than one trained on a large dataset. This is why not only quality, but also quantity matters in machine learning. One possible solution to systematic errors is to manually remove these points from the dataset; it is also possible to use algorithms for anomaly detection, such as PCA [69,92], t-SNE [137,138], DBSCAN [139,140], or recurrent neural networks (LSTM networks) [111,141,142]. Recently, self-learning unsupervised neural network-based methods for anomaly detection [143] have been developed [144–146]. Next to simple outliers, there is always the possibility that the data points are actually wrong. Such data points might be one sample from an experiment in which a measurement error was made, or from a whole set of experiments that were conducted incorrectly. An example could be the results from a chemical analysis in which the apparatus was not calibrated. Training on a set of systematically false data is especially dangerous since the model will perceive the false trend as truth. Identifying such cases is possible through diligent scrutiny of the published data. This example illustrates the importance of data curation, which ensures that the data used is accurate, reliable, and reproducible.
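One of the listed algorithms in a minimal outlier screen (a sketch with scikit-learn; the data and the eps setting are illustrative):

```python
# DBSCAN-based screening: points assigned to no cluster (label -1)
# are flagged for inspection before the model is trained.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(4)
X = rng.normal(size=(500, 5))
X[:3] += 15.0                                     # three gross outliers

labels = DBSCAN(eps=2.0, min_samples=5).fit_predict(StandardScaler().fit_transform(X))
X_clean = X[labels != -1]
print(f"flagged {int((labels == -1).sum())} suspected outliers")
```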
Obviously, data can only be curated when it is available. Although decades of modeling, simulating, and experimenting have provided the chemical engineering community with a massive amount of data, this data is often stored in research laboratories or companies, and is hence not readily available. Even in a case where data is accessible, such as from an in-house database, the available data might not be completely useful for machine learning. The same applies to data extracted from research papers or patents using text-mining techniques [147]. The reason such data might not be useful is because, in general, only successful experiments are reported, while failed experiments remain unpublished [148]. Furthermore, experiments or operation conditions that seem to be nonsense to a human chemical engineer are not performed, because the engineer has insight and scientific knowledge. Machine learning algorithms, however, do not know these boundaries, and not including such "trivial" data might lead to bad predictions.

5. Opportunities

The many strengths of machine learning methods present various application opportunities, and recent developments have provided ways to mitigate some of the most important criticisms. The exceptionally high execution speed of almost any trained machine learning method makes such methods well-suited for applications in which accuracy and speed within predefined system boundaries are important. Examples of such applications include feed-forward process control and high-frequency real-time optimization [149–151]. While empirical models often lack the accuracy for these applications, detailed fundamental models are rarely fast enough to avoid computational delays. Machine learning models, trained on a fundamental model, can provide similar accuracy, yet at the computational cost of an empirical model. In this case, a model is trained on high-level data and tries to predict the difference between the empirical outcome and the true value [152,153]. Unsupervised algorithms can be used in process control applications for discovering outliers in real-time data [93]. The combination of more accurate, rapid prediction and reliable industrial data offers opportunities for the creation of digital twins and better control, leading to more efficient chemical processes.
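A sketch of the hybrid idea in this paragraph: a cheap empirical model plus a machine learning correction trained on its residual against high-level data (the "fundamental" model is faked below by a placeholder function; in practice it would be a detailed kinetic or CFD simulation):

```python
# Hybrid surrogate: empirical baseline + learned correction of its residual.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

def high_level_model(x):
    # Hypothetical stand-in for an expensive first-principles simulation.
    return np.sin(3 * x[:, 0]) + x[:, 1]

rng = np.random.default_rng(5)
X = rng.uniform(-1, 1, size=(2000, 2))            # e.g., scaled T and P
y_true = high_level_model(X)

empirical = LinearRegression().fit(X, y_true)     # fast but crude baseline
residual = y_true - empirical.predict(X)

correction = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000,
                          random_state=5).fit(X, residual)

y_hybrid = empirical.predict(X[:5]) + correction.predict(X[:5])
print(np.abs(y_hybrid - y_true[:5]).round(3))     # near-instant online evaluation
```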

A similar observation can be made in multiscale modeling approaches, where phenomena at a variety of different scales are modeled, resulting in a complex and strongly coupled set of equations. The potential of machine learning in such applications strongly depends on the aim of the multiscale approach. If the aim is to gain fundamental insights into the lower scale phenomena, then machine learning is not advisable, due to its black-box nature. However, if the smaller scales are incorporated into the approach in order to obtain a more accurate model for larger scale phenomena, then machine learning could be used to replace the slow fundamental models for the smaller scales, without impacting the interpretability of the larger scale phenomena.

A final opportunity lies in providing an answer to one of the main flaws of machine learning: its non-interpretability. The issue of interpretable machine learning systems is not unique to chemical engineering problems—it is encountered in nearly any decision-making system [154–157]. An attempt has been made in the field of catalysis to rationalize what exactly machine learning models learn [158]. This attempt, however, still does not provide any level of direct interpretation of the model outcomes. Fig. 5 shows a workflow for explaining why a certain result is obtained. When the model outputs a good result, such as a chemical reaction predictor giving the correct product, the model should only be trusted after examining what the prediction is based on. A first step toward interpretation of the model results is to quantify the individual prediction uncertainties [159,160], as this gives an idea of the confidence the model has in its own decisions [115,161–164]. One relatively straightforward way of doing so is via ensemble modeling. This methodology has been used for decades in weather forecasting and can be used in combination with nearly any model type [165–167]. Several algorithms have also been created to determine how much certain input features influence the output [168], or to see which training points the model uses for a certain output [169,170]. When the results seem chemically or physically unreasonable, the model should be falsified instead of validated, by finding adversarial examples [159]. Furthermore, the reason is usually found in the dataset, with erroneous data or bias being present in the dataset [171,172].
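A sketch of the ensemble route to uncertainty: identical models are trained on bootstrap resamples, and the spread of their predictions serves as a per-point confidence estimate (all settings are illustrative):

```python
# Bootstrap ensemble: the standard deviation across members is used
# as a simple uncertainty estimate for each query point.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(6)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=500)

X_query = np.array([[0.0], [2.5], [8.0]])         # last point lies outside the data

preds = []
for seed in range(10):
    idx = rng.integers(0, len(X), size=len(X))    # bootstrap resample
    member = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=seed)
    preds.append(member.fit(X[idx], y[idx]).predict(X_query))

preds = np.array(preds)
print("mean:", preds.mean(axis=0).round(2))
print("std: ", preds.std(axis=0).round(2))        # larger spread, lower confidence
```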
Another way of making machine learning models more interpretable is to include chemically relevant and well-founded information in the models themselves. Interpretation will still require a considerable amount of postprocessing, but—if human-readable inputs are used and model architectures are not too complex—it remains a feasible task. Very complex recurrent neural networks using molecular fingerprints as input are nearly impossible to interpret, as the model input is already difficult for a human to decipher. In risk management, the "as low as reasonably practicable" (ALARP) principle is often applied [173]. Analogously, one could suggest an "as simple as reasonably possible" principle in order for machine learning models to be as interpretable as possible.
6. Threats

The accessibility of machine learning models is both a major strength and a major threat in research. While machine learning can be used by anyone with basic programming skills, it can also be misused due to a lack of algorithmic knowledge. Today, a plethora of machine learning algorithms are available, and a tremendous number of combinations of parameters and hyperparameters is possible. Even for experienced users, machine learning remains a reasoned trial-and-error method. Since researchers are often unable to explain why one algorithm works while another does not, some see machine learning as a type of modern alchemy [174]. Moreover, the majority of published articles do not provide source code, or only a pseudocode, which makes it impossible to reproduce the work [175,176]. Although chemistry and chemical engineering do not face a reproducibility crisis as much as the social sciences do [177], skepticism might grow in the community due to the increasing irreproducible use of machine learning in the field. In Gartner's hype cycle [178], machine learning and deep learning are beyond the peak of inflated expectations [179], and there is a risk of entering a period of disillusionment where interest is nearly gone. Next to irresponsible use of algorithms—and possibly more dangerous—is misinterpretation of the results. The black-box nature of the algorithms makes it difficult, and often nearly impossible, to understand why a certain result is obtained. In addition, a model might give the correct outcome for the wrong reasons [159]. Therefore, researchers should bear in mind an important rule from statistics when using machine learning: It is about the correlations, not the causations.
Another kind of unreasonable use of machine learning occurs when the model leaves the application range it was created for. The application range is determined by the training dataset and is finite. When testing unknown data points, the researcher should check whether or not these points are within the application range. When the points are outside of the range, it should be seen as a warning signal for the user that the model will perform poorly [92]. The lower part of Fig. 5 depicts how the reason for obtaining a poor result is generally found by looking at the training set. Open-source applications using clustering algorithms are available for evaluating the data accuracy and its application range [180].
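Such a range check can be as simple as a distance test against the training set (a sketch; the 95th-percentile threshold is an arbitrary illustrative choice):

```python
# Flag query points whose nearest training neighbor is farther away
# than is typical within the training set itself.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(7)
X_train = rng.normal(size=(1000, 4))

nn = NearestNeighbors(n_neighbors=2).fit(X_train)
d_train = nn.kneighbors(X_train)[0][:, 1]         # nearest *other* training point
threshold = np.percentile(d_train, 95)

X_query = np.vstack([rng.normal(size=(3, 4)),     # in range
                     np.full((1, 4), 10.0)])      # far outside the range
d_query = nn.kneighbors(X_query, n_neighbors=1)[0][:, 0]
print(d_query > threshold)                        # True marks out-of-range points
```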

A last threat to applying machine learning in chemical engineering research is the growing educational gap when it comes to machine learning techniques. When applying computer and data science to chemistry and chemical engineering, it is important to understand not only the tool that is used, but also the process it is applied to. Therefore, simple training on how to use machine learning algorithms might become insufficient in the near future. Instead, a good education on AI and statistical methods will become vital in chemical engineering undergraduate programs. On the other hand, there is a need for more collaboration between computer scientists and experts on the studied topic. Whereas undertrained researchers risk a wrong use of the computational tools, computer and data scientists might obtain suboptimal results when they are not fully familiar with the topic being studied. More interdisciplinary research and a symbiosis between machine learning experts and chemical experts might be a way to avoid a phase of disillusionment.

7. Conclusions and perspectives

In the past decade, machine learning has become a new tool in the chemical engineer's toolkit. Indeed, driven by its execution speed, flexibility, and user-friendly applications, there is a strong, growing interest in machine learning among chemical engineers. On the flip side of this popularity is the risk of misusing machine learning or misinterpreting black-box results, which can potentially lead to a distrust of machine learning within the chemical engineering community. The following three recommendations can help to improve the credibility of machine learning models and turn them into an even more valuable and reliable modeling method.

First, it is important to maintain easy and open access to data and models within the community. High-quality data and open-source models encourage researchers to use machine learning as a tool and grant them the ability to focus on their topic rather than on programming and gathering data. Second, but related to the first point, is the creation of interpretable models. Since machine learning is already established in other research areas, new models for chemical applications are often inspired by existing algorithms. Therefore, the field will benefit most from studying why a certain output is generated from a given input, rather than from maintaining black boxes. The last recommendation is to invest in a profound algorithmic education. Although chemical engineers typically have very strong mathematical and modeling skills, understanding the computer science behind the graphical interface is a prerequisite for any modeler. This should also make it possible to define the application range of the model, which is crucial for an understanding of when the model is interpolating and when it is extrapolating. This last point is definitely the most crucial: machine learning models should be credible models, which can only be achieved by being vigilant for times when the model is being used outside of its training set.

Acknowledgements

The authors acknowledge funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (818607). Pieter P. Plehiers and Ruben Van de Vijver acknowledge financial support, respectively, from a doctoral (1150817N) and a postdoctoral (3E013419) fellowship from the Research Foundation—Flanders (FWO).

Compliance with ethics guidelines

Maarten R. Dobbelaere, Pieter P. Plehiers, Ruben Van de Vijver, Christian V. Stevens, and Kevin M. Van Geem declare that they have no conflict of interest or financial conflicts to disclose.

References
[1] Levenspiel O. Modeling in chemical engineering. Chem Eng Sci 2002;57(22–23):4691–6.
[2] Stokes GG. On the steady motion of incompressible fluids. In: Mathematical and physical papers. Cambridge: Cambridge University Press; 2009. p. 1–16.
[3] Navier CL. Memoire sur les lois du mouvement des fluides. Mem Acad Sci Inst Fr 1827;6:389–440. French.
[4] Prandtl L. Über flussigkeitsbewegung bei sehr kleiner reibung. In: Riegels FW, editor. Ludwig prandtl gesammelte abhandlungen. Berlin: Springer; 1904. p. 484–91. German.
[5] Siirola JJ, Powers GJ, Rudd DF. Synthesis of system designs: III. toward a process concept generator. AIChE J 1971;17(3):677–82.
[6] Venkatasubramanian V. The promise of artificial intelligence in chemical engineering: is it here, finally? AIChE J 2019;65(2):466–78.
[7] Reaxys [Internet]. Amsterdam: Elsevier; c2021 [cited 2021 Jan 4]. Available from: https://www.elsevier.com/solutions/reaxys.
[8] CAS SciFinder [Internet]. Columbus: American Chemical Society; c2021 [cited 2021 Jan 4]. Available from: https://www.cas.org/products/scifinder.
[9] ChemSpace [Internet]. Monmouth Junction: Chemspace US Inc.; c2021 [cited 2021 Jan 4]. Available from: https://chem-space.com/about.
[10] Ruddigkeit L, van Deursen R, Blum LC, Reymond JL. Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17. J Chem Inf Model 2012;52(11):2864–75.
[11] NIST Chemistry WebBook. Washington, DC: National Institute of Standards and Technology, US Department of Commerce. c2018 [cited 2021 Jan 4]. Available from: https://webbook.nist.gov/chemistry/.
[12] Pettit LD. The IUPAC stability constants database. Chem Int 2006;28(5):14–5.
[13] Chen G, Chen P, Hsieh CY, Lee CK, Liao B, Liao R, et al. Alchemy: a quantum chemistry dataset for benchmarking AI models. 2019. arXiv:1906.09427.
[14] Delaney JS. ESOL: estimating aqueous solubility directly from molecular structure. J Chem Inf Comput Sci 2004;44(3):1000–5.
[15] Mobley DL, Guthrie JP. FreeSolv: a database of experimental and calculated hydration free energies, with input files. J Comput Aided Mol Des 2014;28(7):711–20.
[16] Hall MA. Correlation-based feature selection for machine learning [dissertation]. Hamilton: The University of Waikato; 1999.
[17] Khalid S, Khalil T, Nasreen SA. A survey of feature selection and feature extraction techniques in machine learning. In: Proceedings of 2014 Science and Information Conference; 2014 Aug 27–29; London, UK. New York: IEEE; 2014.
[18] Xue B, Zhang M, Browne WN, Yao X. A survey on evolutionary computation approaches to feature selection. IEEE Trans Evol Comput 2016;20(4):606–26.
[19] Cai J, Luo J, Wang S, Yang S. Feature selection in machine learning: a new perspective. Neurocomputing 2018;300:70–9.
[20] Szegedy C, Toshev A, Erhan D. Deep neural networks for object detection. In: Proceedings of NIPS 2013-Twenty-Seventh Annual Conference on Neural Information Processing Systems Conference; 2013 Dec 9–12; Nevada, CA, USA. New York: Neural Information Processing Systems Foundation, Inc.; 2013.
[21] Bassam A, Conde-Gutierrez RA, Castillo J, Laredo G, Hernandez JA. Direct neural network modeling for separation of linear and branched paraffins by adsorption process for gasoline octane number improvement. Fuel 2014;124:158–67.
[22] De Oliveira FM, de Carvalho LS, Teixeira LSG, Fontes CH, Lima KMG, Câmara ABF, et al. Predicting cetane index, flash point, and content sulfur of diesel–biodiesel blend using an artificial neural network model. Energy Fuels 2017;31(4):3913–20.
[23] Li H, Zhang Z, Liu Z. Application of artificial neural networks for catalysis: a review. Catalysts 2017;7(10):306.
[24] Abdul Jameel AG, Van Oudenhoven V, Emwas AH, Sarathy SM. Predicting octane number using nuclear magnetic resonance spectroscopy and artificial neural networks. Energy Fuels 2018;32(5):6309–29.
[25] Plehiers PP, Symoens SH, Amghizar I, Marin GB, Stevens CV, Van Geem KM. Artificial intelligence in steam cracking modeling: a deep learning algorithm for detailed effluent prediction. Engineering 2019;5(6):1027–40.
[26] Cavalcanti FM, Schmal M, Giudici R, Brito Alves RM. A catalyst selection method for hydrogen production through water–gas shift reaction using artificial neural networks. J Environ Manage 2019;237:585–94.
[27] Hwangbo S, Al R, Sin G. An integrated framework for plant data-driven process modeling using deep-learning with Monte-Carlo simulations. Comput Chem Eng 2020;143:107071.
[28] Weininger D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J Chem Inf Comput Sci 1988;28(1):31–6.
[29] Heller S, McNaught A, Stein S, Tchekhovskoi D, Pletnev I. InChI–the worldwide chemical structure identifier standard. J Cheminform 2013;5(1):7.
[30] Krenn M, Häse F, Nigam A, Friederich P, Aspuru-Guzik A. Self-referencing embedded strings (SELFIES): a 100% robust molecular string representation. Mach Learn Sci Technol 2020;1(4):045024.
[31] Amar Y, Schweidtmann AM, Deutsch P, Cao L, Lapkin A. Machine learning and molecular descriptors enable rational solvent selection in asymmetric catalysis. Chem Sci 2019;10(27):6697–706.
[32] Yalamanchi KK, van Oudenhoven VCO, Tutino F, Monge-Palacios M, Alshehri A, Gao X, et al. Machine learning to predict standard enthalpy of formation of hydrocarbons. J Phys Chem A 2019;123(38):8305–13.
[33] Yalamanchi KK, Monge-Palacios M, van Oudenhoven VCO, Gao X, Sarathy SM. Data science approach to estimate enthalpy of formation of cyclic hydrocarbons. J Phys Chem A 2020;124(31):6270–6.
[34] Rupp M, Tkatchenko A, Müller KR, von Lilienfeld OA. Fast and accurate modeling of molecular atomization energies with machine learning. Phys Rev Lett 2012;108(5):058301.
[35] Hansen K, Biegler F, Ramakrishnan R, Pronobis W, von Lilienfeld OA, Müller KR, et al. Machine learning predictions of molecular properties: accurate many-body potentials and nonlocality in chemical space. J Phys Chem Lett 2015;6(12):2326–31.
[36] Faber FA, Hutchison L, Huang B, Gilmer J, Schoenholz SS, Dahl GE, et al. Prediction errors of molecular machine learning models lower than hybrid DFT error. J Chem Theory Comput 2017;13(11):5255–64.
[37] Gómez-Bombarelli R, Wei JN, Duvenaud D, Hernández-Lobato JM, Sánchez-Lengeling B, Sheberla D, et al. Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent Sci 2018;4(2):268–76.
[38] Liu S, Demirel MF, Liang Y. N-gram graph: simple unsupervised representation for graphs, with applications to molecules. In: Advances in neural information processing systems 32; 2019 Dec 8–14; Vancouver, BC, Canada. New York: Neural Information Processing Systems Foundation, Inc.; 2019.
[39] Wang S, Guo Y, Wang Y, Sun H, Huang J. SMILES-BERT: large scale unsupervised pre-training for molecular property prediction. In: Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics; 2019 Sep 7–10; Niagara Falls, NY, USA. New York: IEEE; 2019. p. 429–36.
[40] Chithrananda S, Grand G, Ramsundar B. ChemBERTa: large-scale self-supervised pretraining for molecular property prediction. 2020. arXiv:2010.09885.
[41] Fabian B, Edlich T, Gaspar H, Ahmed M. Molecular representation learning with language models and domain-relevant auxiliary tasks. 2020. arXiv:2011.13230.
[42] Glem RC, Bender A, Arnby CH, Carlsson L, Boyer S, Smith J. Circular fingerprints: flexible molecular descriptors with applications from physical chemistry to ADME. IDrugs 2006;9(3):199–204.
[43] Cherkasov A, Muratov EN, Fourches D, Varnek A, Baskin II, Cronin M, et al. QSAR modeling: where have you been? where are you going to? J Med Chem 2014;57(12):4977–5010.
[44] Sumudu PL, Steffen L. Computational methods in drug discovery. Beilstein J Org Chem 2016;12:2694–718.
[45] Unterthiner T, Mayr A, Klambauer G, Steijaert M, Wegner J, Ceulemans H, et al. Deep learning as an opportunity in virtual screening. In: Proceedings of Workshop on Machine Learning for Clinical Data Analysis, Healthcare and Genomics (NIPS2014); 2014 Dec 8–13; Montreal, QC, Canada. Linz: Johannes Kepler University Linz; 2015.
[46] Mayr A, Klambauer G, Unterthiner T, Steijaert M, Wegner JK, Ceulemans H, et al. Large-scale comparison of machine learning methods for drug target prediction on ChEMBL. Chem Sci 2018;9(24):5441–51.
[47] Yang K, Swanson K, Jin W, Coley C, Eiden P, Gao H, et al. Analyzing learned molecular representations for property prediction. J Chem Inf Model 2019;59(8):3370–88.
[48] Duvenaud DK, Maclaurin D, Iparraguirre J, Bombarelli R, Aspure A, Adams RP, et al. Convolutional networks on graphs for learning molecular fingerprints. In: Proceedings of the 28th International Conference on Neural Information Processing Systems; 2015 Dec 8–12; Bali, Indonesia. Cambridge: MIR Press; 2015. p. 2224–32.
[49] Coley CW, Barzilay R, Green WH, Jaakkola TS, Jensen KF. Convolutional embedding of attributed molecular graphs for physical property prediction. J Chem Inf Model 2017;57(8):1757–72.
[50] Coley CW, Jin W, Rogers L, Jamison TF, Jaakkola TS, Green WH, et al. A graph-convolutional neural network model for the prediction of chemical reactivity. Chem Sci 2018;10(2):370–7.
[51] Defferrard M, Bresson X, Vandergheynst P. Convolutional neural networks on graphs with fast localized spectral filtering. In: Proceedings of the 30th International Conference on Neural Information Processing Systems; 2016 Dec 5–10; Barcelona, Spain. New York: Curran Associates, Inc.; 2016. p. 3844–52.
[52] Wu Z, Ramsundar B, Feinberg EN, Gomes J, Geniesse C, Pappu AS, et al. MoleculeNet: a benchmark for molecular machine learning. Chem Sci 2017;9(2):513–30.
[53] Kearnes S, McCloskey K, Berndl M, Pande V, Riley P. Molecular graph convolutions: moving beyond fingerprints. J Comput Aided Mol Des 2016;30(8):595–608.
[54] Battaglia P, Pascanu R, Lai M, Rezende DJ, Koray K. Interaction networks for learning about objects, relations and physics. In: Proceedings of the 30th International Conference on Neural Information Processing Systems; 2016 Dec 5–10; Barcelona, Spain. New York: Curran Associates, Inc.; 2016. p. 4502–10.
[55] Schütt KT, Arbabzadah F, Chmiela S, Müller KR, Tkatchenko A. Quantum-chemical insights from deep tensor neural networks. Nat Commun 2017;8(1):13890.
[56] Jørgensen PB, Jacobsen KW, Schmidt MN. Neural message passing with edge updates for predicting properties of molecules and materials. In: Proceedings of 32nd Conference on Neural Information Processing Systems; 2018 Dec 3–8; Montreal, QC, Canada. New York: Neural Information Processing Systems Foundation, Inc.; 2018.
[57] Li Y, Tarlow D, Brockschmidt M, Zemel R. Gated graph sequence neural network. 2017. arXiv:1511.05493.
[58] Winter R, Montanari F, Noé F, Clevert DA. Learning continuous and data-driven molecular descriptors by translating equivalent chemical representations. Chem Sci 2018;10(6):1692–701.
[59] Gilmer J, Schoenholz SS, Riley PF, Vinyals O, Dahl GE. Neural message passing for quantum chemistry. In: Proceedings of the 34th International Conference on Machine Learning; 2017 Aug 6–11; Sydney, NSW, Australia. New York: JMLR.org; 2017. p. 1263–72.
[60] David L, Thakkar A, Mercado R, Engkvist O. Molecular representations in AI-driven drug discovery: a review and practical guide. J Cheminform 2020;12(1):56.
[61] Morgan HL. The generation of a unique machine description for chemical structures—a technique developed at chemical abstracts service. J Chem Doc 1965;5(2):107–13.
[62] Rogers D, Hahn M. Extended-connectivity fingerprints. J Chem Inf Model 2010;50(5):742–54.
[63] Pattanaik L, Coley CW. Molecular representation: going long on fingerprints. Chem 2020;6(6):1204–7.
[64] LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521(7553):436–44.
[65] Von Lilienfeld OA. First principles view on chemical compound space: gaining rigorous atomistic control of molecular properties. Int J Quantum Chem 2013;113(12):1676–89.
[66] James CA. Daylight theory manual [Internet]. Laguna Niguel: Daylight Chemical Information Systems, Inc.; c1997–2019 [cited 2021 Jan 4]. Available from: http://www.daylight.com/dayhtml/doc/theory/.
[67] Grethe G, Blanke G, Kraut H, Goodman JM. International chemical identifier for reactions (RInChI). J Cheminform 2018;10(1):22.
[68] Segler MHS, Waller MP. Neural-symbolic machine learning for retrosynthesis and reaction prediction. Chemistry 2017;23(25):5966–71.
[69] Plehiers PP, Coley CW, Gao H, Vermeire FH, Dobbelaere MR, Stevens CV, et al. Artificial intelligence for computer-aided synthesis in flow: analysis and selection of reaction components. Front Chem Eng 2020;2:5.
[70] Sanchez-Lengeling B, Aspuru-Guzik A. Inverse molecular design using machine learning: generative models for matter engineering. Science 2018;361(6400):360–5.
[71] Wei JN, Duvenaud D, Aspuru-Guzik A. Neural networks for the prediction of organic chemistry reactions. ACS Cent Sci 2016;2(10):725–32.
[72] Eyke NS, Green WH, Jensen KF. Iterative experimental design based on active machine learning reduces the experimental burden associated with reaction screening. React Chem Eng 2020;5(10):1963–72.
[73] Coley CW, Barzilay R, Jaakkola TS, Green WH, Jensen KF. Prediction of organic reaction outcomes using machine learning. ACS Cent Sci 2017;3(5):434–43.
[74] Nam J, Kim J. Linking the neural machine translation and the prediction of organic chemistry reactions. 2016. arXiv:1612.09529.
[75] Schwaller P, Gaudin T, Lányi D, Bekas C, Laino T. "Found in translation": predicting outcomes of complex organic chemistry reactions using neural
[78] Schwaller P, Laino T, Gaudin T, Bolgar P, Hunter CA, Bekas C, et al. Molecular transformer: a model for uncertainty-calibrated chemical reaction prediction. ACS Cent Sci 2019;5(9):1572–83.
[79] Michalski RS, Carbonell JG, Mitchell TM. A comparative review of selected methods for learning from examples. Mach Learn 2013;1:41–82.
[80] Dey A. Machine learning algorithms: a review. Int J Comput Sci Inf Technol 2016;7(3):1174–9.
[81] Pearson K. Contributions to the mathematical theory of evolution. Philos Trans R Soc Lond A 1894;185:71–110.
[82] Hotelling H. Analysis of a complex of statistical variables into principal components. J Educ Psychol 1933;24(6):417–41.
[83] Pearson K. On lines and planes of closest fit to systems of points in space. Lond Edinb Dublin Philos Mag J Sci 1901;2(11):559–72.
[84] Van der Maaten L, Hinton G. Visualizing data using t-SNE. J Mach Learn Res 2008;9:2579–605.
[85] Ester M, Kriegel HP, Sander J, Xu XW. A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining; 1996 Aug 2–4; Portland, OR, USA. New York: AAAI Press; 1996. p. 226–31.
[86] Palkovits R, Palkovits S. Using artificial intelligence to forecast water oxidation catalysts. ACS Catal 2019;9(9):8383–7.
[87] Likas A, Vlassis N, Verbeek J. The global k-means clustering algorithm. Pattern Recognit 2003;36(2):451–61.
[88] Tang J, Yan X. Neural network modeling relationship between inputs and state mapping plane obtained by FDA–t-SNE for visual industrial process monitoring. Appl Soft Comput 2017;60:577–90.
[89] Zheng S, Zhao J. A new unsupervised data mining method based on the stacked autoencoder for chemical process fault diagnosis. Comput Chem Eng 2020;135:106755.
[90] Gao H, Struble TJ, Coley CW, Wang Y, Green WH, Jensen KF. Using machine learning to predict suitable conditions for organic reactions. ACS Cent Sci 2018;4(11):1465–76.
[91] Vermeire FH, Green WH. Transfer learning for solvation free energies: from quantum chemistry to experiments. Chem Eng J 2020;418:129307.
[92] Pyl SP, Van Geem KM, Reyniers MF, Marin GB. Molecular reconstruction of complex hydrocarbon mixtures: an application of principal component analysis. AIChE J 2010;56(12):3174–88.
[93] Thombre M, Mdoe Z, Jäschke J. Data-driven robust optimal operation of thermal energy storage in industrial clusters. Processes 2020;8(2):194.
[94] Lee JM, Yoo C, Choi SW, Vanrolleghem PA, Lee IB. Nonlinear process monitoring using kernel principal component analysis. Chem Eng Sci 2004;59(1):223–34.
[95] Choi SW, Park JH, Lee IB. Process monitoring using a Gaussian mixture model via principal component analysis and discriminant analysis. Comput Chem Eng 2004;28(8):1377–87.
[96] Ning C, You F. Data-driven decision making under uncertainty integrating robust optimization with principal component analysis and kernel smoothing methods. Comput Chem Eng 2018;112:190–210.
[97] Kano M, Hasebe S, Hashimoto I, Ohno H. A new multivariate statistical process monitoring method using principal component analysis. Comput Chem Eng 2001;25(7–8):1103–13.
[98] Chiang LH, Pell RJ, Seasholtz MB. Exploring process data with the use of robust outlier detection algorithms. J Process Contr 2003;13(5):437–49.
[99] Zhang X, Zou Y, Li S, Xu S. A weighted auto regressive LSTM based approach for chemical processes modeling. Neurocomputing 2019;367:64–74.
[100] Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput 1997;9(8):1735–80.
[101] Géron A. Hands-on machine learning with Scikit-Learn, Keras, and TensorFlow: concepts, tools, and techniques to build intelligent systems. Sebastopol: O'Reilly Media; 2019.
[102] Safavian SR, Landgrebe D. A survey of decision tree classifier methodology. IEEE Trans Syst Man Cybern 1991;21(3):660–74.
[103] Ho TK. Random decision forests. In: Proceedings of 3rd International Conference on Document Analysis and Recognition; 1995 Aug 14–16; Montreal, QC, Canada. New York: IEEE; 1995. p. 278–82.
[104] Vapnik V. The support vector method of function estimation. In: Suykens JAK, Vandewalle J, editors. Nonlinear modeling. Boston: Springer; 1998. p. 55–85.
[105] Matsugu M, Mori K, Mitari Y, Kaneda Y. Subject independent facial expression recognition with robust face detection using a convolutional neural network. Neural Netw 2003;16(5–6):555–9.
[106] Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Commun ACM 2017;60(6):84–90.
[107] Shiffman D. Neural networks. In: Fry S, editor. The nature of code. Boston: Free Software Foundation; 2012. p. 444–80.
[108] Hopfield JJ. Artificial neural networks. IEEE Circuits Devices Mag 1988;4(5):3–10.
[109] Bontemps L, Cao VL, McDermott J, Le-Khac N. Collective anomaly detection
sequence-to-sequence models. Chem Sci 2018;9(28):6091–8. based on long short-term memory recurrent neural networks. In:
[76] Duan H, Wang L, Zhang C, Guo L, Li J. Retrosynthesis with attention-based Proceedings of International Conference on Future Data and Security
NMT model and chemical analysis of ‘‘wrong” predictions. RSC Adv 2020;10 Engineering; 2016 Nov 23–25; Can Tho City, Vietnam. Cham: Springer
(3):1371–8. International Publishing; 2016.
[77] Lee AA, Yang Q, Sresht V, Bolgar P, Hou X, Klug-McLeod JL, et al. Molecular [110] Brotherton T, Johnson T. Anomaly detection for advanced military aircraft
transformer unifies reaction prediction and retrosynthesis across pharma using neural networks. In: Proceedings of 2001 IEEE Aerospace Conference;
chemical space. Chem Commun 2019;55(81):12152–5. 2001 Mar 10–17; Big Sky, MT, USA. New York: IEEE; 2001.

9
M.R. Dobbelaere, P.P. Plehiers, R. Van de Vijver et al. Engineering xxx (xxxx) xxx

[111] Malhotra P, Vig L, Shroff G, Agarwal P. Long short term memory networks for on Innovations in Intelligent Systems and Applications; 2011 Jun 15–18;
anomaly detection in time series. In: Proceedings of 23rd European Istanbul, Turkey. New York: IEEE; 2016. p. 91–5.
Symposium on Artificial Neural Networks, Computational Intelligence and [140] Cassisi C, Ferro A, Giugno R, Pigola G, Pulvirenti A. Enhancing density-based
Machine Learning, ESANN 2015. 2015 Apr 22–24; Bruges, clustering: parameter reduction and outlier detection. Inf Syst 2013;38
Belgium. Wallonie: i6doc; 2015. (3):317–30.
[112] Chalapathy R, Menon AK, Chawla S. Anomaly detection using one-class [141] Fernando T, Denman S, Sridharan S, Fookes C. Soft + hardwired attention: an
neural networks. 2018. arXiv:1802.06360. LSTM framework for human trajectory prediction and abnormal event
[113] Zhou S, Shen W, Zeng D, Fang M, Wei Y, Zhang Z. Spatial–temporal detection. Neural Netw 2018;108:466–78.
convolutional neural networks for anomaly detection and localization in [142] Filonov P, Lavrentyev A, Vorontsov A. Multivariate industrial time series with
crowded scenes. Signal Process Image Commun 2016;47:358–68. cyber-attack simulation: fault detection using an LSTM-based predictive data
[114] Grambow CA, Pattanaik L, Green WH. Deep learning of activation energies. J model. 2016. arXiv:1612.06676.
Phys Chem Lett 2020;11(8):2992–7. [143] Chandola V, Banerjee A, Kumar V. Anomaly detection: a survey. ACM Comput
[115] Scalia G, Grambow CA, Pernici B, Li YP, Green WH. Evaluating scalable Surv 2009;41(3):1–58.
uncertainty estimation methods for deep learning-based molecular property [144] Ahmad S, Lavin A, Purdy S, Agha Z. Unsupervised real-time anomaly
prediction. J Chem Inf Model 2020;60(6):2697–717. detection for streaming data. Neurocomputing 2017;262:134–47.
[116] Li YP, Han K, Grambow CA, Green WH. Self-evolving machine: a continuously [145] Amini M, Jalili R, Shahriari HR. RT-UNNID: a practical solution to real-time
improving model for molecular thermochemistry. J Phys Chem A 2019;123 network-based intrusion detection using unsupervised neural networks.
(10):2142–52. Comput Secur 2006;25(6):459–68.
[117] Grambow CA, Li YP, Green WH. Accurate thermochemistry with small data [146] Schlegl T, Seeböck P, Waldstein SM, Schmidt-Erfurth U, Langs G.
sets: a bond additivity correction and transfer learning approach. J Phys Unsupervised anomaly detection with generative adversarial networks
Chem A 2019;123(27):5826–35. to guide marker discovery. In: Proceedings of International
[118] Christensen AS, Bratholm LA, Faber FA, Anatole von Lilienfeld O. FCHL Conference on Information Processing in Medical Imaging 2017; 2017
revisited: faster and more accurate quantum machine learning. J Chem Phys Jun 25–30; Boone, KY, USA. Cham: Springer International Publishing;
2020;152(4):044107. 2017. p. 146–57.
[119] Azlan Hussain M. Review of the applications of neural networks in chemical [147] Schneider N, Lowe DM, Sayle RA, Tarselli MA, Landrum GA. Big data from
process control–simulation and online implementation. Artif Intell Eng pharmaceutical patents: a computational analysis of medicinal chemists’
1999;13(1):55–68. bread and butter. J Med Chem 2016;59(9):4385–402.
[120] Schweidtmann AM, Mitsos A. Deterministic global optimization with [148] Raccuglia P, Elbert KC, Adler PDF, Falk C, Wenny MB, Mollo A, et al. Machine-
artificial neural networks embedded. J Optim Theory Appl 2019;180 learning-assisted materials discovery using failed experiments. Nature
(3):925–48. 2016;533(7601):73–6.
[121] Zhu W, Sun W, Romagnoli J. Adaptive k-nearest-neighbor method for process [149] Wu Z, Rincon D, Christofides PD. Real-time adaptive machine-learning-based
monitoring. Ind Eng Chem Res 2018;57(7):2574–86. predictive control of nonlinear processes. Ind Eng Chem Res 2020;59
[122] Yan S, Yan X. Using labeled autoencoder to supervise neural network (6):2275–90.
combined with k-nearest neighbor for visual industrial process monitoring. [150] Zhang Z, Wu Z, Rincon D, Christofides P. Real-time optimization and control
Ind Eng Chem Res 2019;58(23):9952–8. of nonlinear processes using machine learning. Mathematics 2019;7(10):890.
[123] Walker E, Kammeraad J, Goetz J, Robo MT, Tewari A, Zimmerman PM. [151] Powell BKM, Machalek D, Quah T. Real-time optimization using
Learning to predict reaction conditions: relationships between solvent, reinforcement learning. Comput Chem Eng 2020;143:107077.
molecular structure, and catalyst. J Chem Inf Model 2019;59 [152] Ramakrishnan R, Dral PO, Rupp M, von Lilienfeld OA. Big data meets quantum
(9):3645–54. chemistry approximations: the D-machine learning approach. J Chem Theory
[124] Zahrt AF, Henle JJ, Rose BT, Wang Y, Darrow WT, Denmark SE. Prediction of Comput 2015;11(5):2087–96.
higher-selectivity catalysts by computer-driven workflow and machine [153] Bikmukhametov T, Jäschke J. Combining machine learning and process
learning. Science 2019;363(6424):eaau5631. engineering physics towards enhanced accuracy and explainability of data-
[125] Han K, Jamal A, Grambow CA, Buras ZJ, Green WH. An extended group driven models. Comput Chem Eng 2020;138:106834.
additivity method for polycyclic thermochemistry estimation. Int J Chem [154] Gunning D, Aha DW. DARPA’s explainable artificial intelligence program. AI
Kinet 2018;50(4):294–303. Mag 2019;40(2):44–58.
[126] Settles B. From theories to queries: active learning in practice. JMLR [155] Abdul A, Vermeulen J, Wang D, Lim BY, Kankanhali M. Trends and trajectories
2011;16:1–18. for explainable, accountable and intelligible systems: an HCI research
[127] Settles B. Active learning literature survey. Computer Sciences Technical agenda. In: Proceedings of the 2018 CHI Conference on Human Factors in
Report 1648. Madison: University of Wisconsin-Madison; 2009. Computing Systems; 2018 Apr 21; Montreal, QC, Canada. New
[128] Clayton AD, Schweidtmann AM, Clemens G, Manson JA, Taylor CJ, Niño CG, York: Association for Computing Machinery; 2018.
et al. Automated self-optimisation of multi-step reaction and separation [156] Lepri B, Oliver N, Letouzé E, Pentland A, Vinck P. Fair, transparent, and
processes using machine learning. Chem Eng J 2020;384:123340. accountable algorithmic decision-making processes. Philos Technol 2018;31
[129] Zhang C, Amar Y, Cao L, Lapkin AA. Solvent selection for mitsunobu reaction (4):611–27.
driven by an active learning surrogate model. Org Process Res Dev 2020;24 [157] Wachter S, Mittelstadt B, Floridi L. Transparent, explainable, and accountable
(12):2864–73. AI for robotics. Sci Robot 2017;2(6):eaan6080.
[130] Schütt KT, Sauceda HE, Kindermans PJ, Tkatchenko A, Müller KR. SchNet-deep [158] Kammeraad JA, Goetz J, Walker EA, Tewari A, Zimmerman PM. What does the
learning architecture for molecules and materials. J Chem Phys 2018;148 machine learn? Knowledge representations of chemical reactivity. J Chem Inf
(24):241722. Model 2020;60(3):1290–301.
[131] Schütt KT, Kessel P, Gastegger M, Nicoli KA, Tkatchenko A, Müller KR. [159] Kovács DP, McCorkindale W, Lee AA. Quantitative interpretation explains
SchNetPack: a deep learning toolbox for atomistic systems. J Chem Theory machine learning models for chemical reaction prediction and uncovers bias.
Comput 2019;15(1):448–55. Nat Comm 2021;12:1695.
[132] Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. [160] Preuer K, Klambauer G, Rippmann F, Hochreiter S, Unterthiner T.
Scikit-learn: machine learning in Python. J Mach Learn Res 2011;12:2825–30. Interpretable deep learning in drug discovery. In: Samek W, Montavon G,
[133] Abadi M, Barham P, Chen J, Chen Z, Davis A, Dean J, et al. TensorFlow: a Vedaldi A, Hansen LK, Müller KR, editors. Explainable AI: interpreting,
system for large-scale machine learning. In: Proceedings of the 12th USENIX explaining and visualizing deep learning. Cham: Springer International
Symposium on Operating Systems Design and Implementation (OSDI’16); Publishing; 2019. p. 331–45.
2016 Nov 2–4; Savannah, GA, USA. Northbrook: USENIX; 2016. [161] Begoli E, Bhattacharya T, Kusnezov D. The need for uncertainty quantification
[134] Chollet F. Keras [internet]. San Francisco: GitHub, Inc.; 2021 Jun 18 [cited in machine-assisted medical decision making. Nat Mach Intell 2019;1
2021 Jan 4]. Available from: https://github.com/keras-team/keras. (1):20–3.
[135] Paszke A, Gross S, Massa F, Lerer A, Chintala S. Pytorch: an imperative style, [162] Mohamed L, Christie MA, Demyanov V. Comparison of stochastic sampling
high-performance deep learning library. In: Proceedings of 33rd Conference algorithms for uncertainty quantification. SPE J 2010;15(01):31–8.
on Neural Information Processing Systems; 2019 Dec 8–14; Vancouver, BC, [163] Gal Y, Ghahramani Z. Dropout as a Bayesian approximation: representing
Canada. New York: Neural Information Processing Systems Foundation, Inc.; model uncertainty in deep learning. 2016. arXiv:1506.02142v6.
2019. [164] Fridlyand A, Johnson MS, Goldsborough SS, West RH, McNenly MJ, Mehl M,
[136] Bininda-Emonds ORP, Jones KE, Price SA, Cardilloe M, Grenyer R, Purvis A. et al. The role of correlations in uncertainty quantification of transportation
Garbage in, garbage out. In: Bininda-Emonds ORP, editor. Phylogenetic relevant fuel models. Combust Flame 2017;180:239–49.
supertrees. Berlin: Springer; 2004. p. 267–80. [165] Parker WS. Ensemble modeling, uncertainty and robust predictions. Wiley
[137] Schubert E, Gertz M. Intrinsic t-stochastic neighbor embedding for Interdiscip Rev Clim Change 2013;4(3):213–23.
visualization and outlier detection. In: Proceedings of International [166] Gneiting T, Raftery AE. Weather forecasting with ensemble methods. Science
Conference on Similarity Search and Applications; 2017 Oct 4–6; Munich, 2005;310(5746):248–9.
Germany. Berlin: Springer; 2017. p. 188–203. [167] Derome J. On the average errors of an ensemble of forecasts. Atmos Ocean
[138] Perez H, Tah JHM. Improving the accuracy of convolutional neural networks 1981;19(2):103–27.
by identifying and removing outlier images in datasets using t-SNE. [168] Sundararajan M, Taly A, Yan Q. Axiomatic attribution for deep networks.
Mathematics 2020;8(5):662. 2017. arXiv:1703.01365.
[139] Çelik M, Dadasßer-Çelik F, Dokuz ASß. Anomaly detection in temperature data [169] Datta A, Sen S, Zick Y. Algorithmic transparency via quantitative input
using DBSCAN algorithm. In: Proceedings of 2011 International Symposium influence: theory and experiments with learning systems. In: Proceedings of

10
M.R. Dobbelaere, P.P. Plehiers, R. Van de Vijver et al. Engineering xxx (xxxx) xxx

2016 IEEE Symposium on Security and Privacy (SP); 2016 May 22–26; San [175] Hutson M. Artificial intelligence faces reproducibility crisis. Science 2018;359
Jose, CA, USA. New York: IEEE. p. 598–617. (6377):725–6.
[170] Guidotti R, Monreale A, Ruggieri S, Turini F, Giannotti F, Pedreschi D. A survey [176] Gundersen OE, Kjensmo S. State of the art: reproducibility in artificial
of methods for explaining black box models. ACM Comput Surv 2018;51 intelligence. In: Proceedings of The Thirty-Second AAAI Conference on
(5):93. Artificial Intelligence (AAAI-18); 2018 Feb 2–7; New Orleans, LA, USA. Palo
[171] Lin AI, Madzhidov TI, Klimchuk O, Nugmanov RI, Antipin IS, Varnek Alto: AAAI Press; 2018. p. 1644–51.
A. Automatized assessment of protective group reactivity: a step [177] Baker M. 1,500 scientists lift the lid on reproducibility. Nature
toward big reaction data analysis. J Chem Inf Model 2016;56(11): 2016;533:452–4.
2140–8. [178] Fenn J, Linden A. Understanding Gartner’s hype cycles. Report. Stamford:
[172] Pesciullesi G, Schwaller P, Laino T, Reymond JL. Transfer learning enables the Gartner, Inc.; 2003 May. Report No: R-20-1971.
molecular transformer to predict regio- and stereoselective reactions on [179] Sicular S. Vashisth S. Hype cycle for artificial intelligence, 2020 [Internet].
carbohydrates. Nat Commun 2020;11(1):4874. Reading: CloudFactory; 2020 Jul 27 [cited 2021 Jan 4]. Available from: https://
[173] Melchers RE. On the ALARP approach to risk management. Reliab Eng Syst Saf www.cloudfactory.com/reports/gartner-hype-cycle-for-artificial-intelligence.
2001;71(2):201–8. [180] Symoens SH, Aravindakshan SU, Vermeire FH, De Ras K, Djokic MR, Marin GB,
[174] Hutson M. Has artificial intelligence become alchemy? Science 2018;360 et al. QUANTIS: data quality assessment tool by clustering analysis. Int J
(6388):478. Chem Kinet 2019;51(11):872–85.
