Neural-Symbolic Computing: An Effective Methodology
Abstract
Current advances in Artificial Intelligence and machine learning in general, and deep
learning in particular, have achieved unprecedented impact not only across research
communities, but also over popular media channels. However, influential thinkers have
raised concerns about the interpretability and accountability of AI. In spite of the
recent impact of AI, several works have identified the need for principled knowledge
representation and reasoning mechanisms integrated with deep learning-based systems,
in order to provide sound and explainable models for such systems. Neural-symbolic
computing aims at integrating, as foreseen by Valiant, the two most fundamental
cognitive abilities: the ability to learn from the environment, and the ability to
reason from what has been learned. Neural-symbolic computing has been an active topic
of research for many years, reconciling the advantages of robust learning in neural
networks with the reasoning and interpretability of symbolic representations. In this
paper, we survey recent accomplishments of neural-symbolic computing as a principled
methodology for integrated machine learning and reasoning. We illustrate the
effectiveness of the approach by outlining the main characteristics of the
methodology: a principled integration of neural learning with symbolic knowledge
representation and reasoning that allows for the construction of explainable AI
systems. The insights provided by neural-symbolic computing shed new light on the
increasingly prominent need for interpretable and accountable AI systems.
1 Introduction
Current advances in Artificial Intelligence (AI) and machine learning in general, and
deep learning in particular, have achieved unprecedented impact not only within the
academic and industrial research communities, but also among popular media channels.
Deep learning researchers have achieved groundbreaking results and built AI systems
that have in effect established new paradigms in areas such as computer vision, game
playing, and natural language processing [27, 45]. At the same time, the impact of
deep learning has been so remarkable that leading entrepreneurs such as Elon Musk and
Bill Gates, and outstanding scientists such as Stephen Hawking, have voiced strong
concerns about AI's accountability, its impact on humanity and even on the future of
the planet [40].
Against this backdrop, researchers have recognised the need for a better understanding
of the underlying principles of AI systems, in particular those based on machine
learning, with the aim of establishing solid foundations for the field. In this
respect, Turing Award winner Leslie Valiant had already pointed out that one of the
key challenges for AI in the coming decades is the development of integrated reasoning
and learning mechanisms, so as to construct a rich semantics of intelligent cognitive
behavior [54]. In Valiant's words: “The aim here is to identify a way of looking at
and manipulating commonsense knowledge that is consistent with and can support what we
consider to be the two most fundamental aspects of intelligent cognitive behavior: the
ability to learn from experience, and the ability to reason from what has been
learned. We are therefore seeking a semantics of knowledge that can computationally
support the basic phenomena of intelligent behavior.” In order to respond to these
scientific, technological and societal challenges, which demand reliable, accountable
and explainable AI systems and tools, the integration of cognitive abilities ought to
be carried out in a principled way.
Neural-symbolic computing aims at integrating, as put forward by Valiant, the two most
fundamental cognitive abilities: the ability to learn from experience, and the ability
to reason from what has been learned [2, 12, 16]. The integration of learning and
reasoning through neural-symbolic computing has been an active branch of AI research
for several years [14, 16, 17, 21, 25, 42, 53]. Neural-symbolic computing seeks to
reconcile the dominating symbolic and connectionist paradigms of AI under a principled
foundation: knowledge is represented in symbolic form, whereas learning and reasoning
are computed by a neural network. The underlying characteristics of neural-symbolic
computing thus allow the principled combination of robust learning and efficient
inference in neural networks with the interpretability offered by symbolic knowledge
extraction and reasoning with logical systems.
As pointed out in [17], KBANN served as inspiration for the construction of the CILP
system. CILP provides a sound theoretical foundation for inductive learning and
reasoning in artificial neural networks, through theorems showing how logic
programming can serve as a knowledge representation language for neural networks. The
KBANN system was the first to allow for learning with background knowledge in neural
networks and for knowledge extraction, with relevant applications in bioinformatics.
CILP allowed for the integration of learning, reasoning and knowledge extraction in
recurrent networks. An important result of CILP was to show how neural networks
endowed with semi-linear neurons approximate the fixed-point operator of propositional
logic programs with negation. This result enabled applications of reasoning and
learning using backpropagation with logic programs as background knowledge [17].
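As a concrete, if simplified, illustration of this result, the sketch below encodes a
small propositional program so that one feed-forward pass through semi-linear neurons
computes the immediate-consequence operator T_P, and iterating the pass reaches the
program's fixed point. The example program, the weight value W and the use of max as
the disjunctive output are illustrative assumptions, not the original CILP translation
algorithm.

```python
import numpy as np

def sigma(x, beta=20.0):
    """Steep sigmoid: a semi-linear activation approximating a threshold unit."""
    return 1.0 / (1.0 + np.exp(-beta * x))

# Illustrative propositional program P (atoms A, B, C, D):
#   A <- B, not C      B <- D      D <-   (a fact)
atoms = ["A", "B", "C", "D"]
rules = [("A", ["B"], ["C"]),   # (head, positive body literals, negated body literals)
         ("B", ["D"], []),
         ("D", [], [])]
idx = {a: i for i, a in enumerate(atoms)}
W = 6.0                          # a sufficiently large weight magnitude

def tp_network(state):
    """One feed-forward pass = one application of the operator T_P to a valuation."""
    out = np.zeros(len(atoms))
    for head, pos, neg in rules:
        net = (W * sum(state[idx[a]] for a in pos)
               - W * sum(state[idx[a]] for a in neg)
               - W * (len(pos) - 0.5))                     # neuron fires iff the body holds
        out[idx[head]] = max(out[idx[head]], sigma(net))   # OR over rules with the same head
    return (out > 0.5).astype(float)

# Iterating the network from the empty valuation reaches the fixed point of P.
state = np.zeros(len(atoms))
for _ in range(len(atoms)):
    state = tp_network(state)
print(sorted(a for a in atoms if state[idx[a]] > 0.5))   # ['A', 'B', 'D']
```

Starting from the empty valuation, the iteration converges to {A, B, D}, mirroring the
bottom-up evaluation of the program.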
The need for richer cognitive models, however, soon demanded the representation and
learning of other forms of reasoning, such as temporal reasoning, reasoning about
uncertainty, and epistemic, constructive and argumentative reasoning [16, 54]. Modal
and temporal logics have achieved first-class status in the formal toolboxes of AI and
Computer Science researchers. In AI, modal logics are amongst the most widely used
logics in the analysis and modelling of reasoning in distributed multiagent systems.
In the early 2000s, researchers showed that ensembles of CILP neural networks, when
properly set up, can compute the modal fixed-point operator of modal and temporal
logic programs. In addition, such ensembles of neural networks were shown to represent
the possible-world semantics of propositional modal logic, fragments of first-order
logic, and linear temporal logic. To illustrate the computational power of
Connectionist Modal Logic (CML) and Connectionist Temporal Logic of Knowledge (CTLK)
[8, 9], these systems were applied to learning full solutions to several problems in
distributed, multiagent learning and reasoning, including the Muddy Children Puzzle
[8] and the Dining Philosophers Problem [26].
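The underlying idea of such ensembles can be sketched as follows. This is a
hypothetical, minimal example with crisp truth values and a hand-written accessibility
relation; in CML each world would correspond to a CILP-style network and the
aggregation over accessible worlds would be realised by network connections.

```python
# One network per possible world computes the local truth values; the modalities
# aggregate those values over the accessible worlds.
worlds = ["w1", "w2", "w3"]
access = {"w1": ["w2", "w3"], "w2": ["w3"], "w3": []}   # accessibility relation R

# Truth value of atom p at each world.  In CML this would be the output of a
# CILP-style network associated with that world; here it is fixed by hand.
p = {"w1": 0.0, "w2": 1.0, "w3": 1.0}

def box(world, val):
    """[]p holds at a world iff p holds at every accessible world (min over successors)."""
    return min((val[u] for u in access[world]), default=1.0)

def diamond(world, val):
    """<>p holds at a world iff p holds at some accessible world (max over successors)."""
    return max((val[u] for u in access[world]), default=0.0)

print(box("w1", p), diamond("w1", p))   # 1.0 1.0: w1 satisfies both []p and <>p
```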
By combining temporal logic with modalities, one can represent knowledge and its
evolution over time. This is a key insight, allowing for the temporal evolution of
both learning and reasoning (see Fig. 1). The figure depicts the integrated learning
and reasoning process of CTLK. At each time point (or state of affairs), e.g. t2, the
model represents both the knowledge with which the agents are endowed and what the
agents have learned at the previous time point t1. As time progresses, the linear
evolution of the agents' knowledge is represented as more knowledge about the world
(what has been learned) is accumulated. Fig. 1 illustrates this dynamic property of
CTLK, which allows the analysis not only of the current state of affairs but also of
how knowledge and learning evolve over time.
Modal and temporal reasoning, when integrated with connectionist learning pro-
method allows the use of various data types and more complex sets of rules. With CILP,
the knowledge given in Eq. (1) can be encoded in a neural network as shown in Figure
2b. In order to adapt this system to first-order logic, CILP++ [15] makes use of
techniques from Inductive Logic Programming (ILP). In CILP++, examples and background
knowledge are converted into propositional clauses by a bottom-clause
propositionalisation technique; the resulting clauses are then encoded into an ANN
with recurrent connections, as in CILP.
Figure 3: Knowledge representation of φ = {w : A ← B ∧ C, w : B ← C ∧ ¬D ∧ E, w : D ← E}
using (a) Penalty logic (higher-order network) and (b) Confidence rules (RBM).
function symbols, there are infinitely many ground terms. Second, propositionalisation
tends to generate a large number of irrelevant clauses.
3.2.2 Tensorisation
Figure 4: Logic Tensor Network for P(x, y) → A(y), with G(x) = v and G(y) = u; G
denotes the grounding (vector representation) of the symbols of the first-order
language; the tensor order in this example is 2 [42].
bols for efficient reasoning [6], inductive programming [14, 57, 13], and differentiable
theorem proving [39].
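The construction of Fig. 4 can be sketched as follows. This is a hypothetical, minimal
grounding with random parameters, a one-layer predicate in place of the learned
order-2 tensor layer, and the Łukasiewicz implication as one possible choice of fuzzy
semantics; the actual LTN formulation is given in [42].

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4                                     # embedding dimension

# Groundings of the constants: G(x) = v and G(y) = u, as in Fig. 4.
G = {"x": rng.normal(size=d), "y": rng.normal(size=d)}

def make_predicate(dim_in):
    """A one-layer predicate into [0, 1]; weights are random here but learned in LTN."""
    w, b = rng.normal(size=dim_in), rng.normal()
    return lambda z: 1.0 / (1.0 + np.exp(-(w @ z + b)))

P = make_predicate(2 * d)                 # binary predicate over concatenated groundings
A = make_predicate(d)                     # unary predicate

t_P = P(np.concatenate([G["x"], G["y"]]))
t_A = A(G["y"])
truth = min(1.0, 1.0 - t_P + t_A)         # Lukasiewicz fuzzy implication for P(x,y) -> A(y)
print(round(float(truth), 3))

# Training would adjust the predicate parameters (and possibly the groundings)
# to maximise the truth value of all axioms in the knowledge base.
```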
4 Neural-Symbolic Learning
4.1 Inductive Logic Programming
Inductive logic programming (ILP) can take advantage of the learning capability of
neural-symbolic computing to automatically construct a logic program from examples.
ILP approaches are normally categorised as bottom-up or top-down, and this distinction
has inspired corresponding neural-symbolic approaches to learning logical rules.
Bottom-up approaches construct logic programs by extracting specific clauses from
examples; generalisation procedures are then usually applied to search for more
general clauses. This fits well with the idea of propositionalisation discussed
earlier in Section 3.2.1. For example, CILP++ [15] employs a bottom-clause
propositionalisation technique to build its networks (a toy illustration of this step
is sketched below). In [52], a system called CRILP is proposed, integrating bottom
clauses generated as in [15] with RBMs. However, both CILP++ and CRILP learn and
fine-tune formulas at the propositional level, where propositionalisation can generate
a large number of long clauses and, consequently, very large networks. How to
generalise bottom clauses within neural networks that scale well and can extrapolate
remains an open research question.
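The sketch below is a toy illustration of the propositionalisation step only. It is
not the exact BCP procedure of [15], which builds bottom clauses from mode
declarations and background knowledge and variabilises their literals; here the bottom
clauses are simply assumed to be given and their literals kept ground for readability.

```python
# Hypothetical bottom clauses for two examples of a target predicate.
bottom_clauses = {
    "target(ann)": {"parent(ann,bob)", "female(ann)"},
    "target(tom)": {"parent(tom,bob)", "male(tom)"},
}

# The feature space is the set of all body literals seen in the bottom clauses;
# each literal becomes one input neuron of the CILP-style network.
features = sorted(set().union(*bottom_clauses.values()))

def propositionalise(body):
    """0/1 input vector indicating which feature literals occur in the bottom clause."""
    return [1 if lit in body else 0 for lit in features]

for example, body in bottom_clauses.items():
    print(example, propositionalise(body))
# These vectors (with their positive/negative labels) are what the recurrent
# network is then trained on with backpropagation.
```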
Top-down approaches, on the other hand, construct logic programs starting from the
most general clauses and progressively specialise them.
In neural-symbolic terms, the most popular idea is to take advantage of neural
networks' learning and inference capabilities to fine-tune and test the quality of
rules. This can be done by replacing logical operations with differentiable
operations. For example, in Neural Logic Programming (NLP) [57], rule learning is
based on the differentiable inference of TensorLog [6]. Here, matrix computations are
used to soften the logic operators: the confidences of conjunctions and disjunctions
are computed as products and sums, respectively. NLP generates rules from facts,
starting with the most general ones. In Differentiable Inductive Logic Programming
(∂ILP) [14], rules are generated from templates and assigned parameters (weights) so
that the loss between the actual conclusions and the conclusions predicted by forward
chaining becomes differentiable. In [39], the Neural Theorem Prover (NTP) is proposed,
which extends backward chaining into a differentiable procedure; it shows that latent
predicates from rule templates can be learned by optimising their distributed
representations. Differently from [57, 14, 39], where clauses are generated and then
softened by neural networks, Neural Logic Machines (NLM) [13] learn the relations
among predicates with a neural network whose input tensors represent facts (predicates
of different arities) from a knowledge base and whose output tensors represent new
facts.
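To make the product/sum softening concrete, the sketch below shows one differentiable
inference step in the spirit of TensorLog and NLP: relations are adjacency matrices
over entities, a chain rule is evaluated as a matrix product, and alternative rules
are combined as a weighted sum whose rule confidences could be trained by gradient
descent. The entities, relations and rules are made up for illustration.

```python
import numpy as np

entities = ["alice", "bob", "carol"]           # toy domain, indices 0, 1, 2
def relation(pairs, n=3):
    """0/1 adjacency matrix of a binary relation over the entities."""
    M = np.zeros((n, n))
    for i, j in pairs:
        M[i, j] = 1.0
    return M

parent = relation([(0, 1), (1, 2)])            # parent(alice,bob), parent(bob,carol)
friend = relation([(0, 2)])                    # a second, irrelevant relation

# Two candidate rules for grandparent(x, z):
#   r1: grandparent(x, z) <- parent(x, y) & parent(y, z)
#   r2: grandparent(x, z) <- parent(x, y) & friend(y, z)
# Soft conjunction along a chain = matrix product (sum over the shared variable y);
# soft disjunction over alternative rules = weighted sum with rule confidences w.
w = np.array([0.9, 0.1])                       # rule confidences (would be learned)
grandparent = w[0] * (parent @ parent) + w[1] * (parent @ friend)

print(grandparent[0, 2])                       # score of grandparent(alice, carol): 0.9
```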
5 Neural-symbolic Reasoning
Reasoning is an important feature of a neural-symbolic system and has recently
attracted much attention from the research community [14]. Various attempts have been
made to perform reasoning within neural networks, using both model-based and
theorem-proving approaches. In neural-symbolic integration the main focus is the
integration of reasoning and learning, so a model-based approach is preferred. Most
theorem-proving systems based on neural networks, including first-order logic
reasoning systems such as SHRUTI [56], have been unable to perform learning as
effectively as end-to-end differentiable learning systems. Model-based approaches, on
the other hand, have been shown to be implementable in neural networks for
nonmonotonic, intuitionistic and propositional modal logics, as well as for abductive
reasoning and other forms of human reasoning [2, 5]. As a result, the focus of
neural-symbolic computation has shifted from performing symbolic reasoning in neural
networks, such as implementing the logical unification algorithm in a neural network,
towards the combination of learning and reasoning, in some cases through an approach
that is more loosely defined than full integration, whereby a
hybrid system will contain different components which may be neural or symbolic
and which communicate with each other.
13
Garcez et al.
cient with feedforward propagation. This has allowed LTNs to be applied successfully
to the Pascal data set and to image understanding [12]. Penalty logic shows an
equivalence between minimising violation and minimising the energy functions of
symmetric connectionist networks [35]. Confidence rules, another approximation
approach, show the relation between sampling in restricted Boltzmann machines and the
search for truth assignments that maximise satisfiability. The use of confidence rules
also allows one to measure how confident a neural network is in its own answers. Based
on this, the neural-symbolic system Confidence Rule Inductive Logic Programming
(CRILP) was constructed and applied to inductive logic programming [52].
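The penalty-logic correspondence can be illustrated with the weighted knowledge base φ
of Figure 3: the energy of a truth assignment is the summed weight of the clauses it
violates, and minimising this energy over assignments amounts to finding a maximally
satisfying assignment. The sketch below enumerates assignments for readability; the
penalty values are illustrative, and a symmetric connectionist network would minimise
the corresponding energy function instead of enumerating.

```python
from itertools import product

# Illustrative penalties for the three weighted clauses of phi (Fig. 3):
#   w1: A <- B & C     w2: B <- C & ~D & E     w3: D <- E
W1, W2, W3 = 2.0, 1.0, 1.0

def energy(a):
    """Summed weight of the violated clauses for a truth assignment a over A..E."""
    e = 0.0
    if a["B"] and a["C"] and not a["A"]:                e += W1
    if a["C"] and not a["D"] and a["E"] and not a["B"]: e += W2
    if a["E"] and not a["D"]:                           e += W3
    return e

assignments = [dict(zip("ABCDE", bits)) for bits in product([0, 1], repeat=5)]
best = min(assignments, key=energy)
print(best, energy(best))   # a zero-energy assignment, i.e. one satisfying all clauses
```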
6 Neural-symbolic Explainability
The (re)emergence of deep networks has again raised the question of explainability.
The complex structure of deep neural networks turns them into powerful learning
systems, provided one can correctly engineer their components, such as the types of
hidden units, regularisation and optimisation methods. However, the limitations of
some AI applications have heightened the need for explainability and interpretability
of deep neural networks. More importantly, besides improving deep neural networks for
better applications, one should also investigate the benefits that deep networks can
offer in terms of knowledge acquisition.
combinatorial approaches do not scale well to deal with the dimensionality of current
networks. As a result, knowledge extraction gradually received less attention, until
recently, when the combination of global and local approaches started to be
investigated. The idea here is either to create modular networks with rule extraction
applied to specific modules, or to consider rule extraction from specific layers only.
In [50, 51], it has been shown that while extracting conjunctive clauses from the
first layer of a deep belief network is fast and effective, extraction in higher
layers results in a loss of accuracy. Alternatively, a trained deep network can be
employed to extract soft logic rules, which are less formal but more flexible [23].
Extraction of temporal rules has been studied in [34], generating semantic relations
among domain variables over time. Besides formal logical knowledge, hierarchical
Boolean expressions can be learned from images for object detection and recognition
[44].
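A sketch of the kind of layer-wise extraction discussed above, loosely in the spirit
of confidence rules [50, 51], is given below: for each unit of the first hidden layer,
the visible units whose weights exceed a threshold are read off as a conjunctive rule,
with a confidence derived from the weight magnitudes. The weight matrix and the
confidence measure are illustrative assumptions, not the exact extraction algorithm of
[51].

```python
import numpy as np

visible = ["x1", "x2", "x3", "x4"]             # visible units (input features)
# First-layer weight matrix of a trained network/RBM (made-up values here).
W = np.array([[ 3.1,  2.8, -0.2,  0.1],        # hidden unit h1
              [-0.1,  0.3,  2.9, -3.2]])       # hidden unit h2
threshold = 1.0                                 # keep only strongly weighted connections

for j, row in enumerate(W, start=1):
    body = []
    for name, weight in zip(visible, row):
        if weight > threshold:
            body.append(name)                   # strongly excitatory -> positive literal
        elif weight < -threshold:
            body.append(f"not {name}")          # strongly inhibitory -> negated literal
    confidence = float(np.abs(row)[np.abs(row) > threshold].min())
    print(f"{confidence:.1f} : h{j} <- " + " AND ".join(body))
# Prints, for these illustrative weights:
#   2.8 : h1 <- x1 AND x2
#   2.9 : h2 <- x3 AND not x4
```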
7 Conclusions
In this paper, we highlighted the key ideas and principles of neural-symbolic
computing. In order to do so, we illustrated the main methodological approaches which
allow for the integration of effective neural learning with sound symbolic knowledge
representation and reasoning methods. One of the principles we highlighted in the
paper is the sound mapping between symbolic rules and neural networks provided by
Table 1: Summary of neural-symbolic systems: KBANN [49], CILP [7], CILP++ [15],
Penalty Logic [35], Confidence rules [51], CRILP [52], [23], [4], [41], NTN [47],
[48], [29], [58], LTN [42, 12], CTLK [8, 9], SCTL [5], NSCA [34], TPRN [32], TensorLog
[6], NLP [57], ∂ILP [14], NTP [39], NLM [13], DeepProbLog [28] and NSPS [33]. The
systems are characterised along knowledge representation (rule-based and formula-based
propositional logic; first-order logic via propositionalisation or tensorisation;
temporal logic), reasoning (horizontal and vertical hybrid, SAT, forward and backward
chaining, approximate reasoning) and application (relationship reasoning, knowledge
extraction, symbolic generation, program synthesis).
References
[1] R. Andrews, J. Diederich, and A. Tickle. Survey and critique of techniques for ex-
tracting rules from trained artificial neural networks. Know.-Based Syst., 8(6):373–389,
December 1995.
[2] S. Bader and P. Hitzler. Dimensions of neural-symbolic integration: a structured survey.
In S. Artemov, H. Barringer, A. d’Avila Garcez, L. Lamb, and J. Woods, editors, We
Will Show Them! Essays in Honour of Dov Gabbay, 2005.
[3] A. Bordes, X. Glorot, J. Weston, and Y. Bengio. Joint learning of words and meaning
representations for open-text semantic parsing. In AISTATS, pages 127–135, 2012.
[4] A. Bordes, J. Weston, R. Collobert, and Y. Bengio. Learning structured embeddings
of knowledge bases. In AAAI’11, 2011.
[5] R. Borges, A. d’Avila Garcez, and L.C. Lamb. Learning and representing temporal
knowledge in recurrent networks. IEEE Trans. Neural Networks, 22(12):2409–2421,
2011.
[6] William W. Cohen, Fan Yang, and Kathryn Mazaitis. Tensorlog: Deep learning meets
probabilistic dbs. CoRR, abs/1707.05390, 2017.
[7] A. d’Avila Garcez and G. Zaverucha. The connectionist inductive learning and logic
programming system. Appl. Intelligence, 11(1):59–77, 1999.
[8] A.S. d’Avila Garcez and L.C. Lamb. Reasoning about time and knowledge in neural
symbolic learning systems. In NIPS, pages 921–928, 2003.
[9] A.S. d’Avila Garcez and L.C. Lamb. A connectionist computational model for epistemic
and temporal reasoning. Neur. Computation, 18(7):1711–1738, 2006.
[10] Luc De Raedt, Angelika Kimmig, and Hannu Toivonen. Problog: A probabilistic prolog
and its application in link discovery. In Proceedings of the 20th International Joint
Conference on Artifical Intelligence, IJCAI’07, pages 2468–2473, San Francisco, CA,
USA, 2007. Morgan Kaufmann Publishers Inc.
[11] Frank DiMaio and Jude Shavlik. Learning an approximation to inductive logic pro-
gramming clause evaluation. In Rui Camacho, Ross King, and Ashwin Srinivasan,
editors, Inductive Logic Programming, pages 80–97, Berlin, Heidelberg, 2004. Springer
Berlin Heidelberg.
[12] I. Donadello, L. Serafini, and A. S. d’Avila Garcez. Logic tensor networks for semantic
image interpretation. In IJCAI-17, pages 1596–1602, 2017.
[13] Honghua Dong, Jiayuan Mao, Tian Lin, Chong Wang, Lihong Li, and Denny Zhou.
Neural logic machines. In International Conference on Learning Representations, 2019.
[14] R. Evans and E. Grefenstette. Learning explanatory rules from noisy data. JAIR,
61:1–64, 2018.
[15] M. França, G. Zaverucha, and A. Garcez. Fast relational learning using bottom clause
propositionalization with artificial neural networks. Mach. Learning, 94(1):81–104,
2014.
[16] A. d’Avila Garcez, L.C. Lamb, and D.M. Gabbay. Neural-Symbolic Cognitive Reasoning.
18
Neural-symbolic computing
Springer, 2009.
[17] A.S. d’Avila Garcez, D. Gabbay, and K. Broda. Neural-Symbolic Learning Systems:
Foundations and Applications. Springer, 2002.
[18] Ross Girshick. Fast r-cnn. In Proceedings of the 2015 IEEE International Conference on
Computer Vision (ICCV), ICCV ’15, pages 1440–1448, Washington, DC, USA, 2015.
IEEE Computer Society.
[19] Marco Gori. Machine Learning: A Constraint-Based Approach. Morgan Kaufmann,
2018.
[20] Kalanit Grill-Spector and Rafael Malach. The human visual cortex. Annual Review of
Neuroscience, 27(1):649–677, 2004.
[21] B. Hammer and P. Hitzler, editors. Perspectives of Neural-Symbolic Integration.
Springer, 2007.
[22] Lisa Anne Hendricks, Zeynep Akata, Marcus Rohrbach, Jeff Donahue, Bernt Schiele,
and Trevor Darrell. Generating visual explanations. In Computer Vision - ECCV 2016
- 14th European Conference, pages 3–19, 2016.
[23] Z. Hu, X. Ma, Z. Liu, E. Hovy, and E. Xing. Harnessing deep neural networks with
logic rules. In ACL, 2016.
[24] Henrik Jacobsson. Rule extraction from recurrent neural networks: A taxonomy and
review. Neural Comput., 17(6):1223–1263, June 2005.
[25] R. Khardon and D. Roth. Learning to reason. J. ACM, 44(5), 1997.
[26] L.C. Lamb, R.V. Borges, and A.S. d’Avila Garcez. A connectionist cognitive model for
temporal synchronisation and learning. In AAAI, pages 827–832, 2007.
[27] Y. LeCun, Y. Bengio, and G. Hinton. Deep learning. Nature, 521(7553):436–444, 2015.
[28] Robin Manhaeve, Sebastijan Dumancic, Angelika Kimmig, Thomas Demeester, and
Luc De Raedt. Deepproblog: Neural probabilistic logic programming. In S. Bengio,
H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, editors,
Advances in Neural Information Processing Systems 31, pages 3749–3759. Curran As-
sociates, Inc., 2018.
[29] Jiayuan Mao, Chuang Gan, Pushmeet Kohli, Joshua B. Tenenbaum, and Jiajun Wu.
The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From
Natural Supervision. In International Conference on Learning Representations, 2019.
[30] Stephen Muggleton. Inverse entailment and progol. New Generation Computing,
13(3):245–286, Dec 1995.
[31] Stephen Muggleton. Stochastic logic programs. In New Generation Computing. Aca-
demic Press, 1996.
[32] H. Palangi, P. Smolensky, X. He, and L. Deng. Question-answering with grammatically-
interpretable representations. In AAAI, 2018.
[33] E. Parisotto, A.-R. Mohamed, R. Singh, L. Li, D. Zhou, and P. Kohli. Neuro-symbolic
program synthesis. In ICLR, 2017.
[34] L. de Penning, A. d’Avila Garcez, L.C. Lamb, and J-J. Meyer. A neural-symbolic
cognitive agent for online learning and reasoning. In IJCAI, pages 1653–1658, 2011.
[51] S. Tran and A. Garcez. Deep logic networks: Inserting and extracting knowledge from
deep belief networks. IEEE T. Neur. Net. Learning Syst., (29):246–258, 2018.
[52] Son N. Tran. Propositional knowledge representation and reasoning in restricted boltz-
mann machines. CoRR, abs/1705.10899, 2018.
[53] L. Valiant. Knowledge infusion. In AAAI, 2006.
[54] L.G. Valiant. Three problems in computer science. J. ACM, 50(1):96–99, 2003.
[55] Qinglong Wang, Kaixuan Zhang, Alexander G. Ororbia II, Xinyu Xing, Xue Liu, and
C. Lee Giles. An empirical evaluation of rule extraction from recurrent neural networks.
Neural Computation, 30(9):2568–2591, 2018.
[56] Carter Wendelken and Lokendra Shastri. Multiple instantiation and rule mediation in
SHRUTI. Connect. Sci., 16(3):211–217, 2004.
[57] Fan Yang, Zhilin Yang, and William W Cohen. Differentiable learning of logical rules
for knowledge base reasoning. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach,
R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information
Processing Systems 30, pages 2319–2328. Curran Associates, Inc., 2017.
[58] Kexin Yi, Jiajun Wu, Chuang Gan, Antonio Torralba, Pushmeet Kohli, and Josh Tenen-
baum. Neural-symbolic vqa: Disentangling reasoning from vision and language under-
standing. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi,
and R. Garnett, editors, Advances in Neural Information Processing Systems 31, pages
1031–1042. Curran Associates, Inc., 2018.