Cognitive-Computing Unit 1
UNIT - I
Introduction to Cognitive Science: Understanding Cognition, IBM’s Watson, Design for Human
Cognition, Augmented Intelligence, Cognition Modeling Paradigms: Declarative/logic-based
computational cognitive modeling, connectionist models of cognition, Bayesian models of
cognition, a dynamical systems approach to cognition.
Cognitive computing is a technology approach that enables humans to collaborate with machines. If
you look at cognitive computing as an analog to the human brain, you need to analyze in context all
types of data, from structured data in databases to unstructured data in text, images, voice, sensors,
and video. These are machines that operate at a different level than traditional IT systems because
they analyze and learn from this data. A cognitive system has three fundamental principles as
described below:
■ Learn—A cognitive system learns. The system leverages data to make inferences about a domain, a
topic, a person, or an issue based on training and observations from all varieties, volumes, and
velocities of data.
■ Model—To learn, the system needs to create a model or representation of a domain (which includes
internal and potentially external data) and assumptions that dictate what learning algorithms are used.
Understanding the context of how the data fits into the model is key to a cognitive system.
■ Generate hypotheses—A cognitive system assumes that there is not a single correct answer. The
most appropriate answer is based on the data itself. Therefore, a cognitive system is probabilistic. A
hypothesis is a candidate explanation for some of the data already understood. A cognitive system
uses the data to train, test, or score a hypothesis.
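The sketch below is a toy illustration in Python (not any particular product's interface) of the hypothesis-generation principle: several candidate explanations are scored against the available evidence and returned ranked, each with its supporting evidence, rather than as a single "correct" answer. The hypotheses, evidence items, and weights are invented for the example.

```python
# Toy illustration of probabilistic hypothesis scoring; the hypotheses,
# evidence items, and weights are invented for the example.
evidence = {"fever", "cough", "recent travel"}

hypotheses = {
    "influenza": {"fever": 0.8, "cough": 0.7, "recent travel": 0.2},
    "common cold": {"fever": 0.3, "cough": 0.8, "recent travel": 0.1},
    "allergy": {"fever": 0.05, "cough": 0.5, "recent travel": 0.0},
}

def score(hypothesis_weights, evidence):
    """Score a candidate explanation by how strongly it accounts for the observed evidence."""
    return sum(hypothesis_weights.get(e, 0.0) for e in evidence) / len(evidence)

ranked = sorted(
    ((h, score(w, evidence), [e for e in evidence if w.get(e, 0) > 0.5])
     for h, w in hypotheses.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, confidence, support in ranked:
    print(f"{name}: confidence {confidence:.2f}, supported by {support}")
```

The point of the sketch is the shape of the output: a ranked list of alternatives with attached evidence, which mirrors the probabilistic, multi-answer behavior described above.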
Cognitive computing is an evolution of technology that attempts to make sense of a complex world
that is drowning in data in all forms and shapes. You are entering a new era in computing that will
transform the way humans collaborate with machines to gain actionable insights. It is clear that
technological innovations have transformed industries and the way individuals conduct their daily
lives for decades. In the 1950s, transactional and operational processing applications introduced huge
efficiencies into business and government operations. Organizations standardized business processes
and managed business data more efficiently and accurately than with manual methods. However, as
the volume and diversity of data have increased exponentially, many organizations cannot turn that data
into actionable knowledge. The amount of new information an individual needs to understand or
analyze to make good decisions is overwhelming. The next generation of solutions combines some
traditional technology techniques with innovations so that organizations can solve vexing problems.
Cognitive computing is in its early stages of maturation. Over time, the techniques that are discussed
in this book will be infused into most systems. The focus of this book is this new approach to
computing, which can create systems that augment human problem‐solving capabilities.
The Uses of Cognitive Systems
Cognitive systems are still in the early days of evolution. Over the coming decade you will see cognitive
capabilities built into many different applications and systems. New uses will emerge that focus either
on horizontal issues (such as security) or industry‐specific problems (such as
determining the best way to anticipate retail customer requirements and increase sales, or to diagnose
an illness). Today, the initial use cases include some new frontiers and some problems that have
confounded industries for decades. For example, systems are being developed that can enable a city
manager to anticipate when traffic will be disrupted by weather events and reroute that traffic to avoid
problems. In the healthcare industry, cognitive systems are under development that can be used in
collaboration with a hospital’s electronic medical records to test for omissions and improve accuracy.
The cognitive system can help to teach new physicians medical best practices and improve clinical
decision making. Cognitive systems can help with the transfer of knowledge and best practices in other
industries as well. In these use cases, a cognitive system is designed to build a dialog between human
and machine so that best practices are learned by the system as opposed to being programmed as a set
of rules. The list of potential uses of a cognitive computing approach will continue to grow over time.
The initial frontier in cognitive computing development has been in the area of healthcare because it is
rich in text‐based data sources. In addition, successful patient outcomes are often dependent on care
providers having a complete, accurate, up‐to‐date understanding of patient problems. If medical
cognitive applications can be developed that enable physicians and caregivers to better understand
treatment options through continuous learning, the ability to treat patients could be dramatically
improved. Many other industries are testing and developing cognitive applications as well. For example,
bringing together unstructured and semi-structured data that can be used within metropolitan areas can
greatly increase our understanding of how to improve the delivery of services to citizens. “Smarter city”
applications enable managers to plan the next best action to control pollution, improve the traffic flow,
and help fight crime. Even traditional customer care and help desk applications can be dramatically
improved if systems can learn and help provide fast resolution of customer problems.
What Makes a System Cognitive?
Three important concepts help make a system cognitive: contextual insight from the
model, hypothesis generation (a proposed explanation of a phenomenon), and continuous learning from
data across time. In practice, cognitive computing enables the examination of a wide variety of diverse
types of data and the interpretation of that data to provide insights and recommend actions. The essence
of cognitive computing is the acquisition and analysis of the right amount of information in context
with the problem being addressed. A cognitive system must be aware of the context that supports the
data to deliver value. When that data is acquired, curated, and analyzed, the cognitive system must
identify and remember patterns and associations in the data. This iterative process enables the system
to learn and deepen its scope so that understanding of the data improves over time. One of the most
important practical characteristics of a cognitive system is the capability to provide the knowledge
seeker with a series of alternative answers along with an explanation of the rationale or evidence
supporting each answer. A cognitive computing system consists of tools and techniques, including Big
Data and analytics, machine learning, Internet of Things (IoT), Natural Language Processing (NLP),
causal induction, probabilistic reasoning, and data visualization. Cognitive systems have the capability
to learn, remember, provoke, analyze, and resolve in a manner that is contextually relevant to the
organization or to the individual user. The solutions to highly complex problems require the assimilation
of all sorts of data and knowledge that is available from a variety of structured, semi‐structured, and
unstructured sources including, but not limited to, journal articles, industry data, images, sensor data,
and structured data from operational and transactional databases. How does a cognitive system leverage
this data? As you see later in this chapter, these cognitive systems employ sophisticated continuous
learning techniques to understand and organize information.
Understanding Cognition
Understanding how the human brain works and processes information
provides a blueprint for the approach to cognitive computing. However, it is not necessary to build a
system that replicates all the capabilities of the human brain to serve as a good collaborator for humans.
By understanding cognition we can build systems that have many of the characteristics required to
continuously learn and adapt to new information. The word cognition derives from the Latin
cognoscere, meaning to know and to learn, and its use in English dates back to the 15th century. Greek
philosophers were keenly interested in the field of deductive reasoning.
■ Computer science—The scientific and practical approach to computation and its applications; it
provides the systematic techniques for translating computational theory into practice. The main branches of cognitive science
are psychology (primarily an applied science, in helping diagnose and treat mental/behavioral
conditions) and neurology (also primarily applied, in diagnosis/treatment of neurological conditions).
Over the years, however, it became clear that there was a critical relationship between the way the
human brain works and computer engineering. For example, cognitive scientists, in studying the human
mind, have come to understand that human cognition is an interlinking system of systems that allows
for information to be received from outside inputs, which is then stored, retrieved, transformed, and
transmitted. Likewise, the maturation of the computer field has accelerated the field of cognitive
sciences. Increasingly, there is less separation between these two disciplines.
There are individual variations in cognition, depending on differences in genetic variations. (A deaf person reacts differently to sound
than a person who hears well.) However, these variations are the exception, not the rule.
To make sense of how different processes in the brain relate to each other and impact each other,
cognitive scientists model cognitive structures and processes. There isn’t a single cognitive architecture;
rather, there are many different approaches, depending on the interaction model. For example, there
may be an architecture that is related to human senses such as seeing, understanding speech, and
reacting to tastes, smells, and touch. A cognitive architecture is also directly tied to how the neurons in
the brain carry out specific tasks, absorb new inputs dynamically, and understand context. All this is
possible even if there is sparse data because the brain can fill in the implied information. The human
brain is architected to deal with the mental processes of perception, memory, judgment, and learning.
Humans can think fast and draw conclusions based on their ability to reason or make inferences from
the pieces of information they are given. Humans have the ability to make speculative conjectures,
construct imaginative scenarios, use intuition, and engage in other cognitive processes that go beyond
mere reasoning, inference, and information processing. The fact that humans can come up with
a supposition based on sparse data points to the brilliance of human cognition. However, there can be
negative consequences of this inference. The human may have a bias that leads to conclusions that are
erroneous. For example, the human may look at one research study that states that there are some
medical benefits to chocolate and conclude that eating a lot of candy will be a good thing. In contrast,
a cognitive architecture will not make the mistake of assuming that one study or one conclusion is
overwhelmingly relevant unless there is actual evidence to support that conclusion. Unlike humans, machines
do not have bias unless that bias is programmed into the system. Traditional architectures rely on
humans to interpret processes into code. AI assumes that computers can replace the thinking process of
humans. With cognitive computing, the human leverages the unique ability of computers to process,
manage, and associate information to expand what is possible.
One of the best ways to understand the potential for cognitive computing is to take a look at one of the
early implementations of a cognitive system. IBM developed Watson as one of its new foundational
offerings intended to help customers build a different type of system based on the ingestion of new
content. IBM’s design focus for Watson was to create solutions that aggregate data and leverage
techniques ranging from machine learning to Natural Language Processing (NLP) and advanced
analytics. Watson solutions include a set of foundational services combined with industry‐focused best
practices and data. The accuracy of results from a cognitive system continuously improves through an
iterative training process that combines the knowledge of subject matter experts with a corpus of domain
specific data. One of the important capabilities that allows for this machine/human interaction is the
ability to leverage NLP to understand context across a combination of unstructured and
structured data sources. In addition, a cognitive system is not constrained to applications that are
deterministic in nature, but can manage probabilistic systems that change and evolve as they are used.
Watson Defined
Watson is a cognitive system that combines capabilities in NLP, analytics, and machine learning
techniques. Watson gains insights and gets smarter with each user interaction and each time that new
information is ingested. By combining NLP, dynamic learning, and hypothesis generation and
evaluation, Watson is intended to help professionals create hypotheses from data, accelerate findings,
and determine the availability of supporting evidence to solve problems. IBM views Watson as a way
to improve business outcomes by enabling humans to interact with machines in a natural way.
Individuals have become accustomed to leveraging sophisticated search engines or database query
systems to discover information to support decision making. Watson, which also facilitates data‐driven
search, takes a different approach that is discussed in detail in this chapter. In essence, Watson leverages
machine learning, DeepQA, and advanced analytics. IBM Watson’s DeepQA architecture, as illustrated
in Figure 1, is described in this chapter.
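Figure 1 itself is not reproduced in these notes. As a rough, hedged outline only, the skeleton below reflects the way DeepQA is commonly summarized: a question is analyzed, many candidate answers (hypotheses) are generated, evidence for each is gathered and scored, and the candidates are merged and ranked by confidence. The function names, types, and outputs are placeholders for illustration, not IBM's actual interfaces.

```python
# Illustrative skeleton of a DeepQA-style question-answering pipeline.
# The stage names follow common summaries of DeepQA; the functions and their
# outputs are placeholders, not IBM's actual interfaces.

def analyze_question(question):
    """Parse the question and detect what kind of answer it asks for."""
    return {"text": question, "answer_type": "person"}   # toy analysis

def generate_candidates(analysis):
    """Search the corpus and propose many candidate answers (hypotheses)."""
    return ["Candidate A", "Candidate B", "Candidate C"]  # toy candidates

def score_evidence(analysis, candidate):
    """Gather passages that support or refute a candidate and score the evidence."""
    return {"Candidate A": 0.2, "Candidate B": 0.7, "Candidate C": 0.4}[candidate]

def answer(question):
    analysis = analyze_question(question)
    scored = [(c, score_evidence(analysis, c)) for c in generate_candidates(analysis)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)  # merge and rank by confidence

print(answer("Who wrote the first systematic treatise on deductive reasoning?"))
```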
Design for Human Cognition
Human perception moves from a stimulus (an external event), through sensation (the signal the senses
send to the brain), to perception (the brain’s interpretation of that signal).
For example, when people see a red light while driving, their eyes spot the red light (stimulus), their
visual system processes it and sends signals to their brain (sensation), and then their brain figures out
that the red light means they need to stop (perception). By understanding this process, user experience
designers can create interfaces that fit how people naturally see and understand things, making them
easier to use.
Perception also involves both bottom-up and top-down processing. Bottom-Up Processing occurs
when stimuli influence perception without preconceived ideas, driven purely by data and sensory input.
For example, when people touch a hot stove, their sensory receptors send an urgent “SOS” to their
brain, leading them to perceive pain and yank their hand away faster than a cat avoiding a bath.
Top-Down Processing happens when previous knowledge and expectations influence perception,
driven by concepts and experiences. For example, if you see a partially obscured sign but can still read
it because you recognize the context, that’s top-down processing at work.
Great design strikes a balance! It should be easy for beginners to figure out (bottom-up), but also work
well for people who already know the system (top-down). By considering both these things, we can
create interfaces that are effective and enjoyable for everyone.
Designing for Behavior Change
Fogg Behavior Model
To further improve user experience, it’s essential to understand and influence user behavior effectively.
The Fogg Behavior Model helps us grasp how behavior occurs at the intersection of three elements:
• Motivation (Low or High): Provides a reason for someone to engage in the task.
• Ability (Hard or Easy to do): Gives people the opportunity to complete the task.
• Triggers: Occur in our environment or brain and prompt a person to act. Examples:
Onboarding tips, CTA buttons, and notifications.
Behavior occurs when users have a sufficient blend of motivation and ability and are prompted by a
trigger. So, if motivation is low, increasing ability (making the task easier) can compensate. Here are
some strategies for applying the Fogg Behavior Model in our design initiatives (a small sketch of the
model follows these strategies):
Simplifying Interfaces
• Reduce the complexity of tasks by breaking them down into simpler steps.
• Ensure that users can easily understand and complete actions with minimal effort.
Increasing Motivation
• Use persuasive design techniques to improve user engagement.
• Highlight benefits and value to maintain high motivation levels.
Effective Triggers
• Design clear and timely notifications that prompt users at the right moment.
• Utilize easy-to-find and understandable CTA buttons.
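The Fogg model is often summarized as behavior happening when motivation and ability together cross an action threshold at the moment a trigger occurs. The Python sketch below is a minimal, illustrative encoding of that idea; the 0-to-1 scales and the threshold value are assumptions chosen for demonstration, not part of Fogg's published model.

```python
from dataclasses import dataclass

@dataclass
class Interaction:
    motivation: float  # 0.0 (low) to 1.0 (high)
    ability: float     # 0.0 (hard to do) to 1.0 (easy to do)
    trigger: bool      # was the user prompted (notification, CTA, onboarding tip)?

ACTION_THRESHOLD = 0.25  # assumed value for illustration only

def behavior_occurs(i: Interaction) -> bool:
    """Fogg-style check: a trigger only converts when motivation x ability
    is above the action threshold."""
    return i.trigger and (i.motivation * i.ability) > ACTION_THRESHOLD

# Low motivation can be offset by making the task very easy (high ability).
print(behavior_occurs(Interaction(motivation=0.3, ability=0.9, trigger=True)))   # True
print(behavior_occurs(Interaction(motivation=0.3, ability=0.9, trigger=False)))  # False: no prompt
print(behavior_occurs(Interaction(motivation=0.2, ability=0.4, trigger=True)))   # False: below threshold
```

The three strategy groups above map directly onto the three inputs: simplifying interfaces raises ability, persuasive design raises motivation, and well-timed notifications and CTAs supply the trigger.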
While the Fogg Behavior Model helps us design interfaces that encourage desired actions through
transparency and simplicity, the “Black Box” problem highlights what happens when these principles
are not applied.
AUGMENTED INTELLIGENCE
Augmented intelligence (AI), also known as intelligence augmentation (IA) or cognitive augmentation,
is the next level in artificial intelligence. The word “augmented” means "to improve." AI software will
simply improve products and services, not replace the humans that use them. Augmented intelligence is
also abbreviated AI; to avoid the confusion of using the same abbreviation for two meanings, let us call
it IA: intelligence augmentation. IA refers to the creation of a close-to-human autonomous intelligence
using modern technology. It describes how normal human intelligence is supplemented through the use
of technology. One may say that:
Augmented Intelligence = Human + Computer
One may think of IA as analogous to augmented reality, a technology that combines real-world environments with
computer generated information such as images, text, videos, animations, and sound. It has the ability
to record and analyze the environment in real time. It is becoming more attractive as a mainstream
technology mainly due to the proliferation of modern mobile computing devices
like smartphones and tablet computers with location-based services. Like augmented reality, augmented
intelligence adds layers of information on top of human intelligence, helping humans to be at their best.
The pioneers of augmentation are the sectors that generate a lot of data, such as the law, healthcare, and
agriculture. The main objective of augmented intelligence is to create an entirely new process that is
largely automated and designed to leave roughly 20% of cases as manual exceptions. Augmented
intelligence follows a five-function cadence that allows it to learn with human influence.
Some companies are already focusing on developing smart data analytics solutions to obtain valuable
insights from big data. Augmented analytics can be used to extract insights from big data. It automates
data insights and provides clearer information, which is not possible with traditional analysis tools.
What is Logic-Based Computational Cognitive Modeling — In a Word?
A particular approach to modeling the mind: declarative computational cognitive modeling. (In light of
the fact that if an agent knows p, p must be a proposition or declarative statement, sometimes the term
‘knowledge-based’ is used in place of ‘declarative.’ Some writers even use the dangerously equivocal
term ‘symbolic.’) Naturally enough, the basic units of such modeling are declarative in nature, or
propositional: they are formal objects naturally associated with those particular sentences or expressions
in natural languages (like English, German, Chinese) that are declarative statements (as opposed to
expressions in the imperative or inquisitive mode) naturally taking values such as true, false, unknown,
probable (sometimes to particular numerical degrees), and so on. The basic process over such units is
inference, which may be deductive, inductive, probabilistic, abductive, or analogical. Because the basic
units of declarative computational cognitive modeling are declarative, a hallmark of declarative
computational cognitive modeling is a top-down, rather than bottom-up, approach.
As Brachman & Levesque (2004) put it, when speaking of declarative computational cognitive
modeling within the field of artificial intelligence:
It is at the very core of a radical idea about how to understand intelligence: instead of trying to
understand or build brains from the bottom up, we try to understand or build intelligent behavior from
the top down. In particular, we ask what an agent would need to know in order to behave intelligently,
and what computational mechanisms could allow this knowledge to be made available to the agent as
required. (Brachman & Levesque 2004, p. iv)
The top-down approach is unavoidable, because, as reflected in relevant formalisms commonly
associated with bottom-up approaches (e.g., artificial neural networks), the basic units in bottom-up
processing are numerical, not declarative. The systematization of declarative computational cognitive
modeling, which is the overarching purpose of the present chapter, is achieved by using formal logic,
and hence declarative computational cognitive modeling, from the formal perspective, becomes logic-
based computational cognitive modeling, sometimes abbreviated below to ease exposition as ‘LCCM.’
Correspondingly, to decrease verbosity and repetition of the phrase, ‘computational cognitive modeling’
will sometimes be abbreviated below as ‘CCM.’ Logic-based computational cognitive modeling is an
interdisciplinary field that cuts across: cognitive modeling based on cognitive architectures (such as
ACT-R, Soar, Clarion, Polyscheme, etc.), logic itself, and computational psychology of reasoning. In
addition, LCCM has a sister in logic-based human-level artificial intelligence (AI), and, being
computational in nature, it inevitably draws heavily from computer science, which is itself, as has been
explained (e.g., in Halpern, Harper, Immerman, Kolaitis, Vardi & Vianu 2001), based on formal logic.
Specifically, and unsurprisingly, the declarative programming paradigm is naturally associated with
declarative computational cognitive modeling. This paradigm, specifically as it applies to LCCM, will
be explained later.
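The declarative programming paradigm mentioned above can be illustrated in miniature: knowledge is stated as propositions and rules, and a generic inference procedure derives new propositions. The Python sketch below is a hedged illustration of simple forward-chaining deduction over declarative units; the facts and rules are invented for the example and are not drawn from ACT-R, Soar, Clarion, Polyscheme, or any other architecture.

```python
# Minimal forward-chaining deduction over declarative (propositional) units.
# Facts and rules are illustrative inventions, not taken from any cognitive architecture.

facts = {"socrates_is_human"}

# Each rule: (set of antecedent propositions, consequent proposition)
rules = [
    ({"socrates_is_human"}, "socrates_is_mortal"),
    ({"socrates_is_mortal"}, "socrates_will_die"),
]

def forward_chain(facts, rules):
    """Repeatedly fire any rule whose antecedents are all known,
    adding its consequent, until no new propositions can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in rules:
            if antecedents <= derived and consequent not in derived:
                derived.add(consequent)
                changed = True
    return derived

print(forward_chain(facts, rules))
# {'socrates_is_human', 'socrates_is_mortal', 'socrates_will_die'}
```

Note the top-down character: the modeler states what the agent knows (the propositions and rules), and the inference engine supplies the processing, which is exactly the contrast with bottom-up, numerically driven approaches discussed above.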
Connectionist models of cognition
Over the last twenty years, connectionist modeling has formed an influential approach to the
computational study of cognition. It is distinguished by its appeal to principles of neural computation
to inspire the primitives that are included in its cognitive level models. Also known as artificial neural
network (ANN) or parallel distributed processing (PDP) models, connectionism has been applied to a
diverse range of cognitive abilities, including models of memory, attention, perception, action,
language, concept formation, and reasoning (see, e.g., Houghton, 2005). While many of these models
seek to capture adult function, connectionism places an emphasis on learning internal representations.
This has led to an increasing focus on developmental phenomena and the origins of knowledge.
Although, at its heart, connectionism comprises a set of computational formalisms, it has spurred
vigorous theoretical debate regarding the nature of cognition. Some theorists have reacted by dismissing
connectionism as mere implementation of pre-existing verbal theories of cognition, while others have
viewed it as a candidate to replace the Classical Computational Theory of Mind and as carrying
profound implications for the way human knowledge is acquired and represented; still others have
viewed connectionism as a sub-class of statistical models involved in universal function approximation
and data clustering.
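As a concrete, minimal illustration of the connectionist idea that knowledge ends up in learned connection weights rather than in explicit rules, the sketch below trains a single artificial neuron (a sigmoid unit updated with the delta rule) on the logical OR pattern. The task, learning rate, and epoch count are arbitrary choices made only for demonstration.

```python
import math, random

# A single sigmoid unit learning the OR pattern: the "knowledge" ends up
# distributed across the connection weights rather than stated as a rule.
random.seed(0)
weights = [random.uniform(-0.5, 0.5) for _ in range(2)]
bias = 0.0
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

def output(x):
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))           # sigmoid activation

learning_rate = 1.0
for epoch in range(2000):
    for x, target in data:
        y = output(x)
        delta = (target - y) * y * (1 - y)      # delta rule: nudge output toward the target
        for i in range(2):
            weights[i] += learning_rate * delta * x[i]
        bias += learning_rate * delta

for x, target in data:
    print(x, "->", round(output(x), 2), "(target", target, ")")
```

After training, the unit's outputs approximate the targets, yet nothing in the final weights reads as a symbolic rule; this is the representational contrast with the declarative approach of the previous section.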
Bayesian models of cognition
Bayesian models of cognition explain aspects of human behavior as a result of rational probabilistic
inference. In particular, these models make use of Bayes’ rule, which indicates how rational agents
should update their beliefs about hypotheses in light of data. Bayes’ rule provides an optimal solution
to inductive problems for which the observed data are insufficient to distinguish between hypotheses.
Since many of the things that human minds need to do involve inductive problems (from identifying
the structure of the world based on limited sensory data to inferring what other people think based on
their behavior), Bayesian models have broad applicability within cognitive science. Being able to
identify what a rational agent would do in these situations provides a way to explain why people might
act similarly, and is a tool for exploring the implicit assumptions underlying human behavior. In
particular, Bayesian models make it easy to explore the inductive biases that inform people’s inferences,
that is, the factors other than the data that guide people in selecting one hypothesis over another.
Bayes’ rule
Bayes’ rule indicates how agents should update their degrees of belief in hypotheses given observed
data. Assume that a hypothesis h from a set of hypotheses H is under consideration. The degree of belief
assigned to the hypothesis before observing any data is P(h), known as the prior probability. After
observing data d, the degree of belief assigned to the hypothesis P(h|d) is called the posterior probability
(the | symbol should be read as “given,” so this is the probability of h given, or taking into account, the
information contained in d). Bayes’ rule applies the definition of the conditional probability from
probability theory to give

P(h|d) = \frac{P(d|h)\, P(h)}{\sum_{h' \in H} P(d|h')\, P(h')}

where P(d|h) is the probability of observing d if h were true, known as the likelihood. The sum in the
denominator simply adds up the same quantity (the product of the prior probability and the likelihood)
over all of the hypotheses in H, making sure that the posterior probability P(h|d) sums to 1 over all
hypotheses. The numerator is thus the key to Bayes’ rule, indicating that how much we believe in a
hypothesis after seeing data should reflect the product of the prior probability of that hypothesis and the
probability of the data if that hypothesis were true.
Intuitively, Bayes’ rule says that our beliefs about hypotheses should be a function of two factors: how
plausible those hypotheses are (as reflected in the prior probability) and how well they fit the observed
data (as reflected in the likelihood). These two factors contribute equally, and do so multiplicatively—
if either one of them is very small, the other has to be very large to compensate. As a simple example,
imagine looking out the window during the summer and seeing gray clouds (the data d). You might
consider three hypotheses: that the day will be sunny, that it will rain, and that there is a nearby forest
fire. Sunny days might be more frequent than rainy days, which are more frequent than days where
there are forest fires, so the prior probability would place these hypotheses in this order. However, gray
clouds are less likely on sunny days than rainy days, and about equally likely when it is rainy or there
is a forest fire, so the likelihood favors rain or forest fire. The product of the prior and likelihood will
favor rain, as it is both plausible and fits the observed data.
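To make the gray-clouds example concrete, the Python sketch below runs Bayes' rule end to end. The specific prior and likelihood numbers are assumptions invented only to match the qualitative ordering described above (sunny days most common, gray clouds unlikely on sunny days, and roughly equally likely under rain or a forest fire).

```python
# Bayes' rule on the gray-clouds example; all probabilities are invented for illustration.
priors = {"sunny": 0.7, "rain": 0.25, "forest fire": 0.05}        # P(h)
likelihoods = {"sunny": 0.1, "rain": 0.8, "forest fire": 0.8}     # P(gray clouds | h)

def posterior(priors, likelihoods):
    """Return P(h | d) for every hypothesis h, given prior and likelihood tables."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())    # the denominator of Bayes' rule
    return {h: p / total for h, p in unnormalized.items()}

for h, p in posterior(priors, likelihoods).items():
    print(f"P({h} | gray clouds) = {p:.2f}")
# Rain comes out on top (~0.65 with these numbers): it is both reasonably
# plausible a priori and fits the observed gray clouds well.
```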