
Machine Learning – Multiple Choice Questions

Question Bank
1. What are the advantages of biological neural networks (BNNs) compared to conventional
Von Neumann computers?

(i) BNNs have the ability to learn from examples.


(ii) BNNs have a high degree of parallelism.
(iii) BNNs require a mathematical model of the problem.
(iv) BNNs can acquire knowledge by “trial and error”.
(v) BNNs use a sequential algorithm to solve problems.

A. (i), (ii), (iii), (iv) and (v).


B. (i), (ii) and (iii).
C. (i), (ii) and (iv).
D. (i), (iii) and (iv).
E. (i), (iv) and (v).

2. Which of the following statements is the best description of Hebb’s learning rule?
A. “If a particular input stimulus is always active when a neuron fires then its weight should
be increased.”
B. “If a stimulus acts repeatedly at the same time as a response then a connection will form
between the neurons involved. Later, the stimulus alone is sufficient to activate the
response.”
C. “The connection strengths of the neurons involved are modified to reduce the error between
the desired and actual outputs of the system.”
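For illustration only (not part of the original question), the idea in options A and B can be sketched as a minimal Hebbian update in Python; the learning rate eta and the activity values are illustrative assumptions:

    def hebb_update(w, x, y, eta=0.1):
        # Hebbian idea: strengthen the weight when the input x and the
        # neuron's output y are active at the same time.
        return w + eta * x * y

    # An active input (x = 1) paired with a firing neuron (y = 1) increases
    # the weight; an inactive input leaves it unchanged.
    print(hebb_update(0.5, 1, 1))  # 0.6
    print(hebb_update(0.5, 0, 1))  # 0.5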

3. Which of the following techniques can NOT be used for pre-processing the inputs to an
artificial neural network?
A. Normalization.
B. Winner-takes-all.
C. Fast Fourier Transform (FFT).
D. Principal component analysis (PCA).
E. Deleting outliers from the training set.

4. Which of the following neural networks uses supervised learning?


a. Self-organizing feature map (SOFM).
b. The Hopfield network.
c. Simple recurrent network (SRN).
d. All of the above answers.
e. None of the above answers.
5. Which of the following algorithms can be used to train a single-layer feedforward network?
A. Hard competitive learning.
B. Soft competitive learning.
C. A genetic algorithm.
D. All of the above answers.
E. None of the above answers.

6. Identify each of the following activation functions.


[Figure: five activation-function plots, labelled (i)–(v), each showing f(net) on the vertical axis against net on the horizontal axis.]
A. (i) Unipolar step [hardlim], (ii) Unipolar sigmoid [logsig], (iii) Linear [purelin],
(iv) Bipolar step [hardlims], (v) Bipolar sigmoid [tansig].
B. (i) Unipolar step [hardlim], (ii) Bipolar sigmoid [tansig], (iii) Linear [purelin],
(iv) Bipolar step [hardlims], (v) Unipolar sigmoid [logsig].
C. (i) Bipolar step [hardlims], (ii) Unipolar sigmoid [logsig], (iii) Linear [purelin],
(iv) Unipolar step [hardlim], (v) Bipolar sigmoid [tansig].
D. (i) Bipolar step [hardlims], (ii) Bipolar sigmoid [tansig], (iii) Linear [purelin],
(iv) Unipolar step [hardlim], (v) Unipolar sigmoid [logsig].
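For reference, the five activation functions named in the options (under their MATLAB-style aliases) can be sketched in Python/NumPy as follows; this is an illustrative definition added here, not part of the original question:

    import numpy as np

    def hardlim(net):    # unipolar step: output 0 or 1
        return np.where(net >= 0, 1.0, 0.0)

    def hardlims(net):   # bipolar step: output -1 or 1
        return np.where(net >= 0, 1.0, -1.0)

    def purelin(net):    # linear: output equals the net input
        return net

    def logsig(net):     # unipolar sigmoid, output in (0, 1)
        return 1.0 / (1.0 + np.exp(-net))

    def tansig(net):     # bipolar sigmoid, output in (-1, 1)
        return np.tanh(net)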

7. Is the following statement true or false? “The XOR problem can be solved by a multi-
layer perceptron, but a multi-layer perceptron with purelin activation functions cannot
learn to do this.”
A. TRUE.
B. FALSE
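To see why the statement is TRUE: a hand-wired two-layer perceptron with step activations solves XOR, whereas any stack of purelin (linear) layers collapses to a single linear unit and cannot. A minimal sketch; the weights and thresholds below are one possible choice, not taken from the question:

    def step(x):
        return 1 if x > 0 else 0

    def xor_mlp(x1, x2):
        # Hidden layer computes OR and NAND; the output layer ANDs them.
        h1 = step(x1 + x2 - 0.5)     # OR
        h2 = step(-x1 - x2 + 1.5)    # NAND
        return step(h1 + h2 - 1.5)   # AND

    print([xor_mlp(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 1, 1, 0]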
8. Is the following statement true or false? “For any feedforward network, we can always
create an equivalent feedforward network with separate layers.”
A. TRUE.
B. FALSE
9. Is the following statement true or false? “A multi-layer feed forward network with linear
activation functions is more powerful than a single-layer feed forward network with linear
activation functions.”
A. TRUE.
B. FALSE
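The reason the answer is FALSE: composing linear layers yields another linear map, so the multi-layer network computes nothing a single linear layer could not. A small NumPy check with arbitrary example matrices (the values are illustrative):

    import numpy as np

    W1 = np.array([[1.0, 2.0], [0.5, -1.0]])   # first linear layer
    W2 = np.array([[2.0, 1.0]])                # second linear layer
    x = np.array([3.0, -2.0])

    # Two linear layers are equivalent to the single layer W2 @ W1.
    print(W2 @ (W1 @ x))        # output of the two-layer network
    print((W2 @ W1) @ x)        # identical output from one layer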
10. What is the credit assignment problem in a multi-layer feedforward network?
A. The problem of adjusting the weights for the output units.
B. The problem of adapting the neighbours of the winning unit.
C. The problem of defining an error function for linearly inseparable problems.
D. The problem of avoiding local minima in the error function.
E. The problem of adjusting the weights for the hidden units.
11. Which of the following equations best describes the Generalized Delta Rule with
momentum?
A. Δwji(t + 1) = η δj xi
B. Δwji(t + 1) = α δj xi
C. Δwji(t + 1) = η δj xi + α Δwji(t)
D. Δwji(t + 1) = η δj xi + α δj xi(t)
E. Δwji(t + 1) = η (xi − wji) + α Δwji(t)
where Δwji(t) is the change to the weight from unit i to unit j at time t, η is the learning rate, α is the momentum coefficient, δj is the error term for unit j, and xi is the i-th input to unit j.
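A minimal sketch of the update in option C, with variable names mirroring the symbols above; the numeric values are illustrative:

    def weight_change(delta_j, x_i, prev_change, eta=0.1, alpha=0.9):
        # Generalized Delta Rule with momentum:
        # delta_w(t+1) = eta * delta_j * x_i + alpha * delta_w(t)
        return eta * delta_j * x_i + alpha * prev_change

    dw = weight_change(delta_j=0.2, x_i=1.5, prev_change=0.05)
    print(dw)  # 0.1 * 0.2 * 1.5 + 0.9 * 0.05 = 0.075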

12. One method for dealing with local minima is to use a committee of networks.
What does this mean?
A. A large number of different networks are trained and tested. The network with the lowest sum squared error on a separate validation set is chosen as the best network.
B. A large number of different networks are trained and tested. All of the networks are used to solve the real-world problem by taking the average output of all the networks.
C. A large number of different networks are trained and tested. The networks are then combined to make a network of networks, which is biologically more realistic and computationally more powerful than a single network.
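A minimal sketch of answer B, assuming each trained network is a callable that returns its output for an input x (the stand-in networks below are illustrative):

    def committee_output(networks, x):
        # Average the outputs of all trained networks in the committee.
        outputs = [net(x) for net in networks]
        return sum(outputs) / len(outputs)

    nets = [lambda x: 0.9, lambda x: 1.1, lambda x: 1.0]
    print(committee_output(nets, x=None))  # 1.0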

13. What is the most general type of decision region that can be formed by a feedforward network with NO hidden layers?
A. Convex decision regions – for example, the network can approximate any Boolean function.
B. Arbitrary decision regions – the network can approximate any function (the accuracy of the approximation depends on the number of hidden units).
C. Decision regions separated by a line, plane or hyperplane.
D. None of the above answers.

14. Which of the following statements is the best description of overfitting?
A. The network becomes “specialized” and learns the training set too well.
B. The network can predict the correct outputs for test examples which lie outside the range of the training examples.
C. The network does not contain enough adjustable parameters (e.g., hidden units) to find a good approximation to the unknown function which generated the training data.

15. Which of the following statements is the best description of extrapolation?
A. The network becomes “specialized” and learns the training set too well.
B. The network can predict the correct outputs for test examples which lie outside the range of the training examples.
C. The network does not contain enough adjustable parameters (e.g., hidden units) to find a good approximation to the unknown function which generated the training data.
16. Neural Networks
A. Nerve cells in the brain are called neurons
B. The output from the neuron is called dendrite
C. One kind of neurons is called synapses
D. Learning takes place in the synapses

17. Multilayer perceptron network
A. Is a neural network with several layers of nodes (or weights)
B. There are connections both between and within each layer
C. The number of units in each layer must be equal
D. Multiple layers of neurons allow for more complex decision boundaries than a single layer

18. Backpropagation
A. Is a learning algorithm for multilayer perceptron networks
B. The backward pass follows after the forward pass
C. Is based on a gradient descent technique to maximize the mean square difference between the desired and actual outputs
D. Is also applicable to self-organizing feature maps

19. Weight updates in backpropagation
A. Usually, the weights are initially set to 0
B. Are proportional to the difference between the desired and actual outputs
C. The weight change is also proportional to the input to the weight layer
D. The output layer weights are used for computing the error of the hidden layer

20. f(x) = 1 / (1 + e^(−x))
A. f(x) is called a sigmoid function
B. It is beneficial because it does not limit the output value
C. It is called an activation function and such a function is used on every multilayer perceptron output
D. The derivative of the function is (f(x) + 1) f(x)

Answers
Q1 C
Q2 A
Q3 B
Q4 C
Q5 D
Q6 A
Q7 A
Q8 A
Q9 B
Q10 E
Q11 C
Q12 B
Q13 C
Q14 A
Q15 B
Q16 A
Q17 A,D
Q18 A,B
Q19 B,C,D
Q20 A,D

In order to attain a greater cumulative future reward when V*(s1) > V*(s2), the state preferred by the agent's learned evaluation function is -----------------
A) state S1
B) state S2
C) both S1 and S2
D) None of the above
Answer: A

The agent acquires the optimal policy by learning V*, provided it also knows the


A) intermediate reward function
B) state transition function
C) both of the above
D) None of the above

Answer: C

Q-learning does not need to represent or learn a model; this makes the implementation of Q-learning
A) easy
B) moderate
C) difficult
D) none

Answer: A

Order the steps of the Q Learning algorithm


1. Observe the new state S'
2. Update the table entry for Q(S, a)
3. Receive the immediate reward
4. Select an action a
5. Execute action a

A) 4-5-3-1-2
B) 2-5-3-4-2
C) 4-1-2-3-5
D) 4-5-3-2-1
Answer: A
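The order in answer A corresponds to the loop below: select an action, execute it, receive the immediate reward, observe the new state, and update the table entry. A minimal tabular Q-learning sketch, assuming a hypothetical environment object with reset() and step(state, action) -> (reward, next_state, done):

    import random
    from collections import defaultdict

    def q_learning_episode(env, Q, actions, alpha=0.5, gamma=0.9, epsilon=0.1):
        s = env.reset()
        done = False
        while not done:
            # 4. select an action a (epsilon-greedy on the current Q table)
            if random.random() < epsilon:
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda act: Q[(s, act)])
            # 5. execute action a; 3. receive the immediate reward; 1. observe the new state S'
            r, s_next, done = env.step(s, a)
            # 2. update the table entry for Q(S, a)
            best_next = max(Q[(s_next, act)] for act in actions)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s_next
        return Q

    # Usage: Q = defaultdict(float); q_learning_episode(my_env, Q, actions=[0, 1, 2])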

A one-step error is used in the Q-learning algorithm


A) True
B) False

Answer: A

In model-free reinforcement learning, learning is from


A) Optimal value function V
B) Optimal Q function
C) None of the above
D) Both
Answer: B

Given Q(S,a) = 12, Q(S,b) = 100 and Q(S,c) = 67, which Q-function value does the greedy policy choose as the best?
A) Q(S,a)=12;
B) Q(S,b)=100,
C) Q(S,c)= 67;
D) none
Answer: B
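The greedy choice simply picks the action with the largest Q-value; checking with the numbers from the question:

    q_values = {"a": 12, "b": 100, "c": 67}
    best_action = max(q_values, key=q_values.get)
    print(best_action, q_values[best_action])  # b 100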

Considering reinforcement learning, which of the following statements are true (multiple choice)?
A) Maximizing the future cumulative reward allows reinforcement learning to make global decisions with local information
B) Q-learning is a temporal difference RL method that does not need a model of the task to learn the action value function
C) Reinforcement learning can only be applied to problems with a finite number of states
D) In Markov Decision Problems (MDP) the future actions from a state depend on the previous states

Answer: B

The Action Value Function (or “Q-function”) takes two inputs ___________ &
_______________

Answer: Action and State

The optimal policy of an agent is based on ---------------


A) actions
B) state
C) both of the above
D) None of the above

Answer: A

In the Q-learning algorithm, at each step the agent chooses the action a which --------------- the function Q(S,a)
A) Minimizes
B) Maximizes
C) Stabilizes
D) None of the above

ANSWER: B

Q learning is based on learning from


A) experience
B) model of the real world
C) experience and model
D) none

Answer: A

The agent received the value r + γV(s'), where V(s') = max_a' Q(s', a'); this is the actual current reward plus the discounted estimated future value. This new data point is called a ___________.
A) Return
B) Spatial
C) Global
D) Local

ANSWER: A
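A one-line computation of that data point; the discount factor gamma and the sample numbers are illustrative:

    def td_target(r, next_q_values, gamma=0.9):
        # Immediate reward plus the discounted estimate of the best future value.
        return r + gamma * max(next_q_values)

    print(td_target(r=1.0, next_q_values=[0.0, 2.0, 1.5]))  # 1.0 + 0.9 * 2.0 = 2.8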

In Q-learning, the agent was in state s, took action a, received reward r, and moved into state s'; this experience tuple can be written as ___________
A) ⟨s, r, a, s'⟩
B) ⟨s', a, r, s⟩
C) ⟨s, a, r, s'⟩
D) None

Answer: C

Q-learning uses ____________ differences to estimate the value of Q*(s,a).


A) spatial
B) temporal
C) both
D) None

Answer: B

As an example, consider the process of boarding a train, in which the reward is measured by the negative of the total time spent boarding (equivalently, the cost of boarding the train is equal to the boarding time). One strategy is to enter the train door as soon as the doors open, minimizing your own initial wait time. If the train is crowded, however, you will have a slow entry after that initial action, as departing passengers fight past you while you attempt to board. Which total boarding time (cost) is the better option for this scenario?
A) 0 seconds wait time + 15 seconds fight time
B) 5 second wait time + 0 second fight time.
C) Both
D) None

Answer: B


What are the advantages of biological neural networks (BNNs) compared to conventional Von
Neumann computers:

(i) BNNs have the ability to learn from examples.
(ii) BNNs have a high degree of parallelism.
(iii) BNNs require a mathematical model of the problem.
(iv) BNNs can acquire knowledge by “trial and error”.
(v) BNNs use a sequential algorithm to solve problems.

A) (i), (ii), (iii), (iv) and (v).


B) (i), (ii) and (iii).
C) (i), (ii) and (iv).
D) (i), (iii) and (iv).

ANSWER: C

Which of the following techniques can NOT be used for pre-processing the inputs to an
artificial neural network:
A) Normalization.
B) Winner-takes-all.
C) Fast Fourier Transform (FFT).
D) Principal component analysis (PCA).

ANSWER: B

Which of the following neural networks uses supervised learning?:


A) Self-organizing feature map (SOFM).
B) The Hopfield network.
C) Simple recurrent network (SRN).
D) All of the above answers.

ANSWER: C

Which of the following algorithms can be used to train a single-layer feedforward network
A) Hard competitive learning.
B) Soft competitive learning.
C) A genetic algorithm.
D) All of the above answers.

ANSWER: D

What is the credit assignment problem in a multi-layer feedforward network:


A) The problem of adjusting the weights for the output units.
B) The problem of adapting the neighbours of the winning unit.
C) The problem of defining an error function for linearly inseparable problems.
D) The problem of adjusting the weights for the hidden units.

ANSWER: D

Which of the following equations best describes the Generalized Delta Rule with momentum?
A) Δwji(t + 1) = η δj xi
B) Δwji(t + 1) = α δj xi
C) Δwji(t + 1) = η δj xi + α Δwji(t)
D) Δwji(t + 1) = η δj xi + α δj xi(t)
where Δwji(t) is the change to the weight from unit i to unit j at time t, η is the learning rate, α is the momentum coefficient, δj is the error term for unit j, and xi is the i-th input to unit j.

ANSWER: C

One method for dealing with local minima is to use a committee of networks. What does this mean?
A) A large number of different networks are trained and tested. The network with the lowest sum squared error on a separate validation set is chosen as the best network.
B) A large number of different networks are trained and tested. All of the networks are used to solve the real-world problem by taking the average output of all the networks.
C) A large number of different networks are trained and tested. The networks are then combined to make a network of networks, which is biologically more realistic and computationally more powerful than a single network.

ANSWER: B

What is the most general type of decision region that can be formed by a feedforward
network with NO hidden layers?:
A) Convex decision regions – for example, the network can approximate any Boolean function.
B) Arbitrary decision regions – the network can approximate any function (the accuracy of the approximation depends on the number of hidden units).
C) Decision regions separated by a line, plane or hyperplane.
D) None of the above answers.

ANSWER: C

Which of the following statements is the best description of overfitting:


A) The network becomes “specialized” and learns the training set too well.
B) The network can predict the correct outputs for test examples which lie outside the range
of the training examples.
C) The network does not contain enough adjustable parameters (e.g., hidden units) to find a
good approximation to the unknown function which generated the training data.
D) The network cannot predict the correct outputs.

ANSWER: A

Neural Networks:
A) Nerve cells in the brain are called neurons
B) The output from the neuron is called dendrite
C) One kind of neurons is called synapses
D) Learning takes place in the synapses

ANSWER: B

Multilayer perceptron network:


A) Is a neural network with several layers of nodes (or weights)
B) There are connections both between and within each layer
C) The number of units in each layer must be equal
D) Multiple layers of neurons do not allow for more complex decision boundaries than a single layer

ANSWER: A

Backpropagation:
A) Is a learning algorithm for multilayer perceptron networks
B) Is applicable for testing
C) Is based on a gradient descent technique to maximize the mean square difference between
the desired and actual
outputs
D) Is also applicable to self-organizing feature maps

ANSWER: A

Weight updates in Back propagation


A) Usually, the weights are initially set to 0
B) Are proportional to the difference between the desired and
actual outputs
C) The weight change is also proportional to the input to the
weight layer
D) All of the above

ANSWER: D

f(x) = 1 / (1 + e^(−x))


A) f(x) is called a sigmoid function
B) It is beneficial because it does not limit the output value
C) It is called an activation function and such a function is used on every multilayer perceptron
output
D) Is called a hyperbolic function

ANSWER: A
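A minimal sketch of the function, showing that its output stays inside (0, 1), which is why option B is a distractor; the sample inputs are illustrative:

    import numpy as np

    def sigmoid(x):
        # Logistic sigmoid f(x) = 1 / (1 + e^(-x)); output is bounded in (0, 1).
        return 1.0 / (1.0 + np.exp(-x))

    print(sigmoid(np.array([-10.0, 0.0, 10.0])))  # ~[4.5e-05, 0.5, 0.99995]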

Which type of learning addresses the question of how an autonomous agent that senses and acts in its environment can learn to choose optimal actions to achieve its goals?

A) Supervised
B) Unsupervised
C) Semi- supervised
D) Reinforcement

ANSWER : D

A reinforcement learning algorithm is used for


A) Market-basket analysis
B) Diagnosing a diabetic patient
C) To control a mobile robot
D) Identification of similar objects

ANSWER : C

To optimize operations in factories _________ learning is used.


A) Supervised
B) Reinforcement
C) Unsupervised
D) Semi- supervised

ANSWER : B

Each time the agent performs the action in its environment, a trainer may provide a ________
to indicate the desirability of the resulting state.
A) Reward or penalty
B) Reward only
C) Reward and penalty
D) Penalty only

ANSWER : A

The agent receives a ________ reward when the game is won and zero reward in all other states.
A) Positive
B) Negative
C) Either
D) Neither
ANSWER : A

____________ is one of the algorithms that can acquire optimal control strategies from delayed rewards, even when the agent has no prior knowledge of the effects of its actions on the environment.

A) Q learning
B) Supervised Learning
C) Deep learning
D) None of the above

ANSWER : A

Reinforcement learning algorithms are related to dynamic programming algorithms, which are frequently used to solve _____________ problems.
A) Classification
B) Optimization
C) Association
D) Clustering

ANSWER : B

The TD-GAMMON program used ______________ learning to become a world-class backgammon player.
A) Q learning
B) Supervised Learning
C) Deep learning
D) Reinforcement

ANSWER : D

The task of the agent is to learn a target function that maps the __________ to ____________.

A) previous state, current state


B) previous state, optimal action
C) current state, optimal action
D) previous state, next state

ANSWER : B

As the trainer provides only a sequence of immediate reward values as the agent executes its
sequence of actions, the agent faces the problem of ______________________
A) Temporary credit assignment
B) Temporal credit assignment
C) Partially observable state
D) Exploration
ANSWER : B

In MDP, at each discrete time t, the agent senses the _________ state, chooses the current
action and performs it.
A) Previous
B) Current
C) Previous and current
D) Current and future

ANSWER : B

The goal state G is


A) current state
B) future state
C) succeeding state
D) absorbing state

ANSWER : D

In a simple grid-world environment diagram, each grid square represents _________ and each arrow represents a _________.
A) penalty, action
B) distinct state, distinct action
C) reward, action
D) penalty, other action

ANSWER : B

Exploration means
A) gathering new information
B) optimizing the existing solutions
C) exploring unknown states and actions
D) both a & c
Answer: D

A perceptron is
A) a single layer feed-forward neural network with preprocessing
B) an autoassociative neural network
C) a double layer autoassociative neural network
D)None
ANSWER : A

Which of the following is true?


A)On average, neural networks have higher computational rates than conventional
computers.
B)Neural networks learn by example.
C)Neural networks mimic the way the human brain works.
D) all of them are true
ANSWER : D

A single perceptron can be used to represent many boolean functions


A)TRUE
B)FALSE
ANSWER : A

Neural network learning methods provide a robust approach to approximating


A)real-valued functions
B)discrete-valued functions
C)vector-valued target functions
D)All of the above
ANSWER : D


What is Artificial intelligence


A) Putting your intelligence into Computer
B) Programming with your own intelligence
C) Making a Machine intelligent
D) Playing a Game
ANSWER: C

Which is the best way to go for Game playing problem


A) Linear approach
B) Heuristic approach
C) Random approach
D) Optimal approach
ANSWER: B

Which is not the commonly used programming language for AI


A) PROLOG
B) Java
C) LISP
D) Perl
ANSWER: D

In unsupervised learning
A) Specific output values are given
B) Specific output values are not given
C) No specific Inputs are given
D) Both inputs and outputs are given
ANSWER: B

A perceptron is a --------------------------------.
A) Feed-forward neural network
B) Back-propagation algorithm
C) Back-tracking algorithm
D) Feed Forward-backward algorithm
ANSWER: A

Neural Networks are complex -----------------------with many parameters


A) Linear Functions
B) Nonlinear Functions
C) Discrete Functions
D) Exponential Functions
ANSWER: B

What is the goal of artificial intelligence


A) To solve real-world problems
B) To solve artificial problems
C) To explain various sorts of intelligence
D) To extract scientific causes
ANSWER: C

Machine learning is
A) The autonomous acquisition of knowledge through the use of computer programs
B) The autonomous acquisition of knowledge through the use of manual programs
C) The selective acquisition of knowledge through the use of computer programs
D) The selective acquisition of knowledge through the use of manual programs
ANSWER: A

Factors which affect the performance of a learner system do not include
A) Representation scheme used
B) Training scenario
C) Type of feedback
D) Good data structures
ANSWER: D

Perception involves
A) Sights, sounds, smell and touch
B) Hitting
C) Boxing
D) Dancing
ANSWER: A

Artificial neural networks are used for
A)Pattern Recognition
B)Classification
C)Clustering
D) All of these
ANSWER : D

In an artificial neural network, the interconnected processing elements are called


A)weights
B)nodes or neurons
C) axons
D)Soma
ANSWER : B

Each connection link in an ANN is associated with ________, which carries information about the input signal.
A)neurons
B)weights
C)bias
D)activation function
ANSWER : B

A Neural Network can answer


A) For-loop questions
B) What-if questions
C) If-then-else analysis questions
D)None of these
ANSWER : B

A 4-input neuron has weights 1, 2, 3 and 4. The transfer function is linear. The inputs are 4, 10, 5 and 20 respectively. The output will be
A)238
B)76
C)119
D)123
ANSWER : C
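The answer follows directly from the weighted sum with a linear transfer function:

    weights = [1, 2, 3, 4]
    inputs = [4, 10, 5, 20]
    output = sum(w * x for w, x in zip(weights, inputs))
    print(output)  # 1*4 + 2*10 + 3*5 + 4*20 = 119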

Which of the following is not a promise of artificial neural networks?


A) It can explain the result
B)It can survive the failure of some nodes
C)It has inherent parallelism
D)It can handle noise
ANSWER : A

A perceptron adds up all the weighted inputs it receives, and if it exceeds a certain value, it
outputs a 1, otherwise it just outputs a 0.
A)True
B)False
C)Sometimes -it can also output intermediate values as well
D)Can’t say
ANSWER : A
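A minimal sketch of the behaviour described in the question; the threshold value and example weights are illustrative assumptions:

    def perceptron_output(weights, inputs, threshold=0.0):
        # Output 1 if the weighted sum exceeds the threshold, otherwise 0.
        total = sum(w * x for w, x in zip(weights, inputs))
        return 1 if total > threshold else 0

    print(perceptron_output([0.5, -0.4], [1, 1]))  # 1, since 0.1 > 0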

A perceptron can be viewed as representing a hyperplane decision surface in the n-dimensional space
A)True
B)False
ANSWER : A

An example of linearly non-separable training examples is based on the


A)AND function
B)OR function
C)XOR function
D)NOT function
ANSWER : C

Some of the examples of popular weight determining algorithms are


A)Delta rule
B)Perceptron rule
C)Stochastic gradient descent
D)All of the above
ANSWER : D

Convergence fails in the -------- learning procedure when the training examples are not linearly separable
A)Delta rule
B)Stochastic gradient descent rule
C) Perceptron rule
D)Gradient descent rule
ANSWER : C

Sequence the flow of the perceptron rule:
(i) Choose random weights
(ii) Modify the perceptron weights on misclassification
(iii) Iteratively apply the perceptron to each training example
(iv) Iterate through the training examples until they are properly classified
A) iii, ii, iv, i
B) i, iii, ii, iv
C) i, ii, iii, iv
D) iv, iii, ii, i
ANSWER : B
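A minimal training loop following the order in answer B: choose random weights, then repeatedly apply the perceptron to each example and modify the weights on misclassification. The data format, learning rate and epoch count are assumptions for illustration:

    import random

    def train_perceptron(examples, n_inputs, eta=0.1, epochs=100):
        # (i) choose random weights; w[0] is the bias weight with a fixed input of 1
        w = [random.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
        for _ in range(epochs):
            # (iv) iterate through the training examples
            for x, t in examples:               # t is the target output, 0 or 1
                xs = [1.0] + list(x)
                # (iii) apply the perceptron
                o = 1 if sum(wi * xi for wi, xi in zip(w, xs)) > 0 else 0
                # (ii) modify the weights on misclassification (no change when t == o)
                for i in range(len(w)):
                    w[i] += eta * (t - o) * xs[i]
        return w

    # Usage: learn weights for a 2-input AND function
    print(train_perceptron([((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)], n_inputs=2))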

The key idea behind the delta rule is to use -------to search the hypothesis space
A)Stochastic gradient descent
B)Linear programming
C)Gradient descent
D) Both A and B
ANSWER : C

The delta training rule is best understood by considering the task of training an
A) thresholded perceptron
B)Unthresholded perceptron
C) randomised perceptron
D) None of the above
ANSWER : B

In the gradient descent algorithm, the direction of steepest descent along the error surface can be found by -------- with respect to each component of the weight vector
A)computing the derivative of E(error)
B)computing the derivative of E(error)
C)computing the treshold of E(error)
D)both B and C
ANSWER : A
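A minimal batch gradient-descent (delta rule) step for an unthresholded linear unit, minimizing E = 0.5 * sum((t - o)^2); the data format and learning rate are assumptions for illustration:

    def gradient_descent_step(examples, w, eta=0.05):
        # Accumulate dE/dw_i = -sum over examples of (t - o) * x_i, then move the
        # weights a small step in the direction of steepest descent.
        grad = [0.0] * len(w)
        for x, t in examples:                     # x includes the bias input x0 = 1
            o = sum(wi * xi for wi, xi in zip(w, x))
            for i, xi in enumerate(x):
                grad[i] += -(t - o) * xi
        return [wi - eta * gi for wi, gi in zip(w, grad)]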

Gradient descent can be applied when


A)hypothesis space contains weights in a linear unit
B) the error can be differentiated with respect to the weights in a linear unit
C) both A and B
D) None of the above
ANSWER : C

Practical difficulties in applying gradient descent are


A)Converging to a local minimum can sometimes be quite slow
B) There is no guarantee of finding the global minimum when the error surface has multiple local minima
C)Both A and B
D)Only A
ANSWER : C
