
Capstone Project

MCQs:
1. What is the main purpose of a Capstone Project?
o A) To demonstrate theoretical knowledge.
o B) To complete a thesis paper.
o C) To integrate all knowledge gained through a comprehensive project.
o D) To learn about different industries.
o Answer: C
2. Which of the following is not an objective of a Capstone Project?
o A) Solving real-world problems.
o B) Expressing solutions in technical terms.
o C) Selecting appropriate algorithms for a problem.
o D) Learning teamwork.
o Answer: B
3. Which AI project involves predicting stock prices?
o A) Movie Ticket Price Predictor
o B) Stock Prices Predictor
o C) Sentiment Analyzer
o D) Student Results Predictor
o Answer: B
4. Which AI model is typically used for classification?
o A) Regression
o B) Clustering
o C) Classification
o D) Anomaly Detection
o Answer: C
5. What is the first step in the AI project cycle?
o A) Model construction
o B) Data gathering
o C) Problem definition
o D) Evaluation & refinements
o Answer: C
6. Which step is critical in determining whether AI techniques are applicable to a problem?
o A) Gathering data
o B) Identifying a pattern in data
o C) Deploying the model
o D) Selecting the right algorithm
o Answer: B
Design Thinking and Problem Decomposition
7. Which of the following is not a stage of Design Thinking?
o A) Empathize
o B) Define
o C) Deploy
o D) Prototype
o Answer: C
8. Which of the following is the correct sequence for breaking down a problem?
o A) Start coding, then identify the problem.
o B) Restate the problem, decompose into large pieces, break those pieces down.
o C) Decompose large pieces first, then state the problem.
o D) Prototype the solution first.
o Answer: B
9. Which is an example of problem decomposition?
o A) Gathering data from sensors
o B) Breaking down app development into multiple tasks
o C) Running machine learning models
o D) Collecting user feedback
o Answer: B
10.Which concept involves the breakdown of time series data into trend, seasonality, and noise components?
• A) Time Series Forecasting
• B) Design Thinking
• C) Problem Decomposition
• D) Time Series Decomposition
• Answer: D
AI Model Construction and Analytics
11.Which is the first step in any AI or machine learning project?
• A) Data modeling
• B) Data collection
• C) Business understanding
• D) Cross-validation
• Answer: C
12.Which is the foundational methodology for data science?
• A) Data mining
• B) CRISP-DM
• C) Agile
• D) SDLC
• Answer: B
13.What does the second stage of data science methodology involve?
• A) Business understanding
• B) Data collection
• C) Defining the analytic approach
• D) Model deployment
• Answer: C
14.Which of these approaches would you use for showing relationships between variables?
• A) Predictive approach
• B) Descriptive approach
• C) Classification approach
• D) Regression
• Answer: B
15.When might a predictive model be used?
• A) To explain historical data
• B) To show relationships between data
• C) To predict future outcomes
• D) To cluster similar data points
• Answer: C
Data Requirements and Modeling
16.What question should be asked first in a data project?
• A) What is the business outcome?
• B) How will the data be collected?
• C) What data is needed?
• D) What algorithm will be used?
• Answer: A
17.Which dataset is commonly used for predicting house prices?
• A) Airline Passenger Dataset
• B) Forestfires Dataset
• C) Housing Dataset
• D) MNIST Dataset
• Answer: C
18.What does a training set do in predictive modeling?
• A) Predicts future data
• B) Tests the model
• C) Fits the model
• D) Validates the model
• Answer: C
19.What is a descriptive model used for?
• A) Prediction of new data
• B) Describing relationships in historical data
• C) Anomaly detection
• D) Identifying missing data
• Answer: B
20.Which concept refers to adjusting models using new data to improve their accuracy?
• A) Refinement
• B) Validation
• C) Cross-validation
• D) Feature selection
• Answer: A
Model Validation
21.What does the train-test split method achieve?
• A) Collecting the data
• B) Evaluating model performance
• C) Data pre-processing
• D) Model deployment
• Answer: B
22.What percentage is commonly used for training data in a train-test split?
• A) 80%
• B) 20%
• C) 67%
• D) 50%
• Answer: A
23.What does cross-validation help achieve?
• A) Faster training
• B) Model deployment
• C) Reliable performance measures
• D) Model transformation
• Answer: C
24.In a cross-validation process, how many subsets are generally created in a 5-fold cross-validation?
• A) 2
• B) 5
• C) 10
• D) 3
• Answer: B
25.When is cross-validation more beneficial than train-test split?
• A) For large datasets
• B) For datasets with limited rows
• C) When doing unsupervised learning
• D) For high computational costs
• Answer: B
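As a quick illustration of questions 21-25, here is a minimal Python sketch (scikit-learn on made-up data) of an 80/20 train-test split and 5-fold cross-validation; the dataset and model choice are assumptions purely for demonstration.

# Illustrative sketch: 80/20 train-test split and 5-fold cross-validation.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split, cross_val_score

X = np.random.rand(100, 3)          # 100 rows, 3 features (made-up data)
y = X @ np.array([2.0, -1.0, 0.5])  # target with a known linear pattern

# 80% training / 20% testing split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = LinearRegression().fit(X_train, y_train)
print("Test R^2:", model.score(X_test, y_test))

# 5-fold cross-validation: the data is split into 5 folds; each fold is used
# once for testing while the remaining folds train the model.
scores = cross_val_score(LinearRegression(), X, y, cv=5)
print("Cross-validation scores:", scores)
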
Metrics of Model Quality
26.Which of the following is a commonly used metric for regression models?
• A) Accuracy
• B) Precision
• C) Recall
• D) Root Mean Squared Error (RMSE)
• Answer: D
27.Which metric is most suitable for classification tasks?
• A) MSE
• B) Accuracy
• C) RMSE
• D) Noise ratio
• Answer: B
28.What is the objective of minimizing the loss function?
• A) Maximizing error
• B) Improving model predictions
• C) Increasing data complexity
• D) Creating a noise-free dataset
• Answer: B
29.Which of the following is used to calculate RMSE?
• A) Mean of residuals
• B) Sum of absolute errors
• C) Square root of the mean of squared errors
• D) Mean of absolute differences
• Answer: C
30.What does a low RMSE indicate?
• A) Poor model performance
• B) High variance in predictions
• C) Accurate predictions
• D) Overfitting
• Answer: C
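To make questions 26-30 concrete, here is a short sketch (with invented numbers) showing that RMSE is the square root of the mean of squared errors, and that a low RMSE corresponds to accurate predictions.

# Illustrative computation of MSE and RMSE for a handful of example predictions.
import numpy as np

actual = np.array([3.0, 5.0, 7.0, 9.0])
predicted = np.array([2.5, 5.5, 6.0, 9.5])

mse = np.mean((actual - predicted) ** 2)   # mean of squared errors
rmse = np.sqrt(mse)                        # square root of MSE
print("MSE:", mse, "RMSE:", rmse)          # lower RMSE -> more accurate predictions
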
Advanced Topics and Applications
31.Which algorithm is used in the example of the Airline Passenger Dataset?
• A) Decision Tree
• B) Random Forest
• C) Seasonal Decomposition
• D) Support Vector Machine
• Answer: C
32.What is MSE sensitive to?
• A) Outliers
• B) Missing data
• C) Noisy data
• D) Small datasets
• Answer: A
33.What is the purpose of gradient descent in machine learning?
• A) Maximizing the loss function
• B) Minimizing the objective function
• C) Identifying data clusters
• D) Reducing dataset size
• Answer: B
34.Which MSE value indicates the best predictions?
• A) The highest value
• B) The lowest value
• C) The mean of predictions
• D) The median of predictions
• Answer: B
35.What type of learning involves algorithms like regression or classification?
• A) Supervised learning
• B) Unsupervised learning
• C) Reinforcement learning
• D) Semi-supervised learning
• Answer: A
36.Which of the following is an example of a supervised learning algorithm?
• A) K-means clustering
• B) Linear regression
• C) Principal component analysis
• D) Autoencoders
• Answer: B
37.What is the purpose of a training set in machine learning?
• A) To evaluate the model
• B) To make predictions
• C) To fit the machine learning model
• D) To store labels
• Answer: C
38.Which algorithm is most suitable for a regression problem?
• A) Decision tree
• B) Linear regression
• C) K-nearest neighbors
• D) Naive Bayes
• Answer: B
39.In a recommendation system, which method is typically used to suggest new items?
• A) Clustering
• B) Regression
• C) Collaborative filtering
• D) Anomaly detection
• Answer: C
40.What does feature selection involve?
• A) Choosing the algorithm
• B) Selecting important variables in a dataset
• C) Testing the model
• D) Tuning hyperparameters
• Answer: B
41.Which of the following would be considered a feature in a dataset?
• A) The target label
• B) An algorithm
• C) A variable used for prediction
• D) The test set
• Answer: C
42.What is the role of data normalization in model training?
• A) Reducing the number of features
• B) Scaling data to a standard range
• C) Increasing dataset size
• D) Adding noise to the data
• Answer: B
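As a rough illustration of question 42, the sketch below scales one made-up feature to the standard 0-1 range using scikit-learn's MinMaxScaler.

# Illustrative sketch of data normalization: scaling a feature to the 0-1 range.
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[10.0], [20.0], [35.0], [50.0]])
scaler = MinMaxScaler()               # default feature_range is (0, 1)
X_scaled = scaler.fit_transform(X)
print(X_scaled.ravel())               # [0.    0.25  0.625 1.   ]
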
Model Validation Techniques
43.Which is the most reliable method to evaluate model performance on smaller datasets?
• A) Simple train-test split
• B) Leave-one-out cross-validation
• C) Randomized testing
• D) Bootstrap aggregation
• Answer: B
44.Cross-validation is typically used to:
• A) Build the model
• B) Split data into train and test sets
• C) Test the model with multiple subsets
• D) Apply unsupervised learning
• Answer: C
45.What is one major drawback of cross-validation compared to train-test split?
• A) It uses less data.
• B) It takes more time and computational resources.
• C) It produces less accurate results.
• D) It can only be applied to classification problems.
• Answer: B
46.Which validation method involves using every data point for testing at least once?
• A) K-fold cross-validation
• B) Simple validation
• C) Random split
• D) Hold-out validation
• Answer: A
47.What is the primary advantage of using cross-validation?
• A) Requires fewer computational resources
• B) More accurate representation of model performance
• C) Faster training of the model
• D) Higher accuracy for large datasets
• Answer: B
48.What is the goal of hyperparameter tuning?
• A) Choosing the right model
• B) Optimizing algorithm performance
• C) Collecting more data
• D) Scaling the data
• Answer: B
Metrics of Model Quality
49.What does MAPE stand for?
• A) Mean Absolute Prediction Error
• B) Mean Absolute Percentage Error
• C) Mean Adjusted Prediction Error
• D) Minimum Absolute Prediction Estimate
• Answer: B
50.Which error metric penalizes large errors more than small errors?
• A) RMSE
• B) MSE
• C) Accuracy
• D) Precision
• Answer: B
51.Which error metric would you use to compare different regression models?
• A) Classification accuracy
• B) RMSE
• C) ROC-AUC score
• D) F1-Score
• Answer: B
52.Which of the following is most important when evaluating a model’s accuracy on unseen data?
• A) Precision
• B) Validation data
• C) Recall
• D) Feature engineering
• Answer: B
53.Which metric is used to evaluate classification tasks in binary classification?
• A) Precision and recall
• B) RMSE
• C) MSE
• D) MAE
• Answer: A
54.Which evaluation metric balances precision and recall in a classification problem?
• A) F1-Score
• B) Accuracy
• C) RMSE
• D) Cross-validation
• Answer: A
55.What does MAE stand for in machine learning?
• A) Model Accuracy Estimate
• B) Mean Absolute Error
• C) Maximum Accuracy Estimate
• D) Minimum Adjustment Error
• Answer: B
56.Which error metric is less sensitive to outliers in regression problems?
• A) RMSE
• B) MAE
• C) MSE
• D) Cross-entropy
• Answer: B
57.Which evaluation metric is best for highly imbalanced classification datasets?
• A) Accuracy
• B) F1-Score
• C) RMSE
• D) MAE
• Answer: B
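A small, hedged example for questions 53-57: computing precision, recall, and the F1-score on an invented set of binary labels with scikit-learn.

# Illustrative sketch of classification metrics on made-up binary labels.
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print("Precision:", precision_score(y_true, y_pred))
print("Recall:   ", recall_score(y_true, y_pred))
print("F1-score: ", f1_score(y_true, y_pred))   # F1 balances precision and recall
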
Practical Applications and AI Techniques
58.Which AI project involves recognizing human activities using smartphone data?
• A) Stock Prices Predictor
• B) Human Activity Recognition
• C) Student Results Predictor
• D) Sentiment Analysis
• Answer: B
59.Which of the following best describes anomaly detection?
• A) Grouping similar data points
• B) Identifying unusual patterns in data
• C) Predicting continuous outcomes
• D) Labeling data based on features
• Answer: B
60.In AI, what is a common use of clustering algorithms?
• A) Predicting future outcomes
• B) Grouping similar data points without labels
• C) Detecting anomalies
• D) Improving model accuracy
• Answer: B
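A small, hedged sketch for question 60: grouping unlabeled, made-up points with k-means clustering in scikit-learn.

# Illustrative sketch of clustering: grouping unlabeled 2-D points into two
# clusters with k-means (no predefined labels are used).
import numpy as np
from sklearn.cluster import KMeans

points = np.array([[1.0, 1.2], [0.8, 1.0], [1.1, 0.9],
                   [5.0, 5.2], [5.1, 4.8], [4.9, 5.0]])
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print(kmeans.labels_)   # two groups of similar points, e.g. [0 0 0 1 1 1]
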
ASSERTION-REASON QUESTIONS:
1. Assertion (A): The Capstone Project integrates all learning from an academic program.
Reason (R): It focuses solely on individual work rather than collaboration.
 A) Both A and R are true, and R is the correct explanation of A.
 B) Both A and R are true, but R is not the correct explanation of A.
 C) A is true, but R is false.
 D) A is false, but R is true.
Answer: C

2. Assertion (A): Data gathering is a critical step in an AI project cycle.


Reason (R): Without proper data, the AI model cannot be trained effectively.
 A) Both A and R are true, and R is the correct explanation of A.
 B) Both A and R are true, but R is not the correct explanation of A.
 C) A is true, but R is false.
 D) A is false, but R is true.
Answer: A

3. Assertion (A): AI development is always suitable for every type of problem.


Reason (R): AI techniques are applied when a pattern exists in the data.
 A) Both A and R are true, and R is the correct explanation of A.
 B) Both A and R are true, but R is not the correct explanation of A.
 C) A is true, but R is false.
 D) A is false, but R is true.
Answer: D

4. Assertion (A): Clustering is used to categorize data into predefined groups.


Reason (R): Clustering helps in dividing the data based on a similarity metric without
predefined labels.
 A) Both A and R are true, and R is the correct explanation of A.
 B) Both A and R are true, but R is not the correct explanation of A.
 C) A is true, but R is false.
 D) A is false, but R is true.
Answer: D

5. Assertion (A): Problem decomposition helps simplify complex problems.


Reason (R): Breaking a problem into smaller pieces allows for easier coding and debugging.
 A) Both A and R are true, and R is the correct explanation of A.
 B) Both A and R are true, but R is not the correct explanation of A.
 C) A is true, but R is false.
 D) A is false, but R is true.
Answer: A

6. Assertion (A): Time series decomposition involves breaking a series into level,
trend, and seasonality.
Reason (R): Decomposing time series data helps identify underlying patterns for better
forecasting.
 A) Both A and R are true, and R is the correct explanation of A.
 B) Both A and R are true, but R is not the correct explanation of A.
 C) A is true, but R is false.
 D) A is false, but R is true.
Answer: A

7. Assertion (A): Predictive modeling is used to find relationships between variables.
Reason (R): Descriptive models are employed to predict future outcomes.
 A) Both A and R are true, and R is the correct explanation of A.
 B) Both A and R are true, but R is not the correct explanation of A.
 C) A is true, but R is false.
 D) A is false, but R is true.
Answer: C

8. Assertion (A): Data scientists use training sets to evaluate model performance.
Reason (R): Test sets are used to adjust models after training is complete.
 A) Both A and R are true, and R is the correct explanation of A.
 B) Both A and R are true, but R is not the correct explanation of A.
 C) A is true, but R is false.
 D) A is false, but R is true.
Answer: C

9. Assertion (A): Cross-validation ensures more reliable model evaluation than a train-test split.
Reason (R): Cross-validation evaluates models using different data folds, making the process more computationally efficient.
 A) Both A and R are true, and R is the correct explanation of A.
 B) Both A and R are true, but R is not the correct explanation of A.
 C) A is true, but R is false.
 D) A is false, but R is true.
Answer: C

10. Assertion (A): RMSE (Root Mean Squared Error) is a commonly used metric for
evaluating regression models.
Reason (R): RMSE penalizes larger errors more significantly than smaller errors, making it
sensitive to outliers.
 A) Both A and R are true, and R is the correct explanation of A.
 B) Both A and R are true, but R is not the correct explanation of A.
 C) A is true, but R is false.
 D) A is false, but R is true.
Answer: A

11. Assertion (A): The final stage in model evaluation is deployment.


Reason (R): Deployment occurs after thorough testing, validation, and refinement of the
model.
 A) Both A and R are true, and R is the correct explanation of A.
 B) Both A and R are true, but R is not the correct explanation of A.
 C) A is true, but R is false.
 D) A is false, but R is true.
Answer: A

12. Assertion (A): MSE (Mean Squared Error) penalizes large errors more severely
than RMSE.
Reason (R): MSE focuses on the average squared difference between predicted and actual
values.
 A) Both A and R are true, and R is the correct explanation of A.
 B) Both A and R are true, but R is not the correct explanation of A.
 C) A is true, but R is false.
 D) A is false, but R is true.
Answer: B

13. Assertion (A): A classification problem is always solved using regression techniques.
Reason (R): Classification involves predicting a continuous outcome, while regression focuses on categorization.
 A) Both A and R are true, and R is the correct explanation of A.
 B) Both A and R are true, but R is not the correct explanation of A.
 C) A is true, but R is false.
 D) A is false, but R is true.
Answer: D

14. Assertion (A): Cross-validation is more appropriate for smaller datasets than a
train-test split.
Reason (R): Larger datasets require more complex evaluation strategies than cross-validation.
 A) Both A and R are true, and R is the correct explanation of A.
 B) Both A and R are true, but R is not the correct explanation of A.
 C) A is true, but R is false.
 D) A is false, but R is true.
Answer: C

15. Assertion (A): Data preprocessing is not necessary if the dataset is large.
Reason (R): Large datasets inherently contain all necessary information and require no
adjustments.
 A) Both A and R are true, and R is the correct explanation of A.
 B) Both A and R are true, but R is not the correct explanation of A.
 C) A is true, but R is false.
 D) A is false, but R is true.
Answer: D
CASE STUDY BASED QUESTIONS:
1. A team of students is working on a stock price prediction model as part of their
Capstone Project. They are facing issues because the stock prices show a lot of
volatility, and the patterns are not clear. The team is unsure about how to proceed with
building their AI model.
Question: What should the team do first before applying any AI model?
 Answer: The team should first check if there is any identifiable pattern in the stock
price data. If no pattern exists, AI techniques should not be applied. However, if a
pattern is present, the team can proceed with data gathering and feature definition to
build an AI model for stock price prediction.

2. A student team is tasked with building a sentiment analyzer that classifies text as positive,
negative, or neutral. They collected a dataset of tweets but discovered that the data includes
irrelevant information like URLs and emojis.
Question: How should the team handle the data before building their AI model?
 Answer: The team should preprocess the data by removing irrelevant information like
URLs, emojis, and special characters. They should clean the dataset to focus on the text
content, which will help in improving the accuracy of the sentiment analysis model.
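A rough sketch of the cleaning step described above, assuming simple regex rules and an invented example tweet (the exact rules would depend on the team's data).

# Rough sketch of tweet cleaning: removing URLs, mentions, emojis, and
# punctuation before sentiment analysis. The example tweet is invented.
import re

def clean_tweet(text):
    text = re.sub(r"http\S+|www\.\S+", "", text)   # strip URLs
    text = re.sub(r"@\w+", "", text)               # strip @mentions
    text = re.sub(r"[^A-Za-z\s]", "", text)        # drop emojis, digits, punctuation
    return re.sub(r"\s+", " ", text).strip().lower()

print(clean_tweet("Loved the new update!! check https://example.com @devteam"))
# -> "loved the new update check"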

3. A team is working on a project that predicts airline passenger traffic. They noticed that the
number of passengers varies significantly by season, and they want to understand the trends
better.
Question: What technique should the team use to better understand the trends in their data?
 Answer: The team should use time series decomposition, which involves breaking
down the data into components such as level, trend, seasonality, and noise. This will
help them identify patterns in the airline passenger data more clearly.
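A minimal sketch of time series decomposition with statsmodels; the monthly values below are synthetic stand-ins, not the real airline passenger data.

# Illustrative time series decomposition into trend, seasonality, and residual.
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

months = pd.date_range("2020-01-01", periods=48, freq="MS")
values = 100 + np.arange(48) * 2 + 10 * np.sin(np.arange(48) * 2 * np.pi / 12)
series = pd.Series(values, index=months)

result = seasonal_decompose(series, model="additive", period=12)
print(result.trend.dropna().head())     # long-term trend component
print(result.seasonal.head(12))         # repeating seasonal pattern
print(result.resid.dropna().head())     # leftover noise / residual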

4. A group of students is working on predicting house prices. They split their dataset into
training (80%) and testing (20%) subsets. However, they are unsure if their model generalizes
well on unseen data.
Question: What can the team do to better assess the quality of their model?
 Answer: The team should use cross-validation instead of a simple train-test split.
Cross-validation allows the model to be tested on multiple subsets of data, providing a
more reliable evaluation of the model’s performance.

5. A team is working on a project to address the issue of crop yield prediction in agriculture.
They collected a large dataset but are unsure which AI model to use for this type of prediction.
Question: What type of model should the team consider for predicting crop yields, and why?
 Answer: The team should consider using a regression model because crop yield
prediction is a continuous variable problem. Regression models are suitable for
predicting “how much” or “how many” based on the input data.

6. A group is predicting brain weights based on head size using linear regression. After
running their model, they calculated an RMSE of 73.
Question: How should the team interpret this RMSE value, and what should be their next
step?
 Answer: An RMSE of 73 indicates the typical size of the prediction error, expressed in the
same units as the brain weight. Since a good model here should have an RMSE significantly
lower than 180, the model’s performance is acceptable. To improve it further, the team could
tune the hyperparameters or refine the feature set.
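A hedged sketch of this scenario: fitting a simple linear regression on made-up head-size and brain-weight pairs and reporting the RMSE (the numbers are placeholders, not the team's dataset).

# Illustrative linear regression on invented head-size / brain-weight pairs.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

head_size = np.array([[3500], [3700], [3900], [4100], [4300]])   # assumed cm^3
brain_weight = np.array([1180, 1240, 1290, 1350, 1410])          # assumed grams

model = LinearRegression().fit(head_size, brain_weight)
predictions = model.predict(head_size)
rmse = np.sqrt(mean_squared_error(brain_weight, predictions))
print("RMSE:", round(rmse, 2))   # compare against the spread of brain weights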

7. A student team is developing a recommendation system for improving educational resources in schools. They want to recommend learning materials based on students’ learning habits.
Question: Which AI technique should the team use to build the recommendation system, and
why?
 Answer: The team should use a recommendation model, which can suggest learning
materials based on patterns in student behavior. Recommendation systems work by
analyzing previous choices or habits and predicting what a user may prefer next.

8. A team is working on predicting movie ticket prices based on factors such as location,
movie type, and time of day. They have used a dataset but are not sure if their model is
performing well.
Question: What metrics should the team use to evaluate their model’s performance?
 Answer: The team should use regression evaluation metrics such as Mean Squared
Error (MSE) and Root Mean Squared Error (RMSE). These metrics will give them insights
into how accurate their predictions are by comparing predicted values to actual values.

9. A team is tasked with using AI to predict patient recovery times based on medical history
and treatment data. However, the data has missing values.
Question: What steps should the team take to handle the missing data before applying their
AI model?
 Answer: The team should use data imputation techniques, such as replacing missing
values with the mean or median of the dataset, or using more advanced techniques like
K-nearest neighbors imputation. This will ensure that their model has complete data to
work with.
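A minimal sketch of mean imputation with scikit-learn's SimpleImputer, using a small invented table of values.

# Illustrative mean imputation: missing entries are replaced by column means.
import numpy as np
from sklearn.impute import SimpleImputer

X = np.array([[25.0, 3.0],
              [np.nan, 4.0],
              [30.0, np.nan],
              [35.0, 5.0]])

imputer = SimpleImputer(strategy="mean")   # or strategy="median"
X_filled = imputer.fit_transform(X)
print(X_filled)   # missing values filled with the column means (30.0 and 4.0)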

10. A group of students is working on a project to classify human activities using smartphone
sensors (like accelerometers). The data includes features such as time, accelerometer
readings, and gyroscope readings.
Question: What type of AI model should the team use for this classification task?
 Answer: The team should use a classification model since their task involves
categorizing data into different activities (e.g., walking, running, sitting). Models like
Decision Trees, Random Forest, or Neural Networks would be suitable for this type of
problem.
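A rough sketch of such a classification model: a decision tree trained on tiny, made-up accelerometer and gyroscope features.

# Illustrative classification for activity recognition on invented sensor features.
from sklearn.tree import DecisionTreeClassifier

# Each row: [mean accelerometer reading, mean gyroscope reading] (invented values)
X = [[0.1, 0.2], [0.9, 1.1], [2.5, 2.2], [0.2, 0.1], [1.0, 0.9], [2.4, 2.6]]
y = ["sitting", "walking", "running", "sitting", "walking", "running"]

clf = DecisionTreeClassifier(random_state=0).fit(X, y)
print(clf.predict([[0.95, 1.05]]))   # expected to be classified as "walking"
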
IMPORTANT QUESTION-ANSWERS:
1. What is a Capstone Project in the context of AI education?
 Answer: A Capstone Project is the final project of an academic program where students
integrate all their learning and apply it to solve real-world problems. It involves
teamwork, discussion, research, and hands-on activities.

2. What are the key steps in the AI Project Cycle?


 Answer: The six key steps are: 1) Problem definition, 2) Data gathering, 3) Feature
definition, 4) AI model construction, 5) Evaluation and refinements, 6) Deployment.

3. Why is “problem definition” important in an AI project?


 Answer: Problem definition is crucial because it sets the direction for the entire AI
project. It involves understanding if there is a pattern in the data, which is fundamental
for deciding whether AI techniques should be applied.

4. What types of questions does predictive analysis in AI typically answer?


 Answer: Predictive analysis answers questions like: 1) Which category (classification)?
2) How much or how many (regression)? 3) Which group (clustering)? 4) Is this unusual
(anomaly detection)? 5) Which option should be taken (recommendation)?

5. What is Design Thinking in AI problem-solving?


 Answer: Design Thinking is a solution-based approach to problem-solving, which
involves five stages: Empathize, Define, Ideate, Prototype, and Test. It is useful in
tackling complex, ill-defined problems.

6. What is time series decomposition?


 Answer: Time series decomposition involves breaking down a time series into
components such as level, trend, seasonality, and noise to better understand the data
for analysis and forecasting.

7. What is the main advantage of problem decomposition in computational tasks?


 Answer: Problem decomposition breaks complex problems into smaller, manageable
pieces, making coding, debugging, and problem-solving more efficient.

8. Why is data gathering essential in an AI project?


 Answer: Data gathering is essential because AI models need relevant and accurate
data to train on. Without proper data, the model cannot produce valid predictions.
9. What is RMSE, and why is it important in AI models?
 Answer: RMSE (Root Mean Squared Error) measures the accuracy of an AI model by
calculating the square root of the average squared differences between predicted and
actual values. It is important because it penalizes larger errors more heavily.

10. What is cross-validation, and how does it improve model performance evaluation?
 Answer: Cross-validation is a technique where the dataset is divided into several folds,
and the model is trained and tested on different subsets of data. It provides a more
reliable measure of model performance compared to a simple train-test split.

11. What is the purpose of a recommendation model in AI?


 Answer: A recommendation model suggests items or actions to users based on
patterns in their behavior or preferences. It is commonly used in applications like e-
commerce and streaming services.

12. What are the key components of time series data?


 Answer: The key components of time series data are: 1) Level (average value), 2)
Trend (increasing or decreasing pattern), 3) Seasonality (repeating cycles), and 4) Noise
(random variation).

13. Why is feature definition important in AI modeling?


 Answer: Feature definition is crucial because it involves selecting the most relevant
attributes or variables that will be used to train the model. Proper feature selection can
significantly improve model accuracy.

14. What is the goal of AI model construction?


 Answer: The goal of AI model construction is to build an algorithm that can learn from
the data and make accurate predictions or decisions based on the problem being
addressed.

15. What is the difference between regression and classification in AI?


 Answer: Regression predicts continuous values (e.g., house prices), while classification
predicts discrete categories (e.g., spam or not spam).

16. How is data preprocessed for AI projects?


 Answer: Data preprocessing involves cleaning the data (removing irrelevant or
incorrect data), handling missing values, normalizing or standardizing data, and
transforming it into a format suitable for modeling.
17. What is MSE, and how is it different from RMSE?
 Answer: MSE (Mean Squared Error) calculates the average of the squared differences
between predicted and actual values. RMSE is the square root of MSE. RMSE is
preferred because it provides a more interpretable measure by being in the same units
as the target variable.

18. Why is model validation important in AI?


 Answer: Model validation ensures that the AI model generalizes well to new, unseen
data and helps prevent overfitting, where the model performs well on training data but
poorly on test data.

19. What is the purpose of using a training dataset in AI?


 Answer: The training dataset is used to teach the AI model, allowing it to learn
patterns in the data. The model uses this data to make predictions and adjust its
parameters during training.

20. What is an anomaly detection model used for in AI?


 Answer: Anomaly detection models identify data points that deviate significantly from
the norm, which can be useful in detecting fraud, equipment malfunctions, or unusual
behavior.

21. What is the role of a test dataset in AI model development?


 Answer: The test dataset is used to evaluate the performance of the AI model after
training. It contains new data that the model has not seen during training, providing an
objective measure of how well the model generalizes.

22. Why is the business understanding stage crucial in data science projects?
 Answer: The business understanding stage is crucial because it defines the problem,
objectives, and success criteria from a business perspective, ensuring the solution
aligns with business goals.

23. What is the significance of model deployment in an AI project?


 Answer: Model deployment is the final stage in an AI project where the trained model
is integrated into a production environment to make real-time predictions or decisions.

24. How does the “empathize” stage in Design Thinking help in AI projects?
 Answer: In the empathize stage, developers focus on understanding the user’s needs
and challenges, which helps in designing AI solutions that are user-centric and address
real-world problems.

25. What is the purpose of using a prototype in the Design Thinking process?
 Answer: The prototype is a preliminary model used to test and explore ideas before
final implementation. It helps in identifying potential issues and refining solutions early
in the development process.

26. How is the concept of clustering applied in AI?


 Answer: Clustering is used to group data points based on their similarities without
predefined labels. It is commonly used in customer segmentation, image recognition,
and market analysis.

27. What is the primary objective of feature engineering in AI?


 Answer: The primary objective of feature engineering is to transform raw data into
features that better represent the underlying problem, thereby improving the
performance of AI models.

28. Why is it important to avoid overfitting in AI models?


 Answer: Overfitting occurs when a model learns the noise or random fluctuations in
the training data rather than the actual pattern. This leads to poor performance on new
data, making it essential to avoid overfitting for reliable predictions.

29. What is the role of gradient descent in machine learning?


 Answer: Gradient descent is an optimization algorithm used to minimize the loss
function in machine learning models. It iteratively adjusts the model parameters to
reduce prediction error.
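To illustrate the idea, here is a minimal sketch of gradient descent on a simple one-parameter squared-error loss; the loss function and learning rate are chosen only for demonstration.

# Minimal gradient descent: iteratively adjust one parameter to minimize a loss.
def loss(w):
    return (w - 3.0) ** 2            # minimum at w = 3

def gradient(w):
    return 2.0 * (w - 3.0)           # derivative of the loss

w = 0.0                               # initial guess
learning_rate = 0.1
for step in range(50):
    w -= learning_rate * gradient(w)  # move against the gradient

print(round(w, 4))   # close to 3.0, where the loss is minimized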
