AIML-HC Mod 03
EVALUATING LEARNING
FOR INTELLIGENCE
INTRODUCTION
• The most laborious tasks within a machine learning project are identifying the appropriate model and
engineering features, which make a substantial difference to the output of the model.
• In fact, the features chosen can often have more impact on the quality of a model compared to the
model choice itself.
• Therefore, it is important to evaluate the learning algorithm to determine how well the resulting model can predict the output of an unseen sample.
• This is usually done using various metrics, which are discussed below.
MODEL DEVELOPMENT AND WORKFLOW
• To successfully deploy a machine learning model, there are several stages of development and
evaluation that take place, as illustrated in the figure below.
• The first stage is the prototype phase.
• During this phase, a prototype is created through testing various models on historical data to determine
the best model.
• Hyperparameter tuning, as discussed later in this chapter, is a requirement of model training.
• Once the best prototype model is chosen, the model is tested and validated.
• Validating a model requires splitting datasets into training, testing, and validation sets as discussed in
Chapter 3.
• Note that there is no such thing as a random dataset; instead, the randomness applies to the splitting of the dataset, as the sketch below illustrates.
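• A minimal sketch of such a split, assuming scikit-learn (the Python library referenced later for the evaluation metrics); the breast cancer dataset and the 70/15/15 proportions are illustrative choices, not a prescription:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)

# The dataset itself is fixed; the randomness applies to how it is split.
X_train, X_temp, y_train, y_temp = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)
# Divide the held-out 30% equally into validation and test sets.
X_val, X_test, y_val, y_test = train_test_split(
    X_temp, y_temp, test_size=0.5, random_state=42, stratify=y_temp)
```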
• Be aware of biases that may appear in the data.
• Once the model has been successfully validated, it is deployed to production.
• The model is then usually evaluated by one (or several) performance metrics.
• There are two ways of evaluating a machine learning model: offline evaluation and online (or live)
evaluation.
WHY ARE THERE TWO APPROACHES TO
EVALUATING A MODEL?
• A deployed machine learning model consumes data from two sources: historical data (or the data that
is used as the experience to be learned from) and live data.
• Many machine learning models assume stationary distribution data—that the data distribution is
constant over time.
• However, this is atypical of real life, as distributions of data often change over time—known as a
distribution shift.
• For instance, consider a system that predicts the side effects of medications for patients based on their health profile. Medication side effects may change based on the population.
• Relevant factors include ethnicity, disease profile, territory, medication popularity, and the introduction of new medications.
• The distribution of relevant side effects based on patient data can vary quickly over time, and hence it is essential to detect a shift in distribution and evolve the model accordingly.
• The method in which this is typically assessed is through the performance of the model based on live
data, evaluated through the validation metric used in the testing and validation of the model on
historical data.
• Model performance on live data that is similar to, or within a permissible threshold of, the offline performance indicates a model that continues to fit the data.
• Degradation of model performance indicates that the model does not fit the data and requires
retraining.
• Offline evaluation measures the model based on metrics learned and evaluated from the historical dataset, whose distribution is assumed to be stationary.
• Metrics such as accuracy and precision-recall are typically used within the offline training stage.
• Offline evaluation techniques include the hold-back method and n-fold cross-validation.
• Online evaluation refers to the evaluation of metrics once the model is deployed.
• The key takeaway is that these metrics may differ from the metrics used to evaluate performance during offline training and validation.
• For instance, a model that is learning on new pharmacological treatments may seek to be as precise as
possible in training and validation; but when placed online, it may need to consider business goals such
as budget or treatment value when deployed.
• In the digital age, deployments can support multivariate testing to understand which models perform best.
• Feedback loops are key to ensuring systems are performing as intended and help to understand the model in
the context of use better.
• This can be performed by a human agent or automated through a contextually intelligent agent or users of the
model.
• It is important that the evaluation of a machine learning model is based on a statistically independent dataset
and not on the dataset it is trained on.
• This is because evaluation on the training dataset gives an optimistic estimate of the model’s true performance, as the model has already adapted to that dataset.
• By evaluating the model with previously unseen data, there is a better estimate of the generalization error.
• New data can be hard to find; hence it is important to be able to derive new, unseen data from the current dataset.
• Methods such as n-fold cross-validation discussed in Chapter 3 are useful techniques for this purpose. Often the
data used is more important than the algorithm choice; and the better the features used, the greater the
performance of the model.
• The evaluation metrics discussed can be found in the metrics package for R and scikit-learn for Python.
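• As a hedged sketch of n-fold cross-validation with scikit-learn, using an illustrative logistic regression on the bundled breast cancer dataset (the model choice and fold count are assumptions for demonstration):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Each of the 5 folds is evaluated on data the model was not trained on.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print(scores.mean(), scores.std())
```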
EVALUATION METRICS
• Accuracy is a general metric that does not consider the division between classes.
• Therefore, it does not consider misclassification or the associated penalty with misclassification. For
instance, a medical misdiagnosis that is a false positive (e.g., a patient is diagnosed with breast
cancer when they do not have it) has substantially different consequences compared to a false
negative, whereby a patient is told that they do not have breast cancer when in fact they do. A
confusion matrix breaks down the correct and incorrect classifications made by the model and
attributes them to the appropriate label.
• True positive: Where the actual class is yes, and the value of the predicted class is also yes.
• False positive: Actual class is no, and predicted class is yes
• True negative: The value of the actual class is no, and the value of the predicted class is no
• False negative: When the actual class value is yes, but predicted class is no
• Take an example whereby a model predicts whether a patient has breast cancer or not based on 50
example inputs from the test dataset with an equal distribution between positive and negative labeled
examples.
• The confusion matrix would be as in Table 5-1.
• From the confusion matrix, it is determined that the positive class has greater accuracy than the
negative class.
• The accuracy of the positive classification is 20/25 = 80%.
• The negative class has an accuracy of 10/25 = 40%. Both metrics differ from the overall accuracy of the
model, which would be determined as (20 + 10)/50 = 60%.
• It is apparent how a confusion matrix adds more detail to the overall accuracy of a machine learning
model.
• As a result, accuracy can be rewritten as the following:
• Accuracy = (correctly predicted observations)/(total observations) = (TP + TN)/(TP + TN + FP + FN)
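• A small sketch reproducing this worked example with scikit-learn; the label arrays are constructed to match the counts above (20 true positives, 5 false negatives, 10 true negatives, 15 false positives):

```python
import numpy as np
from sklearn.metrics import confusion_matrix, accuracy_score

# 25 positive and 25 negative cases, arranged to match the worked example.
y_true = np.array([1] * 25 + [0] * 25)
y_pred = np.array([1] * 20 + [0] * 5 + [0] * 10 + [1] * 15)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tp, fp, tn, fn)                  # 20 15 10 5
print(accuracy_score(y_true, y_pred))  # 0.6 = (TP + TN) / total
```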
PER-CLASS ACCURACY
• Per-class accuracy is an extension of accuracy that takes into account the accuracy of each class. As a
result, the preceding example has a per-class accuracy of (80% + 40%)/2 = 60%.
• Per-class accuracy is useful in imbalanced problems where there are a larger number of examples within one particular class compared to another.
• The class with greater examples dominates the calculation, and therefore accuracy alone may not
suffice for the nature of your model; thus it is useful to evaluate per-class accuracy also.
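• As a sketch, scikit-learn's balanced_accuracy_score computes exactly this average of per-class accuracy, shown here on the same illustrative arrays as above:

```python
import numpy as np
from sklearn.metrics import balanced_accuracy_score

y_true = np.array([1] * 25 + [0] * 25)
y_pred = np.array([1] * 20 + [0] * 5 + [0] * 10 + [1] * 15)

# Average of per-class accuracy: (0.8 + 0.4) / 2 = 0.6
print(balanced_accuracy_score(y_true, y_pred))
```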
LOGARITHMIC LOSS
• Logarithmic loss (or log-loss for short) is used for problems where a continuous probability is predicted
rather than a class label.
• Log-loss provides a probabilistic measure of the confidence of the accuracy and considers the entropy
between the distribution of true labels and predictions.
• For a binary classification problem, the logarithmic loss is calculated as:
• Log-loss = −(1/N) Σᵢ [yᵢ log(pᵢ) + (1 − yᵢ) log(1 − pᵢ)]
• where pᵢ is the probability of the ith data point belonging to the positive class and yᵢ is the true label (either 0 or 1).
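• A minimal sketch of computing log-loss with scikit-learn; the labels and predicted probabilities are invented purely for illustration:

```python
import numpy as np
from sklearn.metrics import log_loss

y_true = np.array([1, 1, 0, 0, 1])
# Predicted probabilities of the positive class, not hard labels.
y_prob = np.array([0.9, 0.7, 0.2, 0.4, 0.6])

# Cross-entropy between the true labels and the predicted probabilities.
print(log_loss(y_true, y_prob))
```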
AREA UNDER THE CURVE (AUC)
• The AUC measures the area under a curve that plots the rate of true positives against the rate of false positives.
• The AUC enables the visualization of the sensitivity and specificity of the classifier.
• It highlights how many correct positive classifications can be gained as more false positives are tolerated.
• The curve is known as the receiver operating characteristic curve, or ROC as shown in Figure 5-2.
• A high AUC or greater space underneath the curve is good, and a smaller area under the curve (or less
space under the curve) is undesirable.
• In Figure 5-2, test A has better AUC as compared to test B, as the AUC for test A is larger than for test B.
• The ROC visualizes the trade-off between specificity and sensitivity of the model.
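• A sketch of computing the ROC curve and its AUC with scikit-learn; the logistic regression and breast cancer dataset are illustrative stand-ins for any classifier that outputs probabilities:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
y_prob = model.predict_proba(X_test)[:, 1]        # probability of the positive class

fpr, tpr, thresholds = roc_curve(y_test, y_prob)  # true positive rate vs. false positive rate
print(roc_auc_score(y_test, y_prob))              # area under that ROC curve
```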
PRECISION, RECALL, SPECIFICITY, AND F-MEASURE
• Precision and recall are two metrics used together to evaluate model performance.
• Precision evaluates how many of the items predicted to be relevant (positive) are truly relevant.
• Recall evaluates how many of the truly relevant items the model correctly predicts as relevant.
• Precision: (correctly predicted Positive)/(total predicted Positive) = TP/(TP + FP)
• Recall: (correctly predicted Positive)/(total actual Positive) = TP/(TP + FN)
• Specificity refers to how well the model performs at identifying negative cases and is calculated as in Figure 5-3.
• Specificity: (correctly predicted Negative)/(total Negative observation) = TN/(TN + FP)
• F-measure goes beyond the arithmetic mean and calculates the harmonic mean of precision and recall:
• F1 = 2 × (Precision × Recall)/(Precision + Recall)
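• A sketch of these four metrics in scikit-learn, reusing the illustrative arrays from the confusion matrix example; specificity has no dedicated function, so it is derived from the confusion matrix:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score

y_true = np.array([1] * 25 + [0] * 25)
y_pred = np.array([1] * 20 + [0] * 5 + [0] * 10 + [1] * 15)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(precision_score(y_true, y_pred))  # TP / (TP + FP) = 20 / 35
print(recall_score(y_true, y_pred))     # TP / (TP + FN) = 20 / 25
print(tn / (tn + fp))                   # specificity = TN / (TN + FP) = 10 / 25
print(f1_score(y_true, y_pred))         # harmonic mean of precision and recall
```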
RMSE
• RMSE calculates the square root of the average of the squared differences between predicted and actual values: RMSE = √((1/n) Σᵢ (yᵢ − ŷᵢ)²).
• This can also be understood as being proportional to the Euclidean distance between the vector of true values and the vector of predicted values.
• A criticism of RMSE is that it is sensitive to outliers.
• Percentiles (or quantiles) of error are more robust as a result of being less sensitive to outliers.
• Real-world data is likely to contain outliers, and thus it is often useful to look at the median absolute percentage error rather than the mean absolute percentage error (MAPE).
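• A minimal sketch contrasting RMSE with the median absolute percentage error; the values, including the single outlier, are invented purely for illustration:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

y_true = np.array([100.0, 120.0, 80.0, 95.0, 300.0])  # includes one outlier
y_pred = np.array([102.0, 118.0, 85.0, 90.0, 150.0])

rmse = np.sqrt(mean_squared_error(y_true, y_pred))               # pulled upward by the outlier
median_ape = np.median(np.abs((y_true - y_pred) / y_true)) * 100  # more robust to the outlier

print(rmse, median_ape)
```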
MODEL PARAMETERS AND HYPERPARAMETERS
• Hyperparameters and parameters are often used interchangeably, yet there is a difference between the two. Machine learning models can be understood as mathematical models that represent the relationship between aspects of data.
• Model parameters are properties of the training dataset that are learned and adjusted during training
by the machine learning model.
• Model parameters differ for each model, dataset properties, and the task at hand.
• For instance, in the case of an NLP predictor that outputs the sophistication of a corpus of text,
parameters such as word frequency, sentence length, and noun or verb distribution per sentence would
be considered model parameters.
• Model hyperparameters are parameters to the model building process that are not learned during
training.
• Hyperparameters can make a substantial difference to the performance of a machine learning model.
• Hyperparameters define the model architecture and affect the capacity of the model, influencing model
flexibility.
• Hyperparameters can also be provided to loss optimization algorithms during the training process.
• Optimal setting of hyperparameters can have a significant effect on predictions and help prevent a
model from overfitting.
• Optimal hyperparameters often differ between datasets and models.
• In the case of a neural network, for example, hyperparameters would include the number and size of
hidden layers, weighting, learning rate, and so forth.
• Decision tree hyperparameters would include the desired depth and number of leaves in the tree.
• Hyperparameters for a support vector machine would include a misclassification penalty term.
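• A sketch of how such hyperparameters are fixed before training in scikit-learn; the specific values shown are arbitrary illustrations, not recommendations:

```python
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hyperparameters are chosen before fitting; the model's parameters are learned from data.
tree = DecisionTreeClassifier(max_depth=4, max_leaf_nodes=16)   # tree depth and leaf count
svm = SVC(C=1.0, kernel="rbf")                                  # C penalizes misclassification
net = MLPClassifier(hidden_layer_sizes=(64, 32),                # number and size of hidden layers
                    learning_rate_init=0.001)                   # learning rate
```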
TUNING HYPERPARAMETERS
• Hyperparameter tuning or optimization is the task of selecting a set of optimal hyperparameters for a
machine learning model.
• Optimized hyperparameter values maximize a model’s predictive accuracy.
• Hyperparameters are optimized by repeatedly training a model, assessing the aggregate accuracy, and adjusting the hyperparameters appropriately.
• Through trialing a variety of hyperparameter values, the best hyperparameters for the problem are
determined, which improves overall model accuracy.
HYPERPARAMETER TUNING ALGORITHMS
• The grid search is a simple, effective, yet resource-intensive hyperparameter optimization technique that evaluates a grid of hyperparameter values.
• The method evaluates each point on the grid and determines the best-performing value.
• For example, if the hyperparameter were the number of leaves in a decision tree, which could be
anywhere from n = 2 to 100, grid search would evaluate each value of n (i.e., points on the grid) to
determine the most effective hyperparameter.
• It is often a case of guessing where to start with hyperparameters, including minimum and maximum
values. The approach is typical of trial and error, whereby if the optimal value lies toward either
maximum or minimum, the grid would be expanded in the appropriate direction in an attempt to
further optimize the model’s hyperparameters.
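• A sketch of this leaf-count example using scikit-learn's GridSearchCV; the decision tree, dataset, and 5-fold cross-validation are illustrative assumptions:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Evaluate every point on the grid (n = 2 to 100 leaves) with cross-validation.
param_grid = {"max_leaf_nodes": list(range(2, 101))}
search = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_, search.best_score_)
```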
RANDOM SEARCH
• Random search is a variant of grid search that evaluates a random sample of grid points.
• Computationally, this is far less expensive than a standard grid search.
• Although at first glance it would appear that this is not as useful in finding optimal hyperparameters,
Bergstra et al. demonstrated that in a surprising number of instances, a random search performed
roughly as well as grid search.[65]
• The simplicity and better-than-expected performance of a random search means that it is often chosen
over grid search.
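• A comparable sketch with scikit-learn's RandomizedSearchCV, sampling 20 random points from the same illustrative grid rather than evaluating all of them:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Evaluate only a random sample of the grid points.
param_distributions = {"max_leaf_nodes": list(range(2, 101))}
search = RandomizedSearchCV(DecisionTreeClassifier(random_state=0),
                            param_distributions, n_iter=20, cv=5, random_state=0)
search.fit(X, y)

print(search.best_params_, search.best_score_)
```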
• Both grid search and random search are parallelizable.
• More intelligent hyperparameter tuning algorithms are available that are computationally expensive as
the result of evaluating which samples to try next.
• These algorithms often have hyperparameters of their own.
• Bayesian optimization, random forest smart tuning, and derivative-free optimization are three
examples of such algorithms.
MULTIVARIATE TESTING
• Multivariate testing is an extremely useful method of determining which model is best for the particular
problem at hand.
• Multivariate testing is a form of statistical hypothesis testing and determines the difference between a null hypothesis and an alternative hypothesis.
• The null hypothesis is defined as the new model not affecting the average value of the performance
metric; whereas the alternate hypothesis is that the new model does change the average value of the
performance metric.
• Multivariate testing compares similar models to understand which is performing best or compares a
new model against an older, legacy model.
• The respective performance metrics are compared, and a decision is made on which model to proceed with.
• The process of testing is as follows:
1. Split the population into randomized control and experimentation groups.
2. Record the behavior of the populations on the proposed hypotheses.
3. Compute the performance metrics and associated p-values.
4. Decide on which model to proceed with.
• Although the process seems relatively simple, there are a few key aspects for consideration.
WHICH METRIC SHOULD I USE FOR EVALUATION?
• Choosing the appropriate metric to evaluate your model depends on the use case.
• Consider the impact of false positives, false negatives, and the consequences of such predictions. Furthermore, if a model is attempting to predict an event that only happens 0.001% of the time, a model that never predicts the event can still report an accuracy of 99.999%, which says little about its usefulness. Build the model to cater to the appropriate metrics.
• One approach is to repeat the experiment, thus performing repeat evaluations.
• Although not fail-safe, this reduces the chance of illusory results.
• If there is a genuine difference between the null and alternative hypotheses, repeated evaluation will confirm it.
CORRELATION DOES NOT EQUAL CAUSATION
• The phrase correlation does not equal causation is used to stress that a correlation between two
variables does not suggest that one causes the other.
• Correlation refers to the size and direction of a relationship between two or more variables.
• Causation, also known as cause and effect, emphasizes that the occurrence of one event is related to
the presence of another event.
• It may be tempting to assume that one variable causes the other; however, in models with several
features, there may be hidden factors that cause both variables to move in tandem.
• For instance, smoking tobacco is a cause that increases the risk of developing a variety of cancers.
• Smoking may also be correlated with alcoholism, but it does not cause alcoholism.
WHAT AMOUNT OF CHANGE COUNTS AS
REAL CHANGE?
• Many multivariate tests use the t-test to analyze the statistical difference between means.
• The t value evaluates the size of the difference relative to the variation in your sample data.
• However, the t-test makes assumptions that are not necessarily satisfied by all metrics. For instance,
the t-test assumes both sets have a normal, or Gaussian, distribution.
• If the distribution does not appear to be Gaussian, select a nonparametric test that does not make
assumptions about a Gaussian distribution, such as the Wilcoxon–Mann–Whitney test.
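• A hedged sketch using SciPy's hypothesis tests on synthetic per-day accuracy figures for a control (legacy) and an experimental (new) model; the means, spreads, and sample sizes are invented purely for illustration, and Welch's t-test (mentioned later under data variance) is included for comparison:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical daily accuracy of the control and experimental models.
control = rng.normal(loc=0.82, scale=0.03, size=30)
experiment = rng.normal(loc=0.85, scale=0.03, size=30)

t_stat, p_value = stats.ttest_ind(control, experiment)                    # assumes Gaussian, equal variance
u_stat, p_nonparam = stats.mannwhitneyu(control, experiment)              # no Gaussian assumption
w_stat, p_welch = stats.ttest_ind(control, experiment, equal_var=False)   # Welch's t-test

print(p_value, p_nonparam, p_welch)
```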
DETERMINING THE APPROPRIATE P VALUE
• Statistically speaking, the p value is a calculation used in hypothesis testing that represents the strength of the
evidence.
• The p value measures the statistical significance, or probability, that a difference would arise by chance given
there was no real difference between two populations.
• It provides the evidence against the null hypothesis and is a useful metric for stakeholders to draw conclusions
from.
• A p value lies between 0 and 1, and is interpreted as follows:[66]
• a p value of ≤ 0.05 indicates strong evidence against the null hypothesis, thus rejecting the null hypothesis
• a p value of > 0.05 indicates weak evidence against the null hypothesis, hence maintaining the null
hypothesis
• a p value near 0.05 is considered marginal and could swing either way
• The smaller the p value, the smaller the probability that the results are down to chance.
HOW MANY OBSERVATIONS ARE REQUIRED?
• The quantity of observations required is determined by the statistical power demanded by the project.
Ideally, this should be determined at the beginning of the project.
HOW LONG TO RUN A MULTIVARIATE TEST?
• The duration of time required for your multivariate testing is ideally the amount of time required to
capture enough observations to meet the defined statistical power.
• It is often useful to run tests over time to capture a representative, variable sample.
• When determining the duration of your testing phase, consider the novelty effect, which describes how
user reactions in the short term are not representative of the long-term reactions.
• For instance, whenever Facebook updates their news feed layout or design, there is an uproar. However, this soon subsides once the novelty effect has worn off.
• Therefore, it is useful to run your experiment for long enough to overcome this bias.
• Running multivariate tests for long periods of time is typically not a problem in model optimization.
DATA VARIANCE
• The control and experimentation sets could be biased as the result of not being split at random.
• This may result in biases in the sample data.
• If this is the case, other tests can be used, such as Welch’s t-test, which does not assume equal variance.
SPOTTING DISTRIBUTION DRIFT
• It is key to measure ongoing performance of your machine learning model once deployed.
• Data drift and ongoing system development require the model to be re-confirmed against the baseline.
• Typically, this involves monitoring the offline performance, or validation metric, against data from the
live, deployed model.
• If there is a sizeable change in the validation metric, this highlights the need to revise the model
through training on new data.
• This can be done manually or automated to ensure consistent reporting and confidence in the model.
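• A minimal sketch of such monitoring; the baseline AUC, the tolerance threshold, and the live values are all hypothetical choices for illustration:

```python
# Compare the live validation metric against the offline baseline recorded at deployment.
BASELINE_AUC = 0.95        # validation metric from offline testing (hypothetical)
DRIFT_THRESHOLD = 0.05     # permissible degradation before retraining is triggered (hypothetical)

def check_for_drift(live_auc: float) -> bool:
    """Return True if the live metric has degraded beyond the permitted threshold."""
    return (BASELINE_AUC - live_auc) > DRIFT_THRESHOLD

for live_auc in [0.94, 0.91, 0.88]:
    if check_for_drift(live_auc):
        print(f"AUC {live_auc:.2f}: drift detected, flag model for retraining")
    else:
        print(f"AUC {live_auc:.2f}: within tolerance")
```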
KEEP A NOTE OF MODEL CHANGES
• Keep a log of all changes to your machine learning model with notes on changes.
• Not only does this serve as a change log for stakeholders, it provides a physical record of how the
system has changed over time.
• The use of versioning software within a development environment (test/staging to live deployment) will enable software changes to be noted automatically.
• Versioning software provides a form of technical governance and can be used to deploy software with
extensive rollback and backup facilities.
ETHICS OF AIML IN HEALTHCARE: PRINCIPLES AND
PRACTICES
• 1. Core Ethical Principles in Healthcare AIML
• 1.1 Patient-Centric Care
• Primacy of patient welfare
• Protection of patient autonomy
• Informed consent in AI-assisted decisions
• Balance between automation and human touch
• 1.2 Medical Ethics Integration
• Hippocratic Oath principles in AI systems
• Non-maleficence ("First, do no harm")
• Beneficence (promoting patient well-being)
• Justice in healthcare delivery
• Respect for patient autonomy
• 2. Specific Ethical Challenges
• 2.1 Data Privacy and Security
• Protected Health Information (PHI) handling
• HIPAA compliance in AI systems
• Cross-border data sharing
• Data retention and deletion policies
• Security measures against breaches
• 2.2 Algorithmic Bias and Fairness
• Representative training data
• Demographic bias identification
• Health disparities mitigation
• Equal access to AI-enhanced care
• Cultural competency in AI systems
• 2.3 Transparency and Explainability
• Understanding AI diagnostic recommendations
• Clear communication of AI limitations
• Right to explanation for patients
• Documentation of AI decision processes
• Auditability of AI systems
• 3. Clinical Implementation Considerations
• 3.1 Clinical Validation
• Rigorous testing protocols
• Real-world performance monitoring
• Comparison with standard care
• Population-specific validation
• Continuous performance evaluation
• 3.2 Integration with Clinical Workflow
• Healthcare provider training
• Human oversight mechanisms
• Emergency override procedures
• Integration with existing systems
• Documentation requirements
• 3.3 Quality Assurance
• Regular system audits
• Performance metrics tracking
• Error reporting mechanisms
• Update and maintenance protocols
• Safety monitoring systems
• 4. Stakeholder Responsibilities
• 4.1 Healthcare Providers
• Understanding AI capabilities and limitations
• Maintaining clinical judgment
• Proper communication with patients
• Documentation of AI use
• Continuing education on AI systems
• 4.2 Healthcare Organizations
• Ethical guidelines development
• Staff training programs
• Risk management protocols
• Quality assurance systems
• Patient education initiatives
• 4.3 AI Developers
• Clinical collaboration
• Ethical design principles
• Transparent development
• Regular updates and maintenance
• Response to feedback
• 5. Regulatory and Legal Considerations
• 5.1 Compliance Requirements
• FDA regulations
• HIPAA compliance
• International standards
• State-specific requirements
• Industry best practices
• 5.2 Liability and Responsibility
• Error attribution
• Malpractice considerations
• Documentation requirements
• Insurance implications
• Risk management
• 6. Specific Use Case Ethics
• 6.1 Diagnostic Systems
• Accuracy requirements
• False positive/negative management
• Integration with clinical judgment
• Patient communication
• Result verification protocols
• 6.2 Treatment Planning
• Personalization vs. standardization
• Cost-effectiveness considerations
• Patient preference integration
• Alternative options presentation
• Outcome monitoring
• 6.3 Predictive Analytics
• Risk communication
• Preventive interventions
• Patient autonomy
• Resource allocation
• Follow-up protocols
• 7. Future Considerations
• 7.1 Emerging Technologies
• Integration of new AI capabilities
• Adaptation of ethical frameworks
• Evolution of standards
• Novel use cases
• Technological limitations
• 7.2 Policy Development
• Regulatory updates
• International harmonization
• Industry standards
• Professional guidelines
• Public policy recommendations
• 8. Best Practices Recommendations
• 8.1 Implementation Guidelines
• Phased deployment approach
• Stakeholder engagement
• Training requirements
• Monitoring systems
• Review processes
• 8.2 Ethical Safeguards
• Ethics committee oversight
• Regular audits
• Patient feedback mechanisms
• Incident reporting
• Continuous improvement
• 9. Conclusion
• The ethical implementation of AIML in healthcare requires careful balance between innovation and safety, with
constant attention to patient welfare, privacy, fairness, and transparency. Success depends on collaborative
effort between healthcare providers, organizations, developers, and regulators.