AI Medical Diagnosis Week 02

The document discusses calculating loss values for different predictions in a binary classification model. It defines positive and negative weights based on the distribution of labels in the data. It then calculates the loss when the model predicts constant values such as 0.9 or 0.1, and also calculates the loss for each class individually. The document explores summing loss terms and printing loss values with different axis arguments in NumPy. Finally, it discusses checking for patient overlap between the training and validation datasets.

print(f"When the model 1 always predicts 0.

9, the regular # Calculate and print out the second term in the loss func on,
loss is {loss_reg_1:.4f}") which we're calling 'loss_neg'
print(f"When the model 2 always predicts 0.1, the regular loss_2_neg = -1 * np.sum(w_n * (1 - y_true) * np.log(1 -
loss is {loss_reg_2:.4f}") y_pred_2))
# calculate the posi ve weight as the frac on of nega ve print(f"loss_2_neg: {loss_2_neg:.4f}")
labels # Sum posi ve and nega ve losses to calculate total loss when
w_p = 1/4 the predic on is y_pred_2
loss_2 = loss_2_pos + loss_2_neg
# calculate the nega ve weight as the frac on of posi ve print(f"loss_2: {loss_2:.4f}")
labels print(f"When the model always predicts 0.9, the total loss is
w_n = 3/4 {loss_1:.4f}")
print(f"When the model always predicts 0.1, the total loss is
print(f"posi ve weight w_p: {w_p}") {loss_2:.4f}")
print(f"nega ve weight w_n {w_n}") print(f"loss_1_pos: {loss_1_pos:.4f} \t loss_1_neg:
# Calculate and print out the first term in the loss func on, {loss_1_neg:.4f}")
which we are calling 'loss_pos' print()
loss_1_pos = -1 * np.sum(w_p * y_true * np.log(y_pred_1 print(f"loss_2_pos: {loss_2_pos:.4f} \t loss_2_neg:
)) {loss_2_neg:.4f}")
print(f"loss_1_pos: {loss_1_pos:.4f}") # View the labels (true values) that you will prac ce with
# Calculate and print out the second term in the loss y_true = np.array(
func on, which we're calling 'loss_neg' [[1,0],
loss_1_neg = -1 * np.sum(w_n * (1 - y_true) * np.log(1 - [1,0],
y_pred_1 )) [1,0],
print(f"loss_1_neg: {loss_1_neg:.4f}") [1,0],
# Sum posi ve and nega ve losses to calculate total loss [0,1]
loss_1 = loss_1_pos + loss_1_neg ])
print(f"loss_1: {loss_1:.4f}") y_true
# Calculate and print out the first term in the loss func on, # See what happens when you set axis=0
which we are calling 'loss_pos' print(f"using axis = 0 {np.sum(y_true,axis=0)}")
loss_2_pos = -1 * np.sum(w_p * y_true * np.log(y_pred_2))
print(f"loss_2_pos: {loss_2_pos:.4f}") # Compare this to what happens when you set axis=1
print(f"using axis = 1 {np.sum(y_true,axis=1)}")

# Set the positive weights as the fraction of negative labels (0) for each class (each column)
w_p = np.sum(y_true == 0, axis=0) / y_true.shape[0]
w_p

# Set the negative weights as the fraction of positive labels (1) for each class
w_n = np.sum(y_true == 1, axis=0) / y_true.shape[0]
w_n

# Set model predictions where all predictions are the same
y_pred = np.ones(y_true.shape)
y_pred[:, 0] = 0.3 * y_pred[:, 0]
y_pred[:, 1] = 0.7 * y_pred[:, 1]
y_pred

# Print and view column zero of the positive weight, labels, and predictions
print(f"w_p[0]: {w_p[0]}")
print(f"y_true[:,0]: {y_true[:,0]}")
print(f"y_pred[:,0]: {y_pred[:,0]}")

# Calculate the loss from the positive predictions, for class 0
loss_0_pos = -1 * np.sum(w_p[0] *
                         y_true[:, 0] *
                         np.log(y_pred[:, 0])
                        )
print(f"loss_0_pos: {loss_0_pos:.4f}")

# Print and view column zero of the negative weight
print(f"w_n[0]: {w_n[0]}")
print(f"y_true[:,0]: {y_true[:,0]}")
print(f"y_pred[:,0]: {y_pred[:,0]}")

# Calculate the loss from the negative predictions, for class 0
loss_0_neg = -1 * np.sum(w_n[0] *
                         (1 - y_true[:, 0]) *
                         np.log(1 - y_pred[:, 0])
                        )
print(f"loss_0_neg: {loss_0_neg:.4f}")

# Add the two loss terms to get the total loss for class 0
loss_0 = loss_0_neg + loss_0_pos
print(f"loss_0: {loss_0:.4f}")

# Exercise: calculate the loss from the positive predictions, for class 1
loss_1_pos = None
# Exercise: calculate the loss from the negative predictions, for class 1
loss_1_neg = None
# Exercise: add the two loss terms to get the total loss for class 1
loss_1 = None

# Solution: calculate the loss from the positive predictions, for class 1
loss_1_pos = -1 * np.sum(w_p[1] * y_true[:, 1] * np.log(y_pred[:, 1]))
print(f"loss_1_pos: {loss_1_pos:.4f}")

# Solution: calculate the loss from the negative predictions, for class 1
loss_1_neg = -1 * np.sum(w_n[1] * (1 - y_true[:, 1]) * np.log(1 - y_pred[:, 1]))
print(f"loss_1_neg: {loss_1_neg:.4f}")

# Solution: add the two loss terms to get the total loss for class 1
loss_1 = loss_1_neg + loss_1_pos
print(f"loss_1: {loss_1:.4f}")
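The class-0 and class-1 computations follow an identical pattern, so they generalize to any number of classes. Below is a minimal sketch (not part of the original lab; the function name is hypothetical) that reuses the w_p, w_n, y_true, and y_pred arrays defined above:

import numpy as np

def class_weighted_losses(y_true, y_pred, w_p, w_n):
    # Compute the weighted loss for each class (each column) separately
    losses = []
    for c in range(y_true.shape[1]):
        loss_pos = -1 * np.sum(w_p[c] * y_true[:, c] * np.log(y_pred[:, c]))
        loss_neg = -1 * np.sum(w_n[c] * (1 - y_true[:, c]) * np.log(1 - y_pred[:, c]))
        losses.append(loss_pos + loss_neg)
    return losses

# Should reproduce loss_0 and loss_1 computed above
print(class_weighted_losses(y_true, y_pred, w_p, w_n))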

# Import DenseNet from Keras
from keras.applications.densenet import DenseNet121
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model
from keras import backend as K
Using TensorFlow backend.

# Create the base pre-trained model
base_model = DenseNet121(weights='./nih/densenet.hdf5', include_top=False)

# Print the model summary
base_model.summary()

# Print out the first five layers
layers_l = base_model.layers
print("First 5 layers")
layers_l[0:5]

# Print out the last five layers
print("Last 5 layers")
layers_l[-6:-1]

# Get the convolutional layers and print the first 5
conv2D_layers = [layer for layer in base_model.layers
                 if str(type(layer)).find('Conv2D') > -1]
print("The first five conv2D layers")
conv2D_layers[0:5]

# Print out the total number of convolutional layers
print(f"There are {len(conv2D_layers)} convolutional layers")

# Print the number of channels in the input
print("The input has 3 channels")
base_model.input

# Print the number of output channels
print("The output has 1024 channels")
x = base_model.output
x

# Add a global spatial average pooling layer
x_pool = GlobalAveragePooling2D()(x)
x_pool

# Define a set of five class labels to use as an example
labels = ['Emphysema',
          'Hernia',
          'Mass',
          'Pneumonia',
          'Edema']
n_classes = len(labels)
print(f"In this example, you want your model to identify {n_classes} classes")

# Add a logistic layer the same size as the number of classes you're trying to predict
predictions = Dense(n_classes, activation="sigmoid")(x_pool)
print(f"Predictions have {n_classes} units, one for each class")
predictions

# Create an updated model
model = Model(inputs=base_model.input, outputs=predictions)

# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy')
# (You'll customize the loss function in the assignment!)
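The compile step above uses the stock categorical cross-entropy as a placeholder. As a hedged, minimal sketch (not the assignment's official solution), a custom weighted loss could be wired in roughly like this, using the Keras backend K already imported above; the pos_weights/neg_weights arguments and the epsilon term are assumptions:

def get_weighted_loss(pos_weights, neg_weights, epsilon=1e-7):
    # Returns a Keras-compatible loss: weighted binary cross-entropy per class
    def weighted_loss(y_true, y_pred):
        loss = 0.0
        for i in range(len(pos_weights)):
            loss += -K.mean(
                pos_weights[i] * y_true[:, i] * K.log(y_pred[:, i] + epsilon) +
                neg_weights[i] * (1 - y_true[:, i]) * K.log(1 - y_pred[:, i] + epsilon))
        return loss
    return weighted_loss

# Hypothetical usage, assuming per-class weights w_p and w_n computed from the training labels:
# model.compile(optimizer='adam', loss=get_weighted_loss(w_p, w_n))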
Patient Overlap and Data Leakage
# Import necessary packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import os
import seaborn as sns
sns.set()

# Read csv file containing training data
train_df = pd.read_csv("nih/train-small.csv")
# Print first 5 rows
print(f'There are {train_df.shape[0]} rows and {train_df.shape[1]} columns in the training dataframe')
train_df.head()

# Read csv file containing validation data
valid_df = pd.read_csv("nih/valid-small.csv")
# Print first 5 rows
print(f'There are {valid_df.shape[0]} rows and {valid_df.shape[1]} columns in the validation dataframe')
valid_df.head()
# Extract patient id's for the training set
ids_train = train_df.PatientId.values
# Extract patient id's for the validation set
ids_valid = valid_df.PatientId.values
# Create a "set" data structure of the training set id's to identify unique id's
ids_train_set = set(ids_train)
print(f'There are {len(ids_train_set)} unique Patient IDs in the training set')
# Create a "set" data structure of the validation set id's to identify unique id's
ids_valid_set = set(ids_valid)
print(f'There are {len(ids_valid_set)} unique Patient IDs in the validation set')
# Identify patient overlap by looking at the intersection between the sets
patient_overlap = list(ids_train_set.intersection(ids_valid_set))
n_overlap = len(patient_overlap)
print(f'There are {n_overlap} Patient IDs in both the training and validation sets')
print('')
print(f'These patients are in both the training and validation datasets:')
print(f'{patient_overlap}')

train_overlap_idxs = []
valid_overlap_idxs = []
for idx in range(n_overlap):
    train_overlap_idxs.extend(train_df.index[train_df['PatientId'] == patient_overlap[idx]].tolist())
    valid_overlap_idxs.extend(valid_df.index[valid_df['PatientId'] == patient_overlap[idx]].tolist())

print(f'These are the indices of overlapping patients in the training set: ')
print(f'{train_overlap_idxs}')
print(f'These are the indices of overlapping patients in the validation set: ')
print(f'{valid_overlap_idxs}')

# Drop the overlapping rows from the validation set
valid_df.drop(valid_overlap_idxs, inplace=True)

# Extract patient id's for the validation set
ids_valid = valid_df.PatientId.values
# Create a "set" data structure of the validation set id's to identify unique id's
ids_valid_set = set(ids_valid)
print(f'There are {len(ids_valid_set)} unique Patient IDs in the validation set')
# Identify patient overlap by looking at the intersection between the sets
patient_overlap = list(ids_train_set.intersection(ids_valid_set))
n_overlap = len(patient_overlap)
print(f'There are {n_overlap} Patient IDs in both the training and validation sets')
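The find-then-drop steps above can be bundled into one reusable helper. Below is a minimal sketch (not part of the original lab; the function name and the id_col parameter are assumptions) that removes any validation rows whose patient also appears in the training set:

import pandas as pd

def remove_patient_overlap(train_df, valid_df, id_col='PatientId'):
    # Find patient ids present in both splits
    overlap = set(train_df[id_col]) & set(valid_df[id_col])
    print(f'Removing {len(overlap)} overlapping Patient IDs from the validation set')
    # Keep only validation rows whose patient does not leak from training
    return valid_df[~valid_df[id_col].isin(overlap)].reset_index(drop=True)

# Hypothetical usage:
# valid_df = remove_patient_overlap(train_df, valid_df)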
