Sumati
Tokenization:
Breaking text into words or sentences. Example: "I love NLP" → ["I", "love", "NLP"]
Stemming:
Reduces words to a root form by chopping off suffixes. Example: "running", "runs" → "run"
(an irregular form such as "ran" is left unchanged, because stemming only trims word endings).
Lemmatization:
Converts words to their dictionary form. Example: "better" → "good" (Lemmatization considers
meaning, unlike stemming.)
Text Vectorization:
Converting text into numerical format so that a computer can process it. Example: "I love NLP" →
[0, 1, 2] (using word indexing: "I" = 0, "love" = 1, "NLP" = 2)
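The word-indexing idea can be sketched in plain Python as follows. This is a minimal illustration only: the helper names build_vocabulary and encode are hypothetical, and the whitespace split stands in for a real tokenizer.

```python
def build_vocabulary(tokens):
    """Map each distinct token to an integer index, in order of first appearance."""
    vocabulary = {}
    for token in tokens:
        if token not in vocabulary:
            vocabulary[token] = len(vocabulary)
    return vocabulary


def encode(tokens, vocabulary):
    """Replace every token with its index from the vocabulary."""
    return [vocabulary[token] for token in tokens]


# Simple whitespace tokenization, used here only for illustration
tokens = "I love NLP".split()
vocabulary = build_vocabulary(tokens)
indices = encode(tokens, vocabulary)

print(vocabulary)  # {'I': 0, 'love': 1, 'NLP': 2}
print(indices)     # [0, 1, 2]
```

Each distinct word gets exactly one index, so a three-word sentence with no repeats becomes three numbers.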
import nltk
from nltk.tokenize import word_tokenize, sent_tokenize
from nltk.stem import PorterStemmer
from nltk.corpus import wordnet
from nltk.stem import WordNetLemmatizer
from sklearn.feature_extraction.text import CountVectorizer
# Sample text
text = "I love NLP. NLP is amazing! Running, runs, and ran are different forms of run."
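# Tokenize first so that the stemmer has word tokens to work on
word_tokens = word_tokenize(text)
# 2️⃣ Stemming (Reducing words to their root form)
stemmer = PorterStemmer()
stemmed_words = [stemmer.stem(word) for word in word_tokens]
print("🔹 Stemming:", stemmed_words)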
🔹 Stemming: ['i', 'love', 'nlp', '.', 'nlp', 'is', 'amaz', '!', 'run',
',', 'run', ',', 'and', 'ran', 'are', 'differ', 'form', 'of', 'run',
'.']
Explanation
Text Vectorization: Uses CountVectorizer from Scikit-learn to convert words into numerical
representation.
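The counting idea behind this step can be sketched without any library. The block below is a hand-written stand-in, not CountVectorizer itself: the function name bag_of_words and the two sample documents are assumptions made for illustration.

```python
from collections import Counter


def bag_of_words(documents):
    """Return (vocabulary, count_matrix) for a list of already-tokenized documents."""
    # One column per distinct token, in alphabetical order
    vocabulary = sorted({token for doc in documents for token in doc})
    matrix = []
    for doc in documents:
        counts = Counter(doc)
        # One row per document: how often each vocabulary word occurs in it
        matrix.append([counts[token] for token in vocabulary])
    return vocabulary, matrix


documents = [
    ["i", "love", "nlp"],
    ["nlp", "is", "amazing"],
]
vocabulary, matrix = bag_of_words(documents)

print(vocabulary)  # ['amazing', 'i', 'is', 'love', 'nlp']
print(matrix)      # [[0, 1, 0, 1, 1], [1, 0, 1, 0, 1]]
```

Note that scikit-learn's CountVectorizer, with its default settings, ignores one-character tokens such as "i", so its vocabulary for the same text would be slightly smaller than the one produced here.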
24-3-2025
import nltk
from nltk.tokenize import word_tokenize, sent_tokenize
from nltk.stem import PorterStemmer
from nltk.corpus import wordnet
from nltk.stem import WordNetLemmatizer
from sklearn.feature_extraction.text import CountVectorizer
# Sample text
text = "I love NLP. NLP is amazing! Running, runs, and ran are different forms of run."
# 1️⃣ Tokenization (Breaking text into words and sentences)
word_tokens = word_tokenize(text)
sent_tokens = sent_tokenize(text)
🔹 Stemming: ['i', 'love', 'nlp', '.', 'nlp', 'is', 'amaz', '!', 'run',
',', 'run', ',', 'and', 'ran', 'are', 'differ', 'form', 'of', 'run',
'.']
Explanation
Stemming: Uses PorterStemmer to convert words to their base form (e.g., "running" → "run").
Text Vectorization: Uses CountVectorizer from Scikit-learn to convert words into numerical
representation.
A CNN is a type of deep learning model that can recognize patterns in images.
It is used in image classification, such as identifying whether a picture contains a cat or a dog.
2. Recognize Patterns (e.g., understand object parts like eyes, ears, and tails)
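How a CNN picks up a pattern can be illustrated with a single filter slid over a tiny image. This is a sketch under stated assumptions: the 4x4 image and the vertical-edge filter are hand-picked toy values (a real CNN learns its filter values during training), and convolve2d is a hypothetical helper, not a library function.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kernel_h):
                for j in range(kernel_w):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Toy 4x4 "image": dark (0) on the left, bright (1) on the right
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked filter that responds where brightness increases from left to right
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The large values in the middle column mark where the dark-to-bright edge sits. Deeper layers combine many such responses into larger patterns such as eyes, ears, and tails.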
import numpy as np

# Filter the dataset to include only cats (class 3) and dogs (class 5)
def filter_cats_dogs(images, labels):
    # Row indices of the samples labelled cat (3) or dog (5)
    cat_dog_indices = np.where((labels == 3) | (labels == 5))[0]
    filtered_images = images[cat_dog_indices]
    filtered_labels = labels[cat_dog_indices]
    # Convert labels to binary: cat = 0, dog = 1
    filtered_labels = np.where(filtered_labels == 3, 0, 1)
    return filtered_images, filtered_labels
plt.imshow(img, cmap=plt.cm.binary)
# Make predictions
predictions = model.predict(test_images)
/usr/local/lib/python3.11/dist-packages/keras/src/layers/convolutional/base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
Dataset Filtering:
The CIFAR-10 dataset contains 10 classes. We filter it to include only cats (class 3) and dogs
(class 5).
Model Output:
The output layer uses sigmoid activation for binary classification (cat or dog).
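A sigmoid output can be turned into a cat-or-dog decision as sketched below. The 0.5 threshold and the raw scores are assumptions chosen for illustration; the label encoding (cat = 0, dog = 1) follows the filtering code above.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def classify(raw_score, threshold=0.5):
    """Return (probability of 'dog', predicted label) for one raw model score."""
    probability_dog = sigmoid(raw_score)
    label = 1 if probability_dog >= threshold else 0
    return probability_dog, label


CLASS_NAMES = ["cat", "dog"]  # cat = 0, dog = 1

# Illustrative raw scores, not taken from a trained model
for raw_score in (-2.0, 0.0, 3.0):
    probability, label = classify(raw_score)
    print(f"score={raw_score:+.1f} -> P(dog)={probability:.3f} -> {CLASS_NAMES[label]}")
```

Because the sigmoid gives a single probability, one output neuron is enough for a two-class problem: values near 0 mean "cat" and values near 1 mean "dog".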
Visualization:
The plot_image function displays the image along with the predicted and true labels.
If the prediction is correct, the label is displayed in blue; otherwise, it's displayed in red.
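The colour rule can be sketched as a small helper. The body of plot_image is not shown in these notes, so label_style below is a hypothetical stand-in that only reproduces the described behaviour.

```python
CLASS_NAMES = ["cat", "dog"]  # cat = 0, dog = 1


def label_style(predicted_label, true_label):
    """Return the caption text and its colour: blue if correct, red otherwise."""
    color = "blue" if predicted_label == true_label else "red"
    caption = f"Predicted: {CLASS_NAMES[predicted_label]} (True: {CLASS_NAMES[true_label]})"
    return caption, color


print(label_style(1, 1))  # ('Predicted: dog (True: dog)', 'blue')
print(label_style(0, 1))  # ('Predicted: cat (True: dog)', 'red')
```

With matplotlib, the returned pair could be passed on as plt.xlabel(caption, color=color) after plt.imshow has drawn the image.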
What is AI Ethics?
Artificial Intelligence (AI) is a powerful technology used in robots, self-driving cars, chatbots, and
facial recognition systems. However, AI must be used fairly, safely, and responsibly. AI Ethics
ensures that AI systems:
What is Bias?
Bias happens when AI makes unfair decisions because of incorrect, incomplete, or unbalanced
data.
Examples of Bias in AI
1. Facial Recognition Bias – AI may recognize lighter-skinned faces more accurately than darker-skinned faces if it was trained on mostly lighter-skinned images.
2. Job Hiring Bias – If an AI system is trained mostly on male candidates' resumes, it might favor men over women in job selection.
1. AI learns from data – If the data is not diverse, the AI can be biased.
2. AI models reflect human choices – If humans create biased rules, AI will follow them.
3. AI can amplify discrimination – If AI is not tested properly, it might make unfair decisions.
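One simple way to test a system for unfair decisions is to compare how often each group is selected. The sketch below uses made-up hiring decisions. The function name selection_rates and the data are assumptions, and a gap between groups is only a warning sign, not proof of discrimination.

```python
def selection_rates(decisions, groups):
    """Return the fraction of positive decisions (1 = selected) for each group."""
    totals, selected = {}, {}
    for decision, group in zip(decisions, groups):
        totals[group] = totals.get(group, 0) + 1
        selected[group] = selected.get(group, 0) + decision
    return {group: selected[group] / totals[group] for group in totals}


# Hypothetical AI hiring decisions (1 = hired, 0 = not hired)
decisions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["men", "men", "men", "men", "women", "women", "women", "women"]

rates = selection_rates(decisions, groups)
gap = max(rates.values()) - min(rates.values())

print(rates)  # {'men': 0.75, 'women': 0.25}
print(gap)    # 0.5
```

A gap of 0 would mean both groups are selected at the same rate. The larger the gap, the more the decisions deserve a closer look.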
To ensure fair and responsible AI, developers must follow AI ethics principles:
Example: AI should not approve loans only for rich people; it should assess all applicants fairly.
Avoiding Harm
1. Use diverse training data – Ensure AI learns from all types of people and situations.
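Before training, the make-up of the data can be checked with a short count. In this sketch the group labels and the 30% cut-off are assumptions chosen for illustration, and group_shares is a hypothetical helper.

```python
from collections import Counter


def group_shares(group_labels):
    """Return each group's share of the training examples."""
    counts = Counter(group_labels)
    total = len(group_labels)
    return {group: count / total for group, count in counts.items()}


# Hypothetical group label of each training image
training_groups = ["lighter"] * 8 + ["darker"] * 2

shares = group_shares(training_groups)
# Flag any group that makes up less than 30% of the data (arbitrary cut-off)
underrepresented = [group for group, share in shares.items() if share < 0.3]

print(shares)            # {'lighter': 0.8, 'darker': 0.2}
print(underrepresented)  # ['darker']
```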
AI's Decision: [0 1]
EXPLANATION
In the Python code for AI bias detection, we trained a simple logistic regression model to decide
whether a candidate should be hired based on income levels.
We provided the model with training data where higher-income individuals were more likely to
be hired.
0 (Not Hired) means the AI does not select the person for the job.
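The experiment described above can be reproduced in outline with a hand-written logistic regression. This is a sketch only: the original code is not shown in these notes, so the income figures, the training settings, and the helper names below are all assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, learning_rate=0.1, epochs=2000):
    """Fit a one-feature logistic regression with plain gradient descent."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(weight * x + bias) - y
            grad_w += error * x
            grad_b += error
        weight -= learning_rate * grad_w / n
        bias -= learning_rate * grad_b / n
    return weight, bias


# Hypothetical training data: income in thousands, 1 = hired, 0 = not hired
incomes = [20, 25, 30, 35, 60, 70, 80, 90]
hired = [0, 0, 0, 0, 1, 1, 1, 1]

# Standardize income so that gradient descent behaves well
mean = sum(incomes) / len(incomes)
std = (sum((x - mean) ** 2 for x in incomes) / len(incomes)) ** 0.5
weight, bias = train_logistic([(x - mean) / std for x in incomes], hired)


def predict(income):
    """Return 1 (Hired) or 0 (Not Hired) for one candidate's income."""
    z = weight * (income - mean) / std + bias
    return 1 if sigmoid(z) >= 0.5 else 0


decisions = [predict(22), predict(85)]
print("AI's Decision:", decisions)  # AI's Decision: [0, 1]
```

The model rejects the low-income candidate and accepts the high-income one even though income says nothing about ability. It has simply copied the pattern in its biased training data.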
END OF UNIT V