
VISAKHA INSTITUTE OF ENGINEERING & TECHNOLOGY

COMPUTER SCIENCE AND ENGINEERING

Name: ALAVALAPATI SRIRAM
Year: IV B.Tech
Semester: II
Roll No: 22NT5A0502
Internship: AI & ML

AN INTERNSHIP PROJECT REPORT ON

AI & ML

Carried out by EXCELR

(2022-2025)

VISAKHA INSTITUTE OF ENGINEERING AND TECHNOLOGY

(Approved by AICTE New Delhi & Recognized by JNTU-GV, Vizianagaram)
88th Division, Narava, GVMC, Visakhapatnam - 530027

16-WEEK INTERNSHIP REPORT 2025

INTERNSHIP PERIOD DURING 2025


CERTIFICATE OF INTERNSHIP

This is to certify that the internship report titled "AI & ML", submitted by
ALAVALAPATI SRIRAM (Regd. No: 22NT5A0502), is work done by him/her and
submitted during 2024-2025 for 16 weeks in this academic year, in partial
fulfilment of the requirements for the award of the degree of BACHELOR OF
TECHNOLOGY in COMPUTER SCIENCE AND ENGINEERING, carried out at EXCELR, APSCHE.

K. VIJAY                                    Dr. A.S.C. TEJASWINI KONE
Department Internship Coordinator           Head of the Department
Department of CSE                           Department of CSE

EXTERNAL SIGNATURE
CERTIFICATE FROM EXCELR ORGANIZATION
DECLARATION

We hereby declare that the work entitled "Internship Program 2025",
submitted towards completion of training in the 4th year of B.Tech (CSE) at
EXCELR, comprises our original work pursued under the guidance of the
DEPARTMENT OF CSE.
ACKNOWLEDGMENT

A project is a golden opportunity for learning and self-development. I
consider myself very lucky and honoured to have had so many wonderful people
lead me through the completion of this project.

Our grateful thanks to the DEPARTMENT OF CSE who, in spite of being
extraordinarily busy with their duties, took time out to hear, guide and keep
us on the correct path. A humble 'Thank you'.

We would like to thank Mrs. A.S.C. Tejaswini Kone, Head of the Department,
for all the help rendered. Thank you, dear Madam, for your efforts and the
help provided to me to get such an excellent opportunity. Last but not least,
there were so many others who shared valuable information that helped in the
successful completion of this project.

ALAVALAPATI SRIRAM
22NT5A0502
AI & ML 16-WEEK INTERNSHIP

EXCELR
ABSTRACT

Title: Artificial Intelligence and Machine Learning Internship – Learning, Implementation & Innovation

Abstract: This internship focused on acquiring practical knowledge and hands-on experience in the domains of Artificial Intelligence (AI) and Machine Learning (ML). Over
a duration of 16 weeks, I explored key concepts including supervised and unsupervised
learning, data preprocessing, model evaluation, and deep learning techniques using
Python and libraries like NumPy, Pandas, Scikit-learn, TensorFlow, and Keras.

The internship began with foundational programming in Python, followed by statistical analysis and data visualization for understanding datasets. Machine learning algorithms
such as linear regression, decision trees, random forests, SVM, and KNN were studied
and implemented on real-world datasets. Unsupervised learning models like K-means
clustering and PCA were also explored.

Natural Language Processing (NLP) and Neural Networks were introduced in the later
stages, leading to practical applications such as sentiment analysis and image
classification using Convolutional Neural Networks (CNNs). The program culminated
with a capstone project involving model building, evaluation, and deployment strategies.

This internship significantly enhanced my theoretical understanding and coding skills in AI/ML and prepared me for real-world challenges in intelligent system development and
data-driven decision making.

ACTIVITY LOG FOR THE 1st WEEK

Week-1
Day-1: INTRODUCTION TO PYTHON AND IDE SETUP
Day-2: VARIABLES, DATA TYPES AND OPERATORS
Day-3: CONDITIONAL STATEMENTS AND LOOPS
Day-4: FUNCTIONS AND LAMBDA FUNCTIONS
Day-5: OBJECT ORIENTED PROGRAMMING BASICS

1st WEEK REPORT

Week 1 Internship Report: Python Fundamentals

During the first week of my AI & ML internship, I was introduced to the foundational
programming skills required for data science and machine learning — primarily through
Python. The focus was on building a strong base in Python syntax, data types, and
control structures. We began by setting up our development environment using tools
like Jupyter Notebook and VS Code, and then quickly progressed into writing simple
Python programs.

Each day was structured to cover specific core concepts. On Day 1, we explored basic
syntax, variables, and data types such as integers, strings, and lists. Day 2 introduced
conditional statements and loops, which are essential for writing logical programs. By
Day 3, we were working with functions, parameters, and return types, helping us write
more modular and reusable code. Day 4 focused on file handling and exception
management, while Day 5 wrapped up with object-oriented programming (OOP),
covering classes, objects, inheritance, and encapsulation.

One of the most engaging aspects of the week was the hands-on coding sessions. We
were encouraged to practice each concept by solving small problems and writing code
snippets. I particularly enjoyed solving logic-building problems that involved loops and
conditionals, as they sharpened my logical thinking and syntax familiarity. The use of
online coding platforms and GitHub for version control was also introduced, setting a
standard for professional development practices.

Additionally, we were given assignments and quizzes that helped reinforce our learning.
These included building a simple calculator using functions, creating a class-based
student grading system using OOP, and debugging code snippets. We also participated
in a group discussion on how Python's simplicity and versatility make it an ideal
language for AI and ML development.
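
One of the assignments was a class-based student grading system; a minimal sketch of that kind of exercise is shown below (the grade boundaries are our own illustrative choice, not the ones used in the assignment):

class Student:
    """Toy class: stores a student's marks and derives a grade."""

    def __init__(self, name, marks):
        self.name = name        # student name
        self.marks = marks      # list of numeric scores

    def average(self):
        return sum(self.marks) / len(self.marks)

    def grade(self):
        avg = self.average()
        if avg >= 75:
            return "A"
        elif avg >= 60:
            return "B"
        return "C"

s = Student("Asha", [82, 74, 91])
print(s.name, round(s.average(), 1), s.grade())   # -> Asha 82.3 A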

In summary, Week 1 was a productive and exciting start to the internship. It laid the
groundwork for more complex topics by ensuring we are comfortable with Python,
which is essential for future machine learning algorithms and data handling. I am
confident that this foundation will be invaluable as we progress through the internship
in the coming weeks.
ACTIVITY LOG FOR THE 2nd WEEK

Week-2
Day-1: NUMPY BASICS
Day-2: NUMPY INDEXING, SLICING, BROADCASTING
Day-3: PANDAS DATAFRAMES AND SERIES
Day-4: DATA CLEANING WITH PANDAS
Day-5: HANDLING MISSING VALUES AND DUPLICATES

Week 2 Internship Report: Data Handling with NumPy and Pandas
The second week of the internship focused on data manipulation using two of Python's most powerful libraries: NumPy and Pandas. These tools form the backbone
of data preprocessing in the machine learning pipeline. The week was dedicated to
understanding how to efficiently work with arrays, handle structured data, clean
datasets, and prepare them for analysis or modeling.

The first part of the week introduced NumPy. We learned about the creation and
manipulation of arrays, vectorized operations, array indexing, slicing, reshaping, and
broadcasting. These skills are crucial for performing mathematical computations on
large datasets. Working with multidimensional arrays helped me understand the
performance benefits of using NumPy over traditional Python lists.
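
A small sketch of the kind of NumPy indexing, slicing, and broadcasting practised during the week (the array values are arbitrary):

import numpy as np

a = np.arange(12).reshape(3, 4)     # 3x4 array holding 0..11
print(a[1, 1:3])                    # slicing: elements 5 and 6
row_means = a.mean(axis=1)          # mean of each row
centered = a - row_means[:, None]   # broadcasting: subtract each row's mean
print(centered.sum(axis=1))         # every row now sums to 0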

In the second half of the week, we moved on to Pandas, which provides high-level data
structures like Series and DataFrames. We were taught how to load data from various
sources (CSV, Excel), explore data, and handle missing values and duplicates. These
lessons were complemented with hands-on exercises involving real-world datasets,
such as cleaning and transforming a sample dataset of customer transactions.

One of the key takeaways from this week was learning how to perform operations such
as filtering, grouping, and aggregating data using Pandas. We also explored common
techniques for handling null values, data imputation, and detecting outliers. Through
assignments and coding challenges, we were able to apply these techniques in
practical scenarios, which enhanced our understanding of real-world data processing.
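
A minimal Pandas sketch of the cleaning and grouping steps described above, on a made-up customer-transactions frame (the column names and values are invented for illustration):

import pandas as pd

df = pd.DataFrame({
    "customer": ["A", "A", "B", "B", "C"],
    "amount":   [120.0, None, 75.5, 75.5, 310.0],
})

df = df.drop_duplicates()                                   # remove duplicate rows
df["amount"] = df["amount"].fillna(df["amount"].median())   # impute missing values
summary = df.groupby("customer")["amount"].agg(["count", "sum", "mean"])
print(summary)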

Overall, Week 2 provided essential skills that form the core of any data science or
machine learning project. Understanding how to structure, clean, and manipulate data
using NumPy and Pandas has given me the confidence to tackle messy datasets and
prepare them for analysis and modeling. I'm looking forward to building on this
knowledge in the coming weeks as we move into data visualization and exploratory
data analysis (EDA).
ACTIVITY LOG FOR THE 3rd WEEK

Week-3
Day-1: INTRODUCTION TO MATPLOTLIB
Day-2: PLOT TYPES: LINE, HISTOGRAM
Day-3: SEABORN FOR ADVANCED PLOTS
Day-4: PAIR PLOT, HEAT MAP, BOX PLOT
Day-5: MINI PROJECT

3rd WEEK REPORT

Week 3 Internship Report: Data Visualization with Matplotlib and Seaborn


Week 3 of the internship focused on one of the most crucial aspects of data science —
data visualization. This week, we were introduced to Matplotlib and Seaborn, two of the
most widely used Python libraries for creating insightful and visually appealing plots.
The primary objective was to learn how to represent data graphically in order to
uncover patterns, trends, and outliers, which are essential for effective data analysis
and storytelling.

The week began with an introduction to Matplotlib, where we explored basic plotting
functions such as line plots, bar graphs, histograms, and scatter plots. We also learned
how to customize plots using titles, labels, legends, colors, and styles to make them
more informative and visually clear. Practical exercises included plotting sales data
over time and analyzing distribution using histograms. This hands-on approach helped
reinforce our understanding of each chart type and its use case.
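
A short Matplotlib sketch in the spirit of the sales-over-time exercise (the figures are invented):

import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr", "May"]
sales = [120, 135, 128, 160, 172]           # illustrative values

plt.plot(months, sales, marker="o", label="Monthly sales")
plt.title("Sales over time")
plt.xlabel("Month")
plt.ylabel("Units sold")
plt.legend()
plt.show()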

In the mid-week sessions, we moved on to Seaborn — a higher-level interface built on top of Matplotlib that makes it easier to create complex visualizations with less code. We learned about Seaborn's built-in themes, color palettes, and functions like sns.pairplot(), sns.heatmap(), sns.boxplot(), and sns.countplot(). These tools allowed
us to perform multi-variable visualizations and correlation analyses. We also practiced
visualizing categorical vs. numerical data, an essential part of EDA.
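
A sketch of the Seaborn calls listed above, run on the library's built-in tips dataset (the dataset choice is ours, not from the report):

import matplotlib.pyplot as plt
import seaborn as sns

tips = sns.load_dataset("tips")             # small built-in example dataset

sns.pairplot(tips, hue="time")              # pairwise relationships by category
plt.show()

sns.heatmap(tips.select_dtypes("number").corr(), annot=True, cmap="coolwarm")
plt.show()

sns.boxplot(data=tips, x="day", y="total_bill")
plt.show()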

On the final day of the week, we applied what we learned in a mini-project involving
exploratory data analysis (EDA). We were given a real-world dataset and asked to
generate visual insights about customer behavior. Using a combination of Matplotlib
and Seaborn, we identified key trends, correlations, and outliers in the data. This project
emphasized how visualizations can support hypothesis generation and guide further
analysis or model-building.

In summary, Week 3 was both insightful and creative. It taught me how to turn raw
numbers into visual stories that can drive decisions. Mastering data visualization has
not only improved my ability to analyze data but also enhanced how I communicate
findings with others. As we move into statistics and probability in Week 4, I feel
well-prepared to interpret and present data in a clear and meaningful way.
ACTIVITY LOG FOR THE 4th WEEK

Week-4
Day-1: DESCRIPTIVE STATISTICS
Day-2: PROBABILITY BASICS AND DISTRIBUTIONS
Day-3: BAYES THEOREM AND CONDITIONAL PROBABILITY
Day-4: STANDARD DEVIATION, VARIANCE, Z-SCORE
Day-5: HANDS ON WITH REAL DATA STATS

4th WEEK REPORT

Week 4 Internship Report: Statistics & Probability for Machine Learning


Week 4 of the internship shifted the focus from coding to the theoretical backbone of
data science — Statistics and Probability. These concepts are vital for understanding
how data behaves and for building robust machine learning models. The week was
structured to help us build intuition around descriptive statistics, probability
distributions, and foundational concepts like Bayes’ Theorem, which are essential for
predictive modeling.

The week started with descriptive statistics, where we explored measures of central
tendency (mean, median, and mode) and dispersion (range, variance, standard
deviation). We applied these concepts using Python libraries such as NumPy and
Pandas to analyze datasets and summarize key characteristics. This provided a clearer
understanding of how data can be quantitatively described, which is crucial for
interpreting patterns before model development.
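
A minimal example of the descriptive statistics mentioned above, computed with Pandas on invented data:

import pandas as pd

scores = pd.Series([56, 61, 73, 73, 80, 92, 95])

print("mean:", scores.mean())
print("median:", scores.median())
print("mode:", scores.mode().tolist())
print("variance:", scores.var())            # sample variance by default
print("std dev:", scores.std())
print("z-scores:", ((scores - scores.mean()) / scores.std()).round(2).tolist())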

Mid-week, we delved into the fundamentals of probability, including independent and dependent events, conditional probability, and joint probability. We learned how to use
probability rules and Venn diagrams to solve real-world problems. One of the highlights
was learning about Bayes’ Theorem — a critical concept in machine learning used for
classification problems, such as spam detection and medical diagnosis.
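
A tiny worked example of Bayes' Theorem in the spirit of the spam-detection discussion (the probabilities are invented for illustration):

# P(spam | word) = P(word | spam) * P(spam) / P(word)
p_spam = 0.2                  # prior probability that a message is spam
p_word_given_spam = 0.6       # the word appears in 60% of spam messages
p_word_given_ham = 0.05       # the word appears in 5% of non-spam messages

p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)
p_spam_given_word = p_word_given_spam * p_spam / p_word
print(round(p_spam_given_word, 3))   # -> 0.75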

We also explored probability distributions such as normal distribution, binomial distribution, and uniform distribution. Understanding the shape and behavior of these
distributions helps in selecting appropriate models and testing assumptions in ML
algorithms. Visualization tools like Seaborn were used to plot distributions and gain a
more intuitive grasp of data distribution.

To wrap up the week, we completed hands-on exercises and a mini case study where
we analyzed a dataset using statistical measures and visualizations. This practical
application solidified our understanding and demonstrated how statistics and
probability serve as a foundation for machine learning algorithms. Overall, Week 4 was
highly informative and prepared us to better understand model evaluation and
prediction in the upcoming weeks.
ACTIVITY LOG WEEK 5

Week-5
Day-1: WHAT IS ML? TYPES OF ML
Day-2: ML LIFE CYCLE AND TOOLS
Day-3: TRAIN-TEST SPLIT AND SCIKIT-LEARN
Day-4: EVALUATION METRICS: ACCURACY, PRECISION
Day-5: LINEAR REGRESSION - THEORY

5th WEEK REPORT

Week 5 Internship Report: Introduction to Machine Learning


Week 5 marked a major milestone in our AI & ML internship as we transitioned from
foundational tools and theory into the core topic: Machine Learning (ML). This week
served as an introduction to the field, providing a high-level overview of different types
of ML algorithms, the ML workflow, and how data is used to train predictive models.
The sessions were a mix of conceptual lectures and practical implementation using
Python's scikit-learn library.

We began the week by understanding what machine learning is and how it differs from
traditional programming. We explored the three main types of ML: supervised learning,
unsupervised learning, and reinforcement learning, with a focus on supervised learning
for now. The concept of a model “learning” from data — by recognizing patterns
without being explicitly programmed — was a key point of discussion. We also covered
real-world applications like fraud detection, recommendation systems, and speech
recognition.

Mid-week, we learned about the machine learning lifecycle, including steps such as
data collection, data preprocessing, model selection, training, evaluation, and
deployment. We were introduced to essential tools such as Jupyter Notebook and the
scikit-learn library, which makes building ML models in Python more accessible. We
practiced using train-test split methods to evaluate model performance and learned
why data partitioning is important to prevent overfitting.
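
A minimal scikit-learn sketch of the train-test split workflow described above; the Iris dataset and logistic regression model are our illustrative choices:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)    # hold out 20% for evaluation

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))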

The week concluded with an in-depth look at the first two algorithms in supervised
learning: Linear Regression and Logistic Regression. Linear Regression was used for
predicting continuous values, such as house prices, while Logistic Regression was
used for classification tasks like predicting whether a customer will churn. We
implemented both models using real-world datasets and evaluated their performance
using metrics like accuracy, mean squared error, and confusion matrices.

Overall, Week 5 provided a strong foundation in understanding what machine learning is and how models are trained and evaluated. It bridged the gap between theoretical
knowledge and practical application, giving us the tools to start building and testing
simple models. I am excited to dive deeper into more advanced algorithms and model
optimization techniques in the weeks ahead.
ACTIVITY LOG WEEK 6

Week-6
Day-1: LINEAR REGRESSION WITH CODE
Day-2: POLYNOMIAL REGRESSION
Day-3: LOGISTIC REGRESSION THEORY
Day-4: LOGISTIC REGRESSION PRACTICAL
Day-5: MINI PROJECT

6th WEEK REPORT

Week 6 Internship Report: Regression Models – Linear, Polynomial, and Logistic Regression

In Week 6 of the AI & ML internship, we focused on understanding and implementing
regression techniques — one of the core components of supervised machine
learning. The week revolved around Linear Regression, Polynomial Regression, and
Logistic Regression, helping us grasp the fundamental concepts behind making
predictions using real-world datasets. We not only learned the mathematical
foundations of these models but also applied them using Python and scikit-learn.

We began the week by revisiting Linear Regression, where we studied how to model
the relationship between a dependent variable and one or more independent
variables. Through hands-on exercises, we implemented simple and multiple linear
regression models. We evaluated their performance using metrics such as Mean
Squared Error (MSE), R-squared, and Residual Plots, which gave us a practical
understanding of model accuracy and goodness-of-fit.

As the week progressed, we moved on to Polynomial Regression — an extension of linear regression that allows us to capture non-linear relationships. By increasing the
degree of the polynomial features, we saw how models could better fit complex
datasets, though we also discussed the trade-offs, particularly overfitting. Using
visualizations, we compared linear vs. polynomial regression fits to see which
performed better depending on the scenario.
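
A short sketch comparing a plain linear fit with a degree-2 polynomial fit, as discussed above (the synthetic data is ours):

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = np.linspace(-3, 3, 60).reshape(-1, 1)
y = 0.5 * X.ravel() ** 2 + rng.normal(scale=0.3, size=60)   # quadratic signal + noise

lin = LinearRegression().fit(X, y)
X_poly = PolynomialFeatures(degree=2).fit_transform(X)
poly = LinearRegression().fit(X_poly, y)

print("linear R^2:  ", round(r2_score(y, lin.predict(X)), 3))
print("degree-2 R^2:", round(r2_score(y, poly.predict(X_poly)), 3))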

Next, we studied Logistic Regression, which, despite its name, is used for
classification tasks. We explored the concept of the sigmoid function and how it
maps predicted values to probabilities. We then applied logistic regression to binary
classification problems such as predicting whether a student will pass or fail based
on study hours. We also introduced evaluation metrics like confusion matrices, precision, and recall.
ACTIVITY LOG WEEK 7

Week-7
Day-1: DECISION TREE CLASSIFIER
Day-2: OVERFITTING & PRUNING
Day-3: RANDOM FOREST INTRO
Day-4: RANDOM FOREST HYPERPARAMETER TUNING
Day-5: MINI PROJECT

7th WEEK REPORT

Week 7 Internship Report: Decision Trees & Random Forest
Week 7 of the internship introduced us to tree-based models, particularly Decision
Trees and Random Forest, which are powerful and intuitive algorithms used in both
classification and regression tasks. This week was especially interesting because it
bridged the gap between simple linear models and more complex ensemble methods,
showing us how machine learning can capture non-linear patterns and make decisions
in a structured, hierarchical way.

The week began with an in-depth explanation of Decision Trees. We explored how they
split data based on features using criteria such as Gini Impurity and Entropy (for
classification) or Mean Squared Error (for regression). Through visualizations and
coding exercises, we saw how the tree “learns” to split the data recursively, forming
a structure that mimics human decision-making. We also discussed overfitting and
how an unpruned tree can perform poorly on unseen data.

To address overfitting, we were introduced to hyperparameters such as max_depth, min_samples_split, and min_samples_leaf, which help control the complexity of the tree. We practiced tuning these parameters using grid search to improve model generalization. We also used visualization tools like graphviz and sklearn's plot_tree
to view the structure of trained models and understand their decision paths more
clearly.
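
A minimal sketch of tuning the tree hyperparameters named above with a grid search (the dataset and parameter grid are illustrative choices):

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

param_grid = {
    "max_depth": [3, 5, None],
    "min_samples_split": [2, 10],
    "min_samples_leaf": [1, 5],
}
search = GridSearchCV(DecisionTreeClassifier(random_state=0),
                      param_grid, cv=5, scoring="accuracy")
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))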

In the latter part of the week, we learned about Random Forest — an ensemble method
that builds multiple decision trees and combines their outputs for better accuracy and
robustness. We studied how bagging and random feature selection during training
helps reduce overfitting and improve performance. We implemented classification and
regression tasks using Random Forest and compared the results with single Decision
Trees to see the improvements in accuracy and stability.

To wrap up, we worked on a mini-project that involved predicting customer churn using
tree-based models. We handled feature selection, model training, tuning, and evaluation
using confusion matrices and classification reports. Overall, Week 7 gave me a deeper
appreciation for how machine learning models can be both interpretable and powerful.
With a better grasp of tree-based algorithms, I feel more equipped to build models for
real-world decision-making tasks.
ACTIVITY LOG WEEK 8

Week-8
Day-1: KNN THEORY
Day-2: KNN WITH SCIKIT-LEARN
Day-3: SVM
Day-4: SVM WITH KERNELS
Day-5: HANDS-ON PRACTICE WITH A UCI DATASET

8th WEEK REPORT

Week 8 Internship Report: Support Vector Machines (SVM)
Week 8 of the AI & ML internship introduced us to one of the most powerful
supervised learning algorithms — Support Vector Machines (SVM). This algorithm
is widely used for classification problems and is known for its ability to handle both
linear and non-linear data. Throughout the week, we explored the theory behind SVM,
its mathematical foundations, and practical implementation using scikit-learn.

We began with the basic concepts of SVM: hyperplanes, support vectors, and
margin maximization. Using visual examples, we learned how SVM tries to find the
optimal boundary (hyperplane) that separates different classes with the maximum
margin. We implemented linear SVM models and applied them to datasets like the
Iris dataset, using different kernel functions such as linear, polynomial, and radial
basis function (RBF).
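
A brief sketch of fitting SVMs with the kernels listed above on the Iris data:

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

for kernel in ["linear", "poly", "rbf"]:
    scores = cross_val_score(SVC(kernel=kernel, C=1.0), X, y, cv=5)
    print(kernel, round(scores.mean(), 3))     # mean 5-fold accuracy per kernel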

A key learning point this week was understanding the role of kernel functions in
handling non-linearly separable data. We experimented with different kernels and
saw how they transformed the feature space to make classification possible. We
also explored important hyperparameters such as C (regularization) and gamma (for
non-linear kernels), and how tuning these values affects model complexity and
performance.

Mid-week, we used visualization tools like decision boundaries and support vector
plots to interpret the model's decision-making process. This helped in
understanding how certain data points (support vectors) influence the position of
the hyperplane. We also covered multi-class classification using SVMs with a one-vs-
rest approach, which broadened our understanding of how SVMs can be applied
beyond binary tasks.

To conclude the week, we worked on a project to classify handwritten digits using the MNIST dataset. This task required data preprocessing, training an SVM model
with appropriate kernels, hyperparameter tuning using GridSearchCV, and evaluating
the model using accuracy, precision, and confusion matrices. Overall, Week 8 was
intellectually stimulating and gave me a solid grasp of SVM, preparing me to handle
more advanced machine learning techniques in the upcoming weeks.
ACTIVITY LOG WEEK 9

Week-9
Day-1: CLUSTERING OVERVIEW
Day-2: K-MEANS CLUSTERING
Day-3: HIERARCHICAL CLUSTERING
Day-4: PCA
Day-5: MINI PROJECT

Week 9 Internship Report: Unsupervised Learning – Clustering (K-Means &
Hierarchical)
Week 9 of the AI & ML internship marked a shift in focus from supervised to
unsupervised learning, particularly clustering techniques. Clustering allows us to
discover patterns or groupings in datasets without predefined labels. This week, we
explored K-Means Clustering and Hierarchical Clustering, two of the most commonly
used methods for segmenting data. These techniques are essential in areas such as
customer segmentation, market analysis, and anomaly detection.

We began with the fundamentals of K-Means Clustering. Through theory and visualization, we understood how the algorithm groups data into k clusters by
minimizing intra-cluster distance and maximizing inter-cluster separation. Using the
scikit-learn library, we implemented K-Means on datasets like customer spending
data and social network usage. We also learned how to determine the optimal
number of clusters using the Elbow Method and Silhouette Score.
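
A compact sketch of fitting K-Means for several cluster counts and checking inertia (for the Elbow Method) and silhouette scores; synthetic blobs stand in for the customer data used in the sessions:

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=4, random_state=42)

for k in range(2, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    print(k, "inertia:", round(km.inertia_, 1),
          "silhouette:", round(silhouette_score(X, km.labels_), 3))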

Mid-week, we explored the limitations of K-Means, such as sensitivity to initial centroid positions and difficulty handling non-spherical clusters. This led us to
Hierarchical Clustering, which builds a tree-like structure (dendrogram) to represent
nested clusters. We practiced using both agglomerative and divisive approaches
and used scipy and seaborn libraries for visualizing the clustering hierarchy. This
method allowed us to analyze the relationships between data points at different
levels of granularity.

One highlight was applying clustering to real-world datasets — such as grouping similar customers based on behavioral data and visualizing those clusters with PCA
(Principal Component Analysis) for dimensionality reduction. This hands-on
experience reinforced the importance of unsupervised learning when labels are
unavailable, and interpretation becomes a key part of the data science process.

In summary, Week 9 gave me a strong foundation in clustering techniques, especially K-Means and Hierarchical Clustering. I now understand how to group
unlabeled data, interpret cluster results, and apply these models to practical
business and research scenarios.
ACTIVITY LOG WEEK 10

Week-10
Day-1: CROSS VALIDATION TECHNIQUES
Day-2: GRID SEARCH CV & RANDOMIZED SEARCH CV
Day-3: BIAS-VARIANCE TRADE-OFF
Day-4: FEATURE SCALING AND FEATURE ENGINEERING
Day-5: MODEL SELECTION PRACTICE

Week 10 Internship Report: Dimensionality Reduction – PCA & t-SNE
Week 10 of the internship centered around Dimensionality Reduction — a critical
process in machine learning and data analysis, especially when dealing with
high-dimensional datasets. The main techniques we studied were Principal
Component Analysis (PCA) and t-Distributed Stochastic Neighbor Embedding
(t-SNE). These methods help simplify datasets while preserving their essential
structures, improving both model performance and interpretability.

The week started with Principal Component Analysis (PCA), a linear technique that
transforms high-dimensional data into fewer dimensions by identifying the
directions (principal components) that maximize variance. We explored how PCA
works mathematically using eigenvalues and eigenvectors, and we used scikit-learn
and NumPy to apply PCA to datasets like the Iris dataset and MNIST digit dataset.
PCA was especially useful for visualizing clusters and patterns in reduced 2D and
3D space.

We then explored how PCA can help remove noise and multicollinearity, making our
models more efficient without significant loss of information. By plotting explained
variance ratios, we learned how to choose the right number of components to retain
most of the dataset's variance. This step is essential before feeding data into
clustering or classification models, especially when datasets have dozens or
hundreds of features.
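
A short PCA sketch on the Iris data in the spirit of the exercises above, including the explained-variance check:

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)    # PCA is sensitive to feature scale

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)
print("explained variance ratio:", pca.explained_variance_ratio_.round(3))
print("first two rows in 2D:", X_2d[:2].round(2))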

Later in the week, we were introduced to t-SNE (t-Distributed Stochastic Neighbor Embedding), a non-linear dimensionality reduction technique that is particularly
useful for visualizing high-dimensional data. We applied t-SNE to complex datasets
like handwritten digits and customer segmentation data. Though more
computationally intensive than PCA, t-SNE provided compelling 2D visualizations
that helped us better understand the underlying structure of the data.
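
A corresponding t-SNE sketch on the scikit-learn digits data (the perplexity value is a common default, not one prescribed in the sessions):

from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)
X_2d = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
print(X_2d.shape)   # (1797, 2) points, ready for a scatter plot coloured by digit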

To conclude the week, we combined dimensionality reduction with clustering. We used PCA and t-SNE to reduce dimensions before applying K-Means or Hierarchical
Clustering and observed how visualization and model performance improved.
ACTIVITY LOG WEEK 11

Week-11
Day-1: INTRODUCTION TO NLP
Day-2: TEXT CLEANING AND TOKENIZATION
Day-3: BAG OF WORDS & TF-IDF
Day-4: SENTIMENT ANALYSIS
Day-5: MINI PROJECT

Week 11 Internship Report: Model Evaluation and Cross-Validation Techniques

In Week 11 of the internship, we shifted focus to an essential phase of the machine learning pipeline — model evaluation. This week emphasized how to assess the
performance and reliability of machine learning models using various metrics and
validation strategies. We explored accuracy, precision, recall, F1-score, ROC-AUC, and
confusion matrices, learning how to choose the appropriate evaluation metrics based
on the type of problem we're solving.

We began with binary classification metrics. Using datasets like customer churn and
disease prediction, we analyzed how a model could have high accuracy but still
perform poorly if the data is imbalanced. This led us to understand the importance of
using precision, recall, and F1-score in such scenarios. We also created and interpreted
confusion matrices to identify true positives, false negatives, and other performance
indicators in a classification task.

Mid-week, we were introduced to cross-validation — a powerful technique for assessing a model's generalizability. We practiced k-fold cross-validation using scikit-
learn, learning how it helps reduce the risk of overfitting and provides a more robust
estimate of model performance. We also experimented with stratified k-fold for
classification tasks to ensure class balance across splits, especially in skewed
datasets.
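
A minimal cross-validation sketch along the lines described above; the model and dataset are our illustrative choices:

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = load_breast_cancer(return_X_y=True)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)  # keeps class balance per fold

scores = cross_val_score(RandomForestClassifier(random_state=0), X, y,
                         cv=cv, scoring="f1")
print("per-fold F1:", scores.round(3), "mean:", round(scores.mean(), 3))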

We further explored techniques like Leave-One-Out Cross-Validation (LOOCV) and train-test split methods. In addition, we used cross_val_score and GridSearchCV to
evaluate and fine-tune hyperparameters across different models. This taught us the
importance of combining model evaluation with optimization and validation in one
integrated workflow to improve model reliability and performance.

To wrap up the week, we worked on a mini-project involving heart disease prediction. We trained different models, used cross-validation to compare them, and evaluated
them using ROC curves and AUC scores. This hands-on experience tied together all the
evaluation techniques we had learned and emphasized their importance in real-world
applications.
ACTIVITY LOG WEEK 12

Week-12
Day-1: INTRODUCTION TO NEURAL NETWORKS
Day-2: FORWARD AND BACKWARD PROPAGATION
Day-3: ACTIVATION FUNCTIONS
Day-4: TRAINING A NEURAL NETWORK
Day-5: INTRO TO TENSORFLOW AND KERAS

Week 12 Internship Report: Ensemble Learning – Bagging, Boosting & Voting Classifiers
Week 12 of the internship introduced us to the powerful concept of ensemble
learning — a technique where multiple models are combined to achieve better
performance than any individual model could on its own. This week we focused on
three major ensemble approaches: Bagging, Boosting, and Voting. Each technique
enhances model accuracy, reduces variance, and increases robustness, especially
when working with noisy or complex datasets.

We began with Bagging (Bootstrap Aggregating), where we trained multiple instances of the same model on different subsets of the dataset. The most popular
algorithm here was the Random Forest, which combines many decision trees to
reduce overfitting. We revisited Random Forest from previous weeks with a deeper
understanding of its ensemble nature and practiced applying it to both classification
and regression tasks. This helped us see how bagging improves performance by
averaging multiple predictions.

Mid-week, we transitioned into Boosting, which builds models sequentially, where each new model attempts to correct the errors made by its predecessor. We
implemented two key boosting algorithms — AdaBoost and Gradient Boosting. We
observed how these models paid extra attention to difficult-to-classify data points,
leading to significantly improved accuracy. We also used scikit-learn's
GradientBoostingClassifier and XGBoost, which are widely used in Kaggle
competitions and real-world applications.

We also studied the Voting Classifier, which combines predictions from different
models (e.g., logistic regression, decision tree, and SVM) and makes a final decision
based on majority voting or averaged probabilities. This approach highlighted how
we can blend diverse models to achieve better generalization. We experimented
with hard and soft voting and compared ensemble results with individual model
performance.
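
A brief sketch of hard versus soft voting over the kind of model mix mentioned above (the dataset choice is ours):

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
estimators = [
    ("lr", LogisticRegression(max_iter=5000)),
    ("dt", DecisionTreeClassifier(random_state=0)),
    ("svm", SVC(probability=True)),           # probability=True enables soft voting
]

for voting in ("hard", "soft"):
    clf = VotingClassifier(estimators=estimators, voting=voting)
    print(voting, round(cross_val_score(clf, X, y, cv=5).mean(), 3))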

To cap off the week, we worked on a project predicting loan default risk using an
ensemble of models. We performed hyperparameter tuning with GridSearchCV and
evaluated the final ensemble.
ACTIVITY LOG WEEK 13

Week-13
Day-1: BUILDING NEURAL NETWORKS WITH KERAS
Day-2: LOSS FUNCTIONS AND OPTIMIZERS
Day-3: CNN INTRODUCTION FOR IMAGE DATA
Day-4: CNN WITH KERAS
Day-5: PROJECT

Week 13 Internship Report: Natural Language Processing (NLP)
Week 13 of the internship introduced us to the fascinating field of Natural Language
Processing (NLP), which enables machines to understand, interpret, and generate
human language. NLP plays a crucial role in applications like chatbots, sentiment
analysis, translation, and text summarization. During this week, we explored both the
theoretical concepts behind NLP and implemented hands-on projects using
real-world text data.

We began the week by understanding how textual data is processed. Topics covered included tokenization, stop word removal, stemming, and lemmatization.
These preprocessing steps are essential to prepare raw text for machine learning.
Using libraries like NLTK and spaCy, we practiced cleaning and transforming text
into usable formats, which laid the foundation for future text-based modeling.

Next, we explored text representation techniques such as Bag of Words (BoW), TF-IDF (Term Frequency-Inverse Document Frequency), and word embeddings. We
used these methods to convert text into numerical vectors, making it possible for
ML algorithms to process them. We applied these techniques to perform basic
sentiment analysis on movie reviews and news articles, learning how to classify text
data based on sentiment polarity or topic.

We then moved on to classification models for NLP tasks, including Naive Bayes,
Logistic Regression, and Support Vector Machines (SVM). We built and evaluated
these models using TF-IDF features to classify emails as spam or not spam. We
also explored evaluation metrics like accuracy, F1-score, and confusion matrix to
assess model effectiveness, especially when dealing with unbalanced text datasets.
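
A toy sketch of the TF-IDF plus Naive Bayes spam classifier described above (the tiny message list is invented for illustration; the real exercise used a proper email dataset):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

messages = ["win a free prize now", "claim your free reward",
            "meeting moved to monday", "lunch at noon tomorrow"]
labels = ["spam", "spam", "ham", "ham"]

model = make_pipeline(TfidfVectorizer(), MultinomialNB())   # vectorize then classify
model.fit(messages, labels)
print(model.predict(["free prize waiting", "see you at the meeting"]))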

To conclude the week, we implemented a mini project that involved building a sentiment analysis tool to classify customer feedback into positive or negative
categories. We used a combination of text preprocessing, vectorization, and
classification to create the pipeline. This hands-on experience helped reinforce how
NLP techniques are used in practical, impactful ways. Overall, Week 13 was a
fascinating dive into the world of language and machines — equipping me with
essential tools for future work in AI-driven text analysis.
ACTIVITY LOG WEEK 14

Week-14
Day-1: PROBLEM STATEMENT DISCUSSION
Day-2: DATA COLLECTION AND EDA
Day-3: FEATURE ENGINEERING
Day-4: MODEL SELECTION
Day-5: INITIAL MODEL BUILDING

Week 14 Internship Report: Deep Learning – Neural Networks & Introduction to
Keras
Week 14 of the internship was a major milestone as we ventured into Deep Learning
— a subfield of machine learning inspired by the structure of the human brain. This
week focused on understanding the basics of neural networks, how they work, and
how to implement them using TensorFlow and its high-level API, Keras. It was both
technically challenging and highly rewarding.

We started by understanding the architecture of Artificial Neural Networks (ANNs), including input layers, hidden layers, output layers, weights, biases, and activation
functions. We explored how forward propagation and backpropagation work to
adjust weights during training. With the help of diagrams and coding examples, we
grasped the intuition behind how neural networks “learn” from data by minimizing
loss functions using optimizers like stochastic gradient descent (SGD) and Adam.

Mid-week, we began implementing simple neural networks using Keras. We learned how to define models using the Sequential API, add dense layers, choose appropriate
activation functions (like ReLU and sigmoid), and compile the model with a suitable
loss function and optimizer. We trained models on datasets like MNIST for
handwritten digit recognition and binary classification problems such as cancer
detection using the Breast Cancer Wisconsin dataset.
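
A compact Keras sketch of the Sequential workflow described above, using the MNIST data mentioned in the sessions (the layer sizes are typical choices, not prescribed ones):

import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0      # scale pixels to [0, 1]

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.2),                      # regularisation against overfitting
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, validation_split=0.1)
print("test accuracy:", model.evaluate(x_test, y_test, verbose=0)[1])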

A significant portion of the week was dedicated to evaluating model performance and addressing overfitting using validation data, dropout layers, and early stopping.
We visualized training and validation accuracy/loss over epochs using matplotlib to
understand model behavior during training. This taught us how to tune
hyperparameters like learning rate, number of neurons, and number of epochs to
improve accuracy and prevent overfitting.

To wrap up, we worked on a project to classify images of clothing using the Fashion-MNIST dataset. This real-world project gave us experience in building a full
deep learning pipeline — from preprocessing the data to evaluating and optimizing
the model. Overall, Week 14 introduced us to the powerful world of deep learning,
and I now feel confident building and training basic neural networks for a variety of
tasks.
ACTIVITY LOG WEEK 15

Week-15
Day-1: MODEL EVALUATION
Day-2: HYPERPARAMETER TUNING
Day-3: FINAL MODEL TRAINING
Day-4: RESULTS AND INTERPRETATION
Day-5: PROJECT REPORT PREPARATION

Week 15 Internship Report: Convolutional Neural Networks (CNNs) – Image
Classification
Week 15 of the internship took our deep learning journey further into the realm of
computer vision by introducing Convolutional Neural Networks (CNNs). CNNs are
designed specifically for image-related tasks and are the backbone of many
advanced visual recognition systems. This week focused on understanding how
CNNs work and applying them to solve real-world image classification problems.

We began by learning the structure of CNNs, including convolutional layers, pooling layers, and fully connected layers. Through animations and visual examples, we
understood how filters slide over images to extract features like edges, textures, and
shapes. We also learned about activation functions like ReLU, max pooling for
dimensionality reduction, and flattening before feeding the data into dense layers.

Using TensorFlow and Keras, we implemented our first CNN model from scratch.
We trained it on the Fashion-MNIST dataset, comparing its performance with a
regular dense neural network. The CNN achieved higher accuracy and faster
convergence, which showed us the power of convolutional layers in capturing
spatial features. We also used data augmentation techniques such as rotation,
flipping, and zooming to improve model generalization.
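
A minimal CNN sketch for Fashion-MNIST in the spirit of the exercise above (the architecture is a small illustrative one, not the exact model built in the session):

import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
x_train = x_train[..., None] / 255.0        # add channel dimension, scale to [0, 1]
x_test = x_test[..., None] / 255.0

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, validation_split=0.1)
print("test accuracy:", model.evaluate(x_test, y_test, verbose=0)[1])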

Mid-week, we explored advanced CNN architectures like VGG16 and transfer learning. Instead of training a model from scratch, we used pre-trained models as
feature extractors for new image classification tasks. This was particularly helpful
when working with smaller datasets. We also discussed the importance of model
interpretability and visualized feature maps to better understand how CNNs perceive
images.

To wrap up the week, we completed a mini-project on classifying dog vs. cat images
using a CNN model. We built, trained, and evaluated the model using accuracy,
confusion matrix, and visualizations of misclassified images. Overall, Week 15 gave
me a solid foundation in CNNs and their applications in computer vision. I now feel
confident designing CNN architectures and using them for image-based AI tasks.
ACTIVITY LOG WEEK 16

Week-16
Day-1: RESUME BUILDING FOR AI/ML
Day-2: CREATING A GITHUB PORTFOLIO
Day-3: LINKEDIN OPTIMIZATION
Day-4: MOCK INTERVIEW
Day-5: FINAL PRESENTATION AND SUBMISSION

Week 16 Internship Report: Final Project, Deployment & Internship Wrap-up

Week 16 marked the final stage of our AI & ML internship, where we
consolidated everything we had learned over the past 15 weeks and applied it in a
comprehensive capstone project. The week focused on the end-to-end development
of a machine learning system — from data preparation and model selection to
deployment and presentation. This was both exciting and challenging, as it tested
our grasp of the entire machine learning pipeline.
We began the week by selecting a project based on real-world datasets — options
included predicting credit risk, detecting fake news, or classifying images of skin
diseases. I chose to work on a project that predicts credit card fraud using machine
learning. I applied all essential steps including data cleaning, feature engineering,
handling class imbalance using SMOTE, and model training using ensemble
methods like Random Forest and XGBoost.

After model development, we focused on evaluating the system thoroughly using cross-validation and performance metrics such as precision, recall, F1-score, and
AUC-ROC. Since fraud detection is an imbalanced classification problem, special
attention was given to reducing false negatives. Hyperparameter tuning was done
using GridSearchCV to improve performance. The final model showed strong
generalization capability and high recall — critical for fraud prevention.

Mid-week, we learned how to deploy our trained models. We created a simple web
interface using Flask that allowed users to upload transaction data and receive a
prediction. We containerized the project using Docker, discussed model versioning,
and uploaded it to GitHub as part of our portfolio. This experience was immensely
helpful in understanding how machine learning models are used in production
environments.
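
A stripped-down sketch of the kind of Flask prediction endpoint described above (the model file name and the JSON layout are hypothetical):

import joblib
import pandas as pd
from flask import Flask, jsonify, request

app = Flask(__name__)
model = joblib.load("fraud_model.pkl")      # hypothetical pre-trained pipeline

@app.route("/predict", methods=["POST"])
def predict():
    # Expects JSON like {"features": [{...}, {...}]}, one dict per transaction.
    records = request.get_json()["features"]
    X = pd.DataFrame(records)
    preds = model.predict(X).tolist()
    return jsonify({"predictions": preds})

if __name__ == "__main__":
    app.run(debug=True)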

Finally, we presented our projects to mentors and peers, receiving feedback and
insights on how to further improve our work. We also reviewed key takeaways from
the internship, discussed career paths in AI and ML, and received guidance on
building a strong portfolio. Completing this internship has greatly boosted my
confidence and technical skill set, and I now feel ready to tackle real-world machine
learning challenges independently.
Conclusion:

The 16-week AI & Machine Learning internship has been an immensely rewarding
and transformative journey. Throughout this period, I have gained a deep
understanding of foundational and advanced concepts in machine learning, data
preprocessing, model building, and evaluation. From supervised learning algorithms
like linear regression and decision trees to advanced topics such as deep learning,
CNNs, NLP, and model deployment, each week offered practical knowledge that
built upon the last.

The internship emphasized not just theoretical learning but also hands-on
experience through real-world projects, guided coding exercises, and capstone
assignments. I particularly appreciated the structured progression — starting from
basic ML principles and gradually diving into more complex systems like neural
networks and ensemble models. This approach helped me solidify my skills and
apply them in practical settings.

One of the most valuable aspects was learning to think like a data scientist —
understanding the importance of data quality, choosing the right algorithms,
interpreting results, and communicating findings effectively. The final deployment
and project presentation were especially fulfilling, as they allowed me to showcase
everything I had learned in a real-world context.

Overall, this internship has not only strengthened my technical foundation in AI and
ML but has also boosted my confidence in problem-solving, critical thinking, and
independent project work. I am now equipped with the tools and mindset necessary
to pursue further studies or professional roles in machine learning and artificial
intelligence. I am deeply grateful for the guidance, mentorship, and support I
received throughout this journey, and I look forward to continuing my growth in this
exciting and impactful field.
Internal & External Evaluation for Semester Internship

Objectives:

Explore career alternatives prior to graduation.

To assess interests and abilities in the field of study.

To develop communication, interpersonal and other critical skills in the future job.

To acquire additional skills required for the world of work.

To acquire employment contacts leading directly to a full-time job following graduation from college.

Assessment Model:

There shall be both internal evaluation and external evaluation.

The Faculty Guide assigned is in charge of the learning activities of the students and of the comprehensive and continuous assessment of the students.

The assessment is to be conducted for 200 marks: Internal Evaluation for 50 marks and External Evaluation for 150 marks.

The number of credits assigned is 12. Later, the marks shall be converted into grades and grade points to be included finally in the SGPA and CGPA. The weightings for Internal Evaluation shall be:

o Activity Log             10 marks
o Internship Evaluation    30 marks
o Oral Presentation        10 marks

The External Evaluation shall be conducted by an Evaluation Committee comprising of the Principal, Faculty Guide, Internal Expert and External Expert nominated by the affiliating University. The Evaluation Committee shall also consider the grading given by the Supervisor of the Intern Organization.

The Activity Log is the record of the day-to-day activities. The Activity Log is assessed on an individual basis, thus allowing for individual members within groups to be assessed this way.

While evaluating the student's Activity Log, the following shall be considered:

a. The individual student's effort and commitment.

b. The originality and quality of the work produced by the individual.

c. The student's integration and co-operation with the work assigned.

d. The completeness of the Activity Log.

The Internship Evaluation shall include the following components, based on the Weekly Reports and Outcomes Description:

a. Description of the Work Environment.

b. Real-Time Technical Skills acquired.

c. Managerial Skills acquired.

d. Improvement of Communication Skills.
INTERNAL ASSESSMENT STATEMENT

Name of the Student:
Program of Study:
Year of Study:
Group:
Register No / H.T. No:
Name of the College:
University:

SL.NO  Evaluation Criterion      Maximum Marks  Marks Awarded
1      Activity Log              10
2      Internship Evaluation     30
3      Oral Presentation         10
       GRAND TOTAL               50

Date:                                    Signature of the Faculty Guide
EXTERNAL ASSESSMENT STATEMENT

Name of the Student:
Program of Study:
Year of Study:
Group:
Register No / H.T. No:
Name of the College:
University:

SL.NO  Evaluation Criterion                              Maximum Marks  Marks Awarded
1      Internship Evaluation                             80
2      For the grading given by the Supervisor of the
       Intern Organization                               20
3      Viva-Voce                                         50
       TOTAL                                             150
       GRAND TOTAL (EXT. 150 M + INT. 50 M)              200

Signature of the Faculty Guide

Signature of the Internal Expert

Signature of the External Expert

Signature of the Principal with Seal
