Visakha Institute of Engineering & Technology, Computer Science and Engineering
Name: ALAVALAPATI SRIRAM
Year: IV Btech
Semester: II
Roll No: 22NT5A0502
Internship: AI & ML
AN INTERNSHIP PROJECT REPORT ON
AI & ML
Carried out by EXCELR
(2022-2025)
VISAKHA INSTITUTE OF ENGINEERING AND TECHNOLOGY
(Approved by AICTE New Delhi & Recognized by JNTUG, VIZIANAGARAM)
and submitted during 2024 – 2025 for 16 weeks in this academic year, in
partial fulfilment of the requirements for the award of the degree of
K. VIJAY Dr. A.S.C. TEJASWINI KONE
Department Internship Coordinator Head of the Department
Department of CSE Department of CSE
EXTERNAL SIGNATURE
CERTIFICATE FROM EXCELR ORGANIZATION
DECLARATION
I would like to thank Mrs. A.S.C. Tejaswini Kone, Head of the
Department, for all the help rendered. Thank you, Madam, for your
efforts and for the help you provided in securing such an excellent
opportunity for me. Last but not least, many others shared valuable
information that helped in the successful completion of this
project.
ALAVALAPATI SRIRAM
22NT5A0502
AI & ML 16-WEEKS INTERNSHIP
EXCELR
ABSTRACT
Natural Language Processing (NLP) and Neural Networks were introduced in the later
stages, leading to practical applications such as sentiment analysis and image
classification using Convolutional Neural Networks (CNNs). The program culminated
with a capstone project involving model building, evaluation, and deployment strategies.
ACTIVITY LOG FOR THE 1st WEEK
Week-1
Day-3: CONDITIONAL STATEMENTS AND LOOPS
Day-4: FUNCTIONS AND LAMBDA FUNCTIONS
Day-5: OBJECT ORIENTED PROGRAMMING BASICS
1st WEEK REPORT
During the first week of my AI & ML internship, I was introduced to the foundational
programming skills required for data science and machine learning — primarily through
Python. The focus was on building a strong base in Python syntax, data types, and
control structures. We began by setting up our development environment using tools
like Jupyter Notebook and VS Code, and then quickly progressed into writing simple
Python programs.
Each day was structured to cover specific core concepts. On Day 1, we explored basic
syntax, variables, and data types such as integers, strings, and lists. Day 2 introduced
conditional statements and loops, which are essential for writing logical programs. By
Day 3, we were working with functions, parameters, and return types, helping us write
more modular and reusable code. Day 4 focused on file handling and exception
management, while Day 5 wrapped up with object-oriented programming (OOP),
covering classes, objects, inheritance, and encapsulation.
One of the most engaging aspects of the week was the hands-on coding sessions. We
were encouraged to practice each concept by solving small problems and writing code
snippets. I particularly enjoyed solving logic-building problems that involved loops and
conditionals, as they sharpened my logical thinking and syntax familiarity. The use of
online coding platforms and GitHub for version control was also introduced, setting a
standard for professional development practices.
Additionally, we were given assignments and quizzes that helped reinforce our learning.
These included building a simple calculator using functions, creating a class-based
student grading system using OOP, and debugging code snippets. We also participated
in a group discussion on how Python’s simplicity and versatility make it an ideal
language for AI and ML development.
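The class-based grading exercise mentioned above could look roughly like the sketch below. The class name, the grade cut-offs, and the sample students are all assumptions made for illustration; the actual assignment is not described in enough detail to reproduce it.

```python
class Student:
    """A student with a name and a list of marks (hypothetical example class)."""

    def __init__(self, name, marks):
        self.name = name
        self._marks = list(marks)  # leading underscore signals internal data

    def average(self):
        """Return the mean mark, or 0.0 when no marks are recorded."""
        if not self._marks:
            return 0.0
        return sum(self._marks) / len(self._marks)

    def grade(self):
        """Map the average to a letter grade using assumed cut-offs."""
        avg = self.average()
        if avg >= 90:
            return "A"
        if avg >= 75:
            return "B"
        if avg >= 60:
            return "C"
        return "F"


# Hypothetical students; a lambda is used as the sort key (highest average first).
students = [Student("Ravi", [58, 64, 61]), Student("Asha", [92, 88, 95])]
ranked = sorted(students, key=lambda s: s.average(), reverse=True)
report = [(s.name, s.grade()) for s in ranked]
```

The marks are kept behind a method interface (`average`, `grade`) so that the grading rule can change without touching the code that builds the report.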
In summary, Week 1 was a productive and exciting start to the internship. It laid the
groundwork for more complex topics by ensuring we are comfortable with Python,
which is essential for future machine learning algorithms and data handling. I am
confident that this foundation will be invaluable as we progress through the internship
in the coming weeks.
ACTIVITY LOG FOR THE 2nd WEEK
Week-2
Day-5: HANDLING MISSING VALUES AND DUPLICATES
Week 2 Internship Report: Data Handling with NumPy and Pandas
The second week of the internship focused on data manipulation using two of
Python’s most powerful libraries: NumPy and Pandas. These tools form the backbone
of data preprocessing in the machine learning pipeline. The week was dedicated to
understanding how to efficiently work with arrays, handle structured data, clean
datasets, and prepare them for analysis or modeling.
The first part of the week introduced NumPy. We learned about the creation and
manipulation of arrays, vectorized operations, array indexing, slicing, reshaping, and
broadcasting. These skills are crucial for performing mathematical computations on
large datasets. Working with multidimensional arrays helped me understand the
performance benefits of using NumPy over traditional Python lists.
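As a small illustration of the NumPy ideas listed above (vectorized operations, slicing, reshaping, and broadcasting), here is a minimal sketch. The price and discount values are made up for the example.

```python
import numpy as np

# Hypothetical data: two rows of prices and one discount rate per column.
prices = np.array([[10.0, 20.0, 30.0],
                   [40.0, 50.0, 60.0]])   # shape (2, 3)
discount = np.array([0.1, 0.2, 0.5])       # shape (3,)

# Broadcasting: the 1-D discount row is applied to every row of prices.
discounted = prices * (1.0 - discount)

# Slicing and reshaping.
first_column = prices[:, 0]                # first value of each row
flat = prices.reshape(-1)                  # shape (6,)

# Vectorized aggregate along an axis (one mean per column).
column_means = prices.mean(axis=0)
```

No explicit Python loop is needed: the arithmetic runs over whole arrays at once, which is where the speed advantage over plain lists comes from.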
In the second half of the week, we moved on to Pandas, which provides high-level data
structures like Series and DataFrames. We were taught how to load data from various
sources (CSV, Excel), explore data, and handle missing values and duplicates. These
lessons were complemented with hands-on exercises involving real-world datasets,
such as cleaning and transforming a sample dataset of customer transactions.
One of the key takeaways from this week was learning how to perform operations such
as filtering, grouping, and aggregating data using Pandas. We also explored common
techniques for handling null values, data imputation, and detecting outliers. Through
assignments and coding challenges, we were able to apply these techniques in
practical scenarios, which enhanced our understanding of real-world data processing.
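The cleaning and aggregation steps described above can be sketched on a tiny, invented transactions table. The column names and values are assumptions, and median imputation is only one of several reasonable choices.

```python
import numpy as np
import pandas as pd

# Hypothetical customer-transaction data with one duplicate row and one missing value.
df = pd.DataFrame({
    "customer": ["A", "B", "B", "C", "A"],
    "amount": [100.0, 250.0, 250.0, np.nan, 50.0],
})

df = df.drop_duplicates()                                  # drops the repeated B row
df["amount"] = df["amount"].fillna(df["amount"].median())  # median imputation

large = df[df["amount"] > 75]                              # filtering
totals = df.groupby("customer")["amount"].sum()            # grouping and aggregation
```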
Overall, Week 2 provided essential skills that form the core of any data science or
machine learning project. Understanding how to structure, clean, and manipulate data
using NumPy and Pandas has given me the confidence to tackle messy datasets and
prepare them for analysis and modeling. I'm looking forward to building on this
knowledge in the coming weeks as we move into data visualization and exploratory
data analysis (EDA).
ACTIVITY LOG FOR THE 3rd WEEK
Week-3
Day-5: MINI PROJECT
3rd WEEK REPORT
The week began with an introduction to Matplotlib, where we explored basic plotting
functions such as line plots, bar graphs, histograms, and scatter plots. We also learned
how to customize plots using titles, labels, legends, colors, and styles to make them
more informative and visually clear. Practical exercises included plotting sales data
over time and analyzing distribution using histograms. This hands-on approach helped
reinforce our understanding of each chart type and its use case.
On the final day of the week, we applied what we learned in a mini-project involving
exploratory data analysis (EDA). We were given a real-world dataset and asked to
generate visual insights about customer behavior. Using a combination of Matplotlib
and Seaborn, we identified key trends, correlations, and outliers in the data. This project
emphasized how visualizations can support hypothesis generation and guide further
analysis or model-building.
In summary, Week 3 was both insightful and creative. It taught me how to turn raw
numbers into visual stories that can drive decisions. Mastering data visualization has
not only improved my ability to analyze data but also enhanced how I communicate
findings with others. As we move into statistics and probability in Week 4, I feel
well-prepared to interpret and present data in a clear and meaningful way.
ACTIVITY LOG FOR THE 4th WEEK
Day-5: HANDS ON WITH REAL DATA STATS
4th WEEK REPORT
The week started with descriptive statistics, where we explored measures of central
tendency (mean, median, and mode) and dispersion (range, variance, standard
deviation). We applied these concepts using Python libraries such as NumPy and
Pandas to analyze datasets and summarize key characteristics. This provided a clearer
understanding of how data can be quantitatively described, which is crucial for
interpreting patterns before model development.
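The descriptive measures named above can be computed in a few lines. This sketch uses Python's standard `statistics` module rather than NumPy or Pandas, purely to keep it dependency-free; the sample values are invented.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample

# Measures of central tendency.
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of dispersion (population versions).
value_range = max(scores) - min(scores)
variance = statistics.pvariance(scores)
std_dev = statistics.pstdev(scores)
```

Note that `pvariance` and `pstdev` divide by n; the sample versions (`variance`, `stdev`) divide by n - 1 and would give slightly larger values.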
To wrap up the week, we completed hands-on exercises and a mini case study where
we analyzed a dataset using statistical measures and visualizations. This practical
application solidified our understanding and demonstrated how statistics and
probability serve as a foundation for machine learning algorithms. Overall, Week 4 was
highly informative and prepared us to better understand model evaluation and
prediction in the upcoming weeks.
ACTIVITY LOG WEEK 5
Week-5
Day-2: ML LIFE CYCLE AND TOOLS
Day-3: TRAIN-TEST SPLIT AND SCIKIT-LEARN
Day-4: EVALUATION METRICS: ACCURACY, PRECISION
Day-5: LINEAR REGRESSION - THEORY
5th WEEK REPORT
We began the week by understanding what machine learning is and how it differs from
traditional programming. We explored the three main types of ML: supervised learning,
unsupervised learning, and reinforcement learning, with a focus on supervised learning
for now. The concept of a model “learning” from data, by recognizing patterns
without being explicitly programmed, was a key point of discussion. We also covered
real-world applications like fraud detection, recommendation systems, and speech
recognition.
Mid-week, we learned about the machine learning lifecycle, including steps such as
data collection, data preprocessing, model selection, training, evaluation, and
deployment. We were introduced to essential tools such as Jupyter Notebook and the
scikit-learn library, which makes building ML models in Python more accessible. We
practiced using train-test split methods to evaluate model performance and learned
why data partitioning is important for detecting overfitting.
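To show what a train-test split does, here is a plain-Python stand-in. It is not the scikit-learn function used during the internship; the function name, the 25% test fraction, and the seed are assumptions for illustration.

```python
import random


def train_test_split_simple(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)   # fixed seed makes the split repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


data = list(range(20))                      # hypothetical dataset of 20 samples
train, test = train_test_split_simple(data)
```

Because the model never sees the held-out rows during training, a large gap between training and test scores is a sign of overfitting.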
The week concluded with an in-depth look at the first two algorithms in supervised
learning: Linear Regression and Logistic Regression. Linear Regression was used for
predicting continuous values, such as house prices, while Logistic Regression was
used for classification tasks like predicting whether a customer will churn. We
implemented both models using real-world datasets and evaluated their performance
using metrics like accuracy, mean squared error, and confusion matrices.
Day-2: POLYNOMIAL REGRESSION
Day-5: MINI PROJECT
6th WEEK REPORT
We began the week by revisiting Linear Regression, where we studied how to model
the relationship between a dependent variable and one or more independent
variables. Through hands-on exercises, we implemented simple and multiple linear
regression models. We evaluated their performance using metrics such as Mean
Squared Error (MSE) and R-squared, together with residual plots, which gave us a
practical understanding of model accuracy and goodness-of-fit.
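A simple linear regression and the two metrics named above can be sketched with NumPy alone. The data points are invented, and ordinary least squares via `np.linalg.lstsq` is used here as a stand-in for whichever library call was used in the exercises.

```python
import numpy as np

# Hypothetical data: y is roughly 2*x + 1 with small noise.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.0])

# Design matrix with an intercept column; solve ordinary least squares.
X = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = coef

y_pred = X @ coef
residuals = y - y_pred                      # the values shown in a residual plot

mse = np.mean(residuals ** 2)
r_squared = 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)
```

MSE is in squared units of the target, while R-squared is unitless and compares the model against always predicting the mean.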
Next, we studied Logistic Regression, which, despite its name, is used for
classification tasks. We explored the concept of the sigmoid function and how it
maps predicted values to probabilities. We then applied logistic regression to binary
classification problems such as predicting whether a student will pass or fail based
on study hours. We were also introduced to evaluation metrics such as confusion matrices.
ACTIVITY LOG WEEK 7
Day-5: MINI PROJECT
7th WEEK REPORT
Week 7 Internship Report: Decision Trees & Random Forest
Week 7 of the internship introduced us to tree-based models, particularly Decision
Trees and Random Forest, which are powerful and intuitive algorithms used in both
classification and regression tasks. This week was especially interesting because it
bridged the gap between simple linear models and more complex ensemble methods,
showing us how machine learning can capture non-linear patterns and make decisions
in a structured, hierarchical way.
The week began with an in-depth explanation of Decision Trees. We explored how they
split data based on features using criteria such as Gini Impurity and Entropy (for
classification) or Mean Squared Error (for regression). Through visualizations and
coding exercises, we saw how the tree “learns” to split the data recursively, forming
a structure that mimics human decision-making. We also discussed overfitting and
how an unpruned tree can perform poorly on unseen data.
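The two classification criteria mentioned above are short formulas. The sketch below computes them for a made-up set of labels and shows how a candidate split is scored; the helper names are hypothetical.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum(p_k^2) over class proportions p_k."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum(p_k * log2(p_k))."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def weighted_impurity(left, right, measure):
    """Impurity of a split, weighting each child by its share of the samples."""
    n = len(left) + len(right)
    return len(left) / n * measure(left) + len(right) / n * measure(right)


# Hypothetical node with a perfectly separating split.
parent = ["yes", "yes", "yes", "no", "no", "no"]
left, right = ["yes", "yes", "yes"], ["no", "no", "no"]
gain = gini(parent) - weighted_impurity(left, right, gini)
```

A tree chooses, at each node, the split with the largest impurity reduction; growing until every leaf is pure is what makes an unpruned tree prone to overfitting.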
In the latter part of the week, we learned about Random Forest — an ensemble method
that builds multiple decision trees and combines their outputs for better accuracy and
robustness. We studied how bagging and random feature selection during training
helps reduce overfitting and improve performance. We implemented classification and
regression tasks using Random Forest and compared the results with single Decision
Trees to see the improvements in accuracy and stability.
To wrap up, we worked on a mini-project that involved predicting customer churn using
tree-based models. We handled feature selection, model training, tuning, and evaluation
using confusion matrices and classification reports. Overall, Week 7 gave me a deeper
appreciation for how machine learning models can be both interpretable and powerful.
With a better grasp of tree-based algorithms, I feel more equipped to build models for
real-world decision-making tasks.
ACTIVITY LOG WEEK 8
Week-8
Day-1: KNN THEORY
Day-2: KNN WITH SCIKIT-LEARN
Day-3: SVM
Day-5: HANDS ON PRACTICE UCI DATA SET
8th WEEK REPORT
Week 8 Internship Report: Support Vector Machines (SVM)
Week 8 of the AI & ML internship introduced us to one of the most powerful
supervised learning algorithms — Support Vector Machines (SVM). This algorithm
is widely used for classification problems and is known for its ability to handle both
linear and non-linear data. Throughout the week, we explored the theory behind SVM,
its mathematical foundations, and practical implementation using scikit-learn.
We began with the basic concepts of SVM: hyperplanes, support vectors, and
margin maximization. Using visual examples, we learned how SVM tries to find the
optimal boundary (hyperplane) that separates different classes with the maximum
margin. We implemented SVM models and applied them to datasets like the
Iris dataset, using different kernel functions such as linear, polynomial, and radial
basis function (RBF).
A key learning point this week was understanding the role of kernel functions in
handling non-linearly separable data. We experimented with different kernels and
saw how they transformed the feature space to make classification possible. We
also explored important hyperparameters such as C (regularization) and gamma (for
non-linear kernels), and how tuning these values affects model complexity and
performance.
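To make the role of gamma concrete, the RBF kernel itself can be written in a few lines. This is only the kernel function, not a full SVM, and the two points and gamma values are chosen arbitrarily for illustration.

```python
import math


def rbf_kernel(a, b, gamma):
    """RBF kernel: exp(-gamma * ||a - b||^2)."""
    squared_distance = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * squared_distance)


p, q = (0.0, 0.0), (1.0, 1.0)   # hypothetical points; squared distance is 2

similarity_small_gamma = rbf_kernel(p, q, gamma=0.1)    # wide, smooth influence
similarity_large_gamma = rbf_kernel(p, q, gamma=10.0)   # narrow, local influence
```

With a large gamma, even nearby points look dissimilar, so the decision boundary can bend tightly around individual samples; a small gamma gives a smoother boundary.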
Mid-week, we used visualization tools like decision boundaries and support vector
plots to interpret the model's decision-making process. This helped in
understanding how certain data points (support vectors) influence the position of
the hyperplane. We also covered multi-class classification using SVMs with a one-vs-
rest approach, which broadened our understanding of how SVMs can be applied
beyond binary tasks.
Week-9
Day-1: CLUSTERING OVERVIEW
Day-2: K-MEANS CLUSTERING
Day-4: PCA
Day-5: MINI PROJECT
Week 9 Internship Report: Unsupervised Learning – Clustering (K-Means &
Hierarchical)
Week 9 of the AI & ML internship marked a shift in focus from supervised to
unsupervised learning, particularly clustering techniques. Clustering allows us to
discover patterns or groupings in datasets without predefined labels. This week, we
explored K-Means Clustering and Hierarchical Clustering, two of the most commonly
used methods for segmenting data. These techniques are essential in areas such as
customer segmentation, market analysis, and anomaly detection.
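The core loop of K-Means is short enough to sketch from scratch. This version takes the starting centroids as an argument to stay deterministic; library implementations usually pick them with a smarter scheme such as k-means++. The points are invented.

```python
import numpy as np


def kmeans(points, initial_centroids, n_iter=10):
    """Plain Lloyd's algorithm: assign to nearest centroid, then recompute means."""
    centroids = initial_centroids.copy()
    k = len(centroids)
    for _ in range(n_iter):
        # Distance from every point to every centroid, shape (n_points, k).
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
    return labels, centroids


# Hypothetical 2-D data with two obvious groups.
points = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                   [8.0, 8.0], [8.2, 7.9], [7.9, 8.1]])
labels, centroids = kmeans(points, initial_centroids=points[[0, 3]])
```

No labels are supplied anywhere: the grouping comes only from distances between points, which is what makes the method unsupervised.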
MODEL SELECTION
Day-5: PRACTICE
Week 10 Internship Report: Dimensionality Reduction – PCA & t-SNE
Week 10 of the internship centered around Dimensionality Reduction — a critical
process in machine learning and data analysis, especially when dealing with
high-dimensional datasets. The main techniques we studied were Principal
Component Analysis (PCA) and t-Distributed Stochastic Neighbor Embedding
(t-SNE). These methods help simplify datasets while preserving their essential
structures, improving both model performance and interpretability.
The week started with Principal Component Analysis (PCA), a linear technique that
transforms high-dimensional data into fewer dimensions by identifying the
directions (principal components) that maximize variance. We explored how PCA
works mathematically using eigenvalues and eigenvectors, and we used scikit-learn
and NumPy to apply PCA to datasets like the Iris dataset and MNIST digit dataset.
PCA was especially useful for visualizing clusters and patterns in reduced 2D and
3D space.
We then explored how PCA can help remove noise and multicollinearity, making our
models more efficient without significant loss of information. By plotting explained
variance ratios, we learned how to choose the right number of components to retain
most of the dataset’s variance. This step is essential before feeding data into
clustering or classification models, especially when datasets have dozens or
hundreds of features.
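The eigenvalue view of PCA and the explained variance ratio can be sketched directly with NumPy. This is a from-scratch illustration under the assumption of a small dense dataset, not the scikit-learn class used in the exercises; the data are invented.

```python
import numpy as np


def pca(X, n_components):
    """PCA via eigen-decomposition of the covariance matrix."""
    X_centered = X - X.mean(axis=0)
    covariance = np.cov(X_centered, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)   # ascending order
    order = np.argsort(eigenvalues)[::-1]                     # largest variance first
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    explained_variance_ratio = eigenvalues / eigenvalues.sum()
    projected = X_centered @ eigenvectors[:, :n_components]
    return projected, explained_variance_ratio


# Hypothetical data: the second feature is almost a multiple of the first.
X = np.array([[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]])
projected, ratio = pca(X, n_components=1)
```

Because the two features are strongly correlated, nearly all of the variance falls on the first component, which is the situation where dropping dimensions loses little information.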
Day-1: INTRODUCTION TO NLP
Day-2: TEXT CLEANING AND TOKENIZATION
Day-4: SENTIMENT ANALYSIS
Day-5: MINI PROJECT
Week 11 Internship Report: Model Evaluation and Cross-Validation Techniques
We began with binary classification metrics. Using datasets like customer churn and
disease prediction, we analyzed how a model could have high accuracy but still
perform poorly if the data is imbalanced. This led us to understand the importance of
using precision, recall, and F1-score in such scenarios. We also created and interpreted
confusion matrices to identify true positives, false negatives, and other performance
indicators in a classification task.
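The accuracy trap described above is easy to reproduce. The sketch below computes the four metrics from confusion-matrix counts for an invented, heavily imbalanced dataset and a model that always predicts the majority class; the function name is hypothetical.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical imbalanced data: 95 negatives, 5 positives, and a model
# that always predicts the negative class.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100
metrics = classification_metrics(y_true, y_pred)
```

Accuracy looks strong here even though the model finds none of the positive cases, which is exactly why recall and F1 are reported alongside it.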
Day-5: INTRO TO TENSORFLOW AND KERAS
Week 12 Internship Report: Ensemble Learning – Bagging, Boosting & Voting Classifiers
Week 12 of the internship introduced us to the powerful concept of ensemble
learning — a technique where multiple models are combined to achieve better
performance than any individual model could on its own. This week we focused on
three major ensemble approaches: Bagging, Boosting, and Voting. Each technique
enhances model accuracy, reduces variance, and increases robustness, especially
when working with noisy or complex datasets.
We also studied the Voting Classifier, which combines predictions from different
models (e.g., logistic regression, decision tree, and SVM) and makes a final decision
based on majority voting or averaged probabilities. This approach highlighted how
we can blend diverse models to achieve better generalization. We experimented
with hard and soft voting and compared ensemble results with individual model
performance.
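The difference between hard and soft voting can be shown without any library. In this sketch the three sets of class probabilities are invented, standing in for the outputs of three already-trained models on a single sample.

```python
from collections import Counter


def hard_vote(predictions):
    """Majority vote over the class labels predicted by each model."""
    return Counter(predictions).most_common(1)[0][0]


def soft_vote(probabilities):
    """Average per-class probabilities across models and pick the highest."""
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    averaged = [sum(p[c] for p in probabilities) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: averaged[c]), averaged


# Hypothetical outputs of three models for one sample: [P(class 0), P(class 1)].
probabilities = [[0.55, 0.45], [0.60, 0.40], [0.10, 0.90]]
predicted_labels = [max(range(2), key=lambda c: p[c]) for p in probabilities]

hard_result = hard_vote(predicted_labels)          # two of three models say class 0
soft_result, averaged = soft_vote(probabilities)   # one confident model tips the average
```

The two schemes disagree on this sample: hard voting counts heads, while soft voting lets a confident model outweigh two hesitant ones.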
To cap off the week, we worked on a project predicting loan default risk using an
ensemble of models. We performed hyperparameter tuning with GridSearchCV and
evaluated the final ensemble.
ACTIVITY LOG WEEK 13
Day-1
Day-5: PROJECT
Week 13 Internship Report: Natural Language Processing (NLP)
Week 13 of the internship introduced us to the fascinating field of Natural Language
Processing (NLP), which enables machines to understand, interpret, and generate
human language. NLP plays a crucial role in applications like chatbots, sentiment
analysis, translation, and text summarization. During this week, we explored both the
theoretical concepts behind NLP and implemented hands-on projects using
real-world text data.
We then moved on to classification models for NLP tasks, including Naive Bayes,
Logistic Regression, and Support Vector Machines (SVM). We built and evaluated
these models using TF-IDF features to classify emails as spam or not spam. We
also explored evaluation metrics like accuracy, F1-score, and confusion matrix to
assess model effectiveness, especially when dealing with unbalanced text datasets.
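For intuition about the TF-IDF features mentioned above, here is a bare-bones version using the textbook formula tf * log(N / df). Library vectorizers typically add smoothing and normalization, so their numbers will differ; the three example messages are invented.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical mini-corpus of three messages.
docs = ["win a free prize now", "meeting agenda for monday", "free prize inside"]
weights = tfidf(docs)
```

Terms that appear in fewer documents receive a higher weight, so "win" scores above "free" in the first message even though each occurs once there.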
Day-4: MODEL SELECTION
Day-5: INITIAL MODEL BUILDING
Week 14 Internship Report: Deep Learning – Neural Networks & Introduction to
Keras
Week 14 of the internship was a major milestone as we ventured into Deep Learning
— a subfield of machine learning inspired by the structure of the human brain. This
week focused on understanding the basics of neural networks, how they work, and
how to implement them using TensorFlow and its high-level API, Keras. It was both
technically challenging and highly rewarding.
Week-15
Day-1: MODEL EVALUATION
Day-3: FINAL MODEL TRAINING
Day-5: PROJECT REPORT PREPARATION
Week 15 Internship Report: Convolutional Neural Networks (CNNs) – Image
Classification
Week 15 of the internship took our deep learning journey further into the realm of
computer vision by introducing Convolutional Neural Networks (CNNs). CNNs are
designed specifically for image-related tasks and are the backbone of many
advanced visual recognition systems. This week focused on understanding how
CNNs work and applying them to solve real-world image classification problems.
Using TensorFlow and Keras, we implemented our first CNN model from scratch.
We trained it on the Fashion-MNIST dataset, comparing its performance with a
regular dense neural network. The CNN achieved higher accuracy and faster
convergence, which showed us the power of convolutional layers in capturing
spatial features. We also used data augmentation techniques such as rotation,
flipping, and zooming to improve model generalization.
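What a convolutional layer computes on a single channel can be sketched with a nested loop. This is a from-scratch illustration with an invented 4x4 image and a hand-written edge filter, not the Keras layers used in the exercise; in a trained CNN the kernel values are learned.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide kernel over image with no padding and stride 1 (cross-correlation,
    which is what deep learning libraries call convolution)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            output[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return output


# Hypothetical 4x4 image: dark left half, bright right half.
image = np.array([[0, 0, 1, 1]] * 4, dtype=float)
vertical_edge = np.array([[-1.0, 1.0]] * 2)   # responds to left-to-right brightening

feature_map = conv2d_valid(image, vertical_edge)

# A horizontal flip, one of the augmentations mentioned above.
flipped = image[:, ::-1]
```

The feature map is non-zero only where the window straddles the dark-to-bright boundary, which is how a small filter picks out a spatial pattern regardless of where it sits in the image.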
To wrap up the week, we completed a mini-project on classifying dog vs. cat images
using a CNN model. We built, trained, and evaluated the model using accuracy,
confusion matrix, and visualizations of misclassified images. Overall, Week 15 gave
me a solid foundation in CNNs and their applications in computer vision. I now feel
confident designing CNN architectures and using them for image-based AI tasks.
ACTIVITY LOG WEEK 16
Day-1
Day-2: CREATION OF GITHUB PORTFOLIO
Day-4: MOCK INTERVIEW
Day-5: FINAL PRESENTATION AND SUBMISSION
Week 16 Internship Report: Final Project, Deployment & Internship Wrap-up
Week 16 marked the final stage of our AI & ML internship, where we
consolidated everything we had learned over the past 15 weeks and applied it in a
comprehensive capstone project. The week focused on the end-to-end development
of a machine learning system — from data preparation and model selection to
deployment and presentation. This was both exciting and challenging, as it tested
our grasp of the entire machine learning pipeline.
We began the week by selecting a project based on real-world datasets — options
included predicting credit risk, detecting fake news, or classifying images of skin
diseases. I chose to work on a project that predicts credit card fraud using machine
learning. I applied all essential steps including data cleaning, feature engineering,
handling class imbalance using SMOTE, and model training using ensemble
methods like Random Forest and XGBoost.
Mid-week, we learned how to deploy our trained models. We created a simple web
interface using Flask that allowed users to upload transaction data and receive a
prediction. We containerized the project using Docker, discussed model versioning,
and uploaded it to GitHub as part of our portfolio. This experience was immensely
helpful in understanding how machine learning models are used in production
environments.
Finally, we presented our projects to mentors and peers, receiving feedback and
insights on how to further improve our work. We also reviewed key takeaways from
the internship, discussed career paths in AI and ML, and received guidance on
building a strong portfolio. Completing this internship has greatly boosted my
confidence and technical skill set, and I now feel ready to tackle real-world machine
learning challenges independently.
Conclusion:
The 16-week AI & Machine Learning internship has been an immensely rewarding
and transformative journey. Throughout this period, I have gained a deep
understanding of foundational and advanced concepts in machine learning, data
preprocessing, model building, and evaluation. From supervised learning algorithms
like linear regression and decision trees to advanced topics such as deep learning,
CNNs, NLP, and model deployment, each week offered practical knowledge that
built upon the last.
The internship emphasized not just theoretical learning but also hands-on
experience through real-world projects, guided coding exercises, and capstone
assignments. I particularly appreciated the structured progression — starting from
basic ML principles and gradually diving into more complex systems like neural
networks and ensemble models. This approach helped me solidify my skills and
apply them in practical settings.
One of the most valuable aspects was learning to think like a data scientist —
understanding the importance of data quality, choosing the right algorithms,
interpreting results, and communicating findings effectively. The final deployment
and project presentation were especially fulfilling, as they allowed me to showcase
everything I had learned in a real-world context.
Overall, this internship has not only strengthened my technical foundation in AI and
ML but has also boosted my confidence in problem-solving, critical thinking, and
independent project work. I am now equipped with the tools and mindset necessary
to pursue further studies or professional roles in machine learning and artificial
intelligence. I am deeply grateful for the guidance, mentorship, and support I
received throughout this journey, and I look forward to continuing my growth in this
exciting and impactful field.
Internal & External Evaluation for Semester Internship
Objectives:
Explore career alternatives prior to graduation.
To assess interests and abilities in the field of study.
To develop communication, interpersonal and other critical skills in the future job.
To acquire additional skills required for the world of work.
To acquire employment contacts leading directly to a full-time job
following graduation from college.
Assessment Model:
The Faculty Guide assigned is in-charge of the learning activities of the
students and for the comprehensive and continuous assessment of the
students.
The assessment is to be conducted for 200 marks: Internal Evaluation for
50 marks and External Evaluation for 150 marks.
The number of credits assigned is 12. Later the marks shall be converted
into grades and grade points to include finally in the SGPA and
CGPA. The weightings for Internal Evaluation shall be:
o Activity Log 10 marks
o Internship Evaluation 30 marks
o Oral Presentation 10 marks
The External Evaluation shall be conducted by an Evaluation Committee comprising
of the Principal, Faculty Guide, Internal Expert and External Expert nominated by the
affiliating University. The Evaluation Committee shall also consider the grading
given by the Supervisor of the Intern Organization.
Activity Log is the record of the day-to-day activities. The Activity Log is
assessed on an individual basis, thus allowing for individual members
within groups to be assessed this way.
While evaluating the student's Activity Log, the following shall be considered:
a. The individual student's effort and commitment.
c. The student's integration and co-operation with the work assigned.
d. The completeness of the Activity Log.
The Internship Evaluation shall include the following components
and be based on Weekly Reports and Outcomes Description:
a. Description of the Work Environment.
b. Real Time Technical Skills acquired.
c. Managerial Skills acquired.
d. Improvement of Communication Skills.
INTERNAL ASSESSMENT STATEMENT
Name Of the Student:
of Study: Group:
Register No/H.T. No:
University:
3 Oral Presentation 10
GRAND TOTAL 50
Date: Signature of the Faculty Guide
EXTERNAL ASSESSMENT STATEMENT
Name Of the Student:
of Study: Group:
Register No/H.T. No:
University:
Signature of the Principal with Seal