NLP Report (Repaired)
[2024-25]
Mahatma Education Society’s
Certificate
This is to certify that the ML Project Report entitled “Foreign College
Admission Prediction” is a bonafide work of Furkan Mustafa (41)
submitted to the University of Mumbai in partial fulfillment of the
requirement for the award of the degree of “Undergraduate” in “Computer
Engineering”.
Mr. Rahul Kapse
(Supervisor)
Declaration
I declare that this written submission represents my ideas in my own
words and, where others' ideas or words have been included, I have
adequately cited and referenced the original sources. I also declare that I
have adhered to all principles of academic honesty and integrity and have
not misrepresented, fabricated, or falsified any idea/data/fact/source in
my submission. I understand that any violation of the above will
be cause for disciplinary action by the Institute and can also evoke penal
action from the sources which have thus not been properly cited or from
whom proper permission has not been taken when needed.
Furkan
Mustafa
Date:
Abstract
List of Figures
Chapter 1: Introduction
1.1 Introduction
1.2 Background
1.3 Motivation
Chapter 5: Conclusion
References
Chapter 1
Introduction
1.1 Introduction
1. Input Data:
The system collects data from student profiles, including academic history
(SGPA), standardized test scores (GRE, TOEFL, etc.), work experience,
extracurricular activities, and other relevant metrics.
2. Preprocessing Module:
Data preprocessing is crucial for cleaning the input data, handling missing
values, normalizing features, and encoding categorical variables like gender
or nationality. This ensures that the data is in the correct format for the
machine learning models.
4. Machine Learning Model:
A variety of machine learning algorithms such as Linear Regression,
Random Forest, and Neural Networks are trained on historical data to
predict admission chances. These models are evaluated using various
metrics to ensure they are accurate and reliable.
5. Prediction Output:
Once the input data is processed, the machine learning model provides a
probability score indicating the likelihood of admission to the target
institution. This score can be used by students to assess their chances and
plan their applications accordingly.
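The preprocessing step described above (handling missing values, normalizing features, and encoding categorical variables) can be sketched as follows. This is a minimal, dependency-free illustration and not the report's actual implementation: the records, the field names (`gre`, `toefl`, `sgpa`, `nationality`), and the choice of mean imputation with min-max scaling are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical student records; None marks a missing value.
records = [
    {"gre": 320, "toefl": 110, "sgpa": 8.5, "nationality": "IN"},
    {"gre": 300, "toefl": None, "sgpa": 7.0, "nationality": "US"},
    {"gre": 340, "toefl": 100, "sgpa": 9.5, "nationality": "IN"},
]

NUMERIC = ["gre", "toefl", "sgpa"]   # assumed numeric fields
CATEGORICAL = ["nationality"]        # assumed categorical fields


def preprocess(rows):
    """Impute missing numerics with the column mean, min-max scale them to
    [0, 1], and one-hot encode categorical fields."""
    # Column statistics are computed from the observed (non-missing) values.
    stats = {}
    for col in NUMERIC:
        observed = [r[col] for r in rows if r[col] is not None]
        stats[col] = (mean(observed), min(observed), max(observed))

    # Sorted category lists give a stable one-hot column order.
    categories = {col: sorted({r[col] for r in rows}) for col in CATEGORICAL}

    features = []
    for r in rows:
        vec = []
        for col in NUMERIC:
            avg, lo, hi = stats[col]
            value = avg if r[col] is None else r[col]
            # Guard against a constant column (hi == lo).
            vec.append((value - lo) / (hi - lo) if hi > lo else 0.0)
        for col in CATEGORICAL:
            vec.extend(1.0 if r[col] == c else 0.0 for c in categories[col])
        features.append(vec)
    return features


for row in preprocess(records):
    print([round(x, 2) for x in row])
```

Each output row holds the three scaled numeric features followed by one indicator column per nationality. In a real pipeline the imputation and scaling statistics would be computed on the training data only and then reused for new applicants.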
1.2 Background
With the growing demand for higher education in foreign institutions, students
from around the world are competing for limited seats in prestigious
universities. Admission officers rely on a variety of factors to make their
decisions, but as the number of applicants grows, manual evaluation
becomes inefficient. To address this challenge, machine learning offers a
scalable, data-driven solution that can automate and optimize the admission
prediction process.
4. Advances in Machine Learning and Predictive Analytics:
Machine learning techniques like regression, decision trees, and neural
networks have shown promising results in classification tasks, making them
well-suited for admission prediction. These models can analyze historical
data and identify trends, improving accuracy and efficiency.
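As a rough illustration of how a regression-style model can turn applicant features into an admission probability, the sketch below fits a logistic regression by batch gradient descent in plain Python. It is a toy stand-in for the models named above, not the report's method: the six training rows are synthetic, the two features are assumed to be already scaled to [0, 1], and the learning rate and epoch count are arbitrary choices.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(xs[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(xs, ys):
            # Prediction error for this sample drives the gradient.
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        # Average the gradients over the batch before stepping.
        w = [wi - lr * g / len(xs) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(xs)
    return w, b


def predict_proba(w, b, x):
    """Estimated probability of admission for one feature vector."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Synthetic data: [scaled test score, scaled SGPA] -> admitted (1) or not (0).
X = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.7], [0.3, 0.4], [0.2, 0.3], [0.4, 0.2]]
y = [1, 1, 1, 0, 0, 0]

w, b = train_logistic(X, y)
strong = predict_proba(w, b, [0.85, 0.9])
weak = predict_proba(w, b, [0.25, 0.3])
print(f"strong profile: {strong:.2f}, weak profile: {weak:.2f}")
```

The model assigns a higher probability to the stronger profile than to the weaker one. A production system would typically rely on an established library and a much larger historical dataset.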
1.3 Motivation
strategically. A prediction system can provide insights into where a student
has the highest chances of admission.
Chapter 2
Literature Survey
2.1 Basic Terminologies:
3. Classification:
A machine learning task that assigns data to predefined categories (e.g.,
admitted or not admitted).
4. Training Data:
Historical data used to teach the machine learning model by showing input
features and known outcomes.
5. Test Data:
A dataset used to evaluate the performance of the trained model on unseen
data.
6. Feature Engineering:
The process of selecting and transforming variables (features) from raw
data to improve model performance.
7. Data Preprocessing:
Cleaning and preparing raw data for the model by handling missing values,
normalizing data, and encoding categorical features.
8. Cross-Validation:
A technique used to assess model performance by dividing the data into
multiple training and testing sets, so that the performance estimate does
not depend on a single split.
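The idea behind training data, test data, and cross-validation can be sketched with a small k-fold splitter. This is an illustrative, standard-library version written for this example. The function name `k_fold_indices` and its parameters are hypothetical and do not belong to any library.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


splits = list(k_fold_indices(n_samples=10, k=5))
for train_idx, test_idx in splits:
    print(f"train={sorted(train_idx)} test={sorted(test_idx)}")
```

A model would be trained on each `train_idx` subset and scored on the matching `test_idx` subset, and the scores averaged across the k folds.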
2. Admit Predictor: Admit Predictor is another widely used tool that leverages past
admission data to predict outcomes. It incorporates regression-based machine learning
models to estimate the probability of admission based on factors like GPA, GRE scores,
and TOEFL results. While the tool is useful for students to gauge their chances, it often
lacks detailed institution-specific insights, limiting its effectiveness in certain cases.
accurate predictions. However, their system focuses mainly on U.S. colleges and
universities, limiting its scope for predicting admission to foreign institutions.
1. Accuracy Variations: Most existing systems are limited by the quality and
amount of data available. Without comprehensive, institution-specific datasets, their
predictions can lack accuracy, especially for lesser-known schools or programs.
2. Lack of Qualitative Analysis: While some platforms like CollegeVine
consider qualitative factors such as essays and recommendations, many tools are
overly reliant on quantitative data (GPA, test scores). This leaves out important
aspects of the application process, leading to incomplete predictions.
3. Generalization Issues: Many models are trained on data from a limited set of
universities, which can cause them to generalize poorly to new or foreign institutions
with unique admission policies.
4. Inconsistent Data: User-reported data on platforms like GradCafe can be
unreliable, affecting the accuracy of the predictions.
5. Low-Resource Institutions: Predictive tools often struggle with institutions
that don’t receive a high volume of applications, as the data needed to train models
effectively is scarce for these schools.
2.3 Problem Statement:
Pillai HOC College of Engineering and Technology
Chapter 3
Requirement Gathering
3.1 Software and Hardware Requirements:
Software Requirements:
Hardware Requirements
GPU: Dedicated GPU (e.g., NVIDIA GeForce) for deep learning tasks.
Chapter 4
Plan of Project
4.1 Method of Work:
Needs Assessment:
Technology Selection:
Chatbot Design:
Model Development:
Train machine learning models using the collected data, focusing on natural
language understanding (NLU), intent recognition, and entity extraction to
improve the chatbot’s ability to comprehend user queries.
Fig 4.2.1 Block Diagram
Fig 4.2.2 Flow Diagram
Fig. 4.2.3 Use Case Diagram
Chapter 5
Conclusion
prioritizes real-time performance, making it suitable for applications in
diverse fields such as customer support, translation services, and content
moderation. Addressing challenges related to low-resource languages and
mixed-language inputs remains essential for enhancing the system's
reliability. Future developments may focus on expanding language support
and refining algorithms for even better performance. Ultimately, this
system contributes to bridging communication gaps in our increasingly
globalized world.
References