0% found this document useful (0 votes)

10 views3 pages

Aryan 2022PH11425

PrepAccelerator is an AI-powered platform designed to enhance coding interview preparation through features like Ratings Forecaster, Problems Suggester, context-aware code completion, an LLM-based chatbot, and personalized learning paths using GNNs. The platform addresses the challenges candidates face in preparing for technical interviews by providing tailored insights and resources based on individual performance. It utilizes datasets from Leetcode and Codeforces to offer a comprehensive and adaptive learning experience.

Uploaded by

aryansudan289

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

10 views3 pages

Aryan 2022PH11425

Uploaded by

aryansudan289

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 3

AIL721: Deep Learning IIT Delhi

PrepAccelerator: Internship and Placement Interview

Preparation Tool

Aryan Sudan 2022PH11425

Abstract: Software engineering interviews, especially at the entry level, focus heavily on Data Struc-
tures and Algorithms problems. Preparing for these interviews is a daunting task for many. PrepAccelerator
streamlines interview preparation using many DL-based features such as Ratings Forecaster, Problems Sug-
gester, code editior with context aware code completion in C++, Java and Python3, LLM-Based Chatbot
that asks questions in the style of Coding interviews, personalized learning path generation using GNNs.

1 Introduction
In today’s competitive job market, technical interviews play a crucial role in securing software engineering
positions. However, many candidates struggle with algorithmic problem-solving, coding efficiency, and tech-
nical communication due to a lack of structured practice. To address this, PrepAccelerator aims to develop
an AI-powered coding interview preparation platform that provides an interactive and personalized learning
experience. The features that are aimed to be integrated on this platform are:

1. Ratings Forecaster: In order to prepare for coding interviews, many candidates participate in contests
on platforms like Codeforces, Codechef and Leetcode. A dynamic numerical rating is assigned to every
participant based on the performance in the contest. Ratings forecaster will predict rating changes for
a participant before the official contest, based on contest history.

2. Problems Suggester: This feature helps users improve on weaker areas in their preparation. After a
contest, once problem tags (Topic, Level) for every problem have been updated, PS will suggest top-K
(K is subject to user setting) relevant problems to solve and improve performance in next contest.
Problems will be suggested from Leetcode and Codeforces (due to dataset availability). User can also
enter topics, difficulty, to be suggested top-K relevant problems from afore mentioned sites.

3. Context aware code completion: Self implemented feature that is found in most code editors. It
is aimed to train 3 separate models for CPP, Python and Java. Models will be either CNN, RNN or
LSTM (Subject to performance) . Uses combination of Data Structures and DL models for predictive
code completion.

4. LLM Based Chatbot: Fine tuned open source LLMs like LLaMA 2 to ask questions in the style
of a Coding interview. Evaluates answers and provides feedback to the user. This feature will help
candidates practice for real time interviews.

5. Personalized Learning Path Generation using GNNs: The GNN model predicts the next best topic
to study based on user performance. GNN ranks topics and suggests adaptive learning paths based on
user data from Leetcode and Codeforces.

2 Problem Statement and Key Challenges

2.1 Problem Statement
Candidates preparing for Data Structures and Algorithms interview rounds lack personalized learning paths
and insights. Preparation for these interview rounds isn’t ”one-size-fits-all” as not everyone has the same
strong/weak topics. For some, graph algorithms might be intuitive, but that may not be the case for other
people. There is a need for a platform that derives insights from DL models and personalizes interview
preparation.

2.2 Expected Challenges

Data Collection and Quality Acquiring high-quality, diverse, and up-to-date interview questions. Ensuring a
balance of easy, medium, and hard questions across different topics. Handling bias in datasets (ex. over-
representing certain topics or difficulty). Problems related to scalability and performance (Ensuring the
platform performs well under high user loads. Optimizing model inference speed for real-time feedback.)

3 Dataset Description
The datasets used for the Problems Suggester feature are:

1. Leetcode Problem Dataset (Kaggle): File Format: CSV. The LeetCode Problems Dataset consists
of 1,825 coding problems collected from LeetCode. This dataset contains rich metadata about each
problem, including difficulty level, company tags, problem frequency, and acceptance rates.

Feature Description
id Unique problem identifier
title Name of the coding problem
description Full text of the problem statement
is premium Indicates if a premium account is required (Boolean)
difficulty Problem difficulty (Easy, Medium, or Hard)
solution link Link to the problem’s solution
acceptance rate Percentage of correct submissions
frequency How often the problem is attempted
url Link to the problem on LeetCode
discuss count Number of discussion threads on the problem
accepted Number of times the solution was accepted
submissions Total number of submissions
companies Companies that have asked the problem
related topics Topics related to the problem (e.g., Graphs, DP)
likes Number of likes received by the problem
dislikes Number of dislikes received by the problem
likes
rating Rating score calculated as likes+dislikes
asked by faang Indicates if the problem was asked by FAANG companies
similar questions List of similar problems with metadata

2. Codeforces Dataset (Kaggle): File Format: CSV. The dataset consists of 6,819 Codeforces prob-
lems. The dataset consists of two main columns: Problem Statement and Problem Tags. Problem
Statement is the full text from the problem page. Problem Tags are comma-separated tagged classes

The dataset(s) used for context aware code completion are:

1. CodeSearchNet (GitHub): The CodeSearchNet dataset is a collection of code snippets and their
corresponding natural language descriptions, designed to train and evaluate machine learning models
for code retrieval and code completion tasks. It has over 2 million code snippets in Python, Java and
CPP.

2
Field Name Description

code The source code snippet

docstring The natural language description of the code
code tokens Tokenized version of the code
docstring tokens Tokenized version of the docstring
func name Name of the function/method
repo GitHub repository URL
path File path of the function in the repo
partition Whether the sample is for train, validation, or test

Miscellaneous Data: Fetched from Codeforces (Contest History, Rating changes etc.)

4 Tentative Proposed Methodology

4.1 Ratings Forecaster
Contest history is equivalent to handling time series data and making predictions. It is planned to explore
the various statistical and DL methods of dealing with time series data.

1. Statistical Methods: Moving Average (MA), Autoregressive Integrated Moving Average (ARIMA),
Exponential Smoothing (ETS).

2. DL methods: LSTMs, Transformer models (Time-Series BERT, Temporal Fusion Transformer)

4.2 Problems Suggester

After each contest, utilize the updated problem tags on Codeforces to generate numerical embeddings for
the same using Word2Vec. Problems within +-200 of the user’s current rating will be used to generate
embeddings. Using this embedding, calculate cosine similarity between the embeddings for the leetcode and
codeforces problems and select the top-K similar problems which are to be displayed to the user. Similar
procedure of calculating cosine similarity is to be followed when user inputs choice topics and desired difficulty.

4.3 Context Aware Code Completion

Standard task of next word prediction. Using CodeSearchNet dataset, experiment with various models such
as CNNs, RNNs, LSTM and select the best performing model and deploy it.

4.4 LLM Based Chatbot

Fine tune LLaMa 2 to ask questions and give responses and feedback in the style of an interviewer. Using
Supervised Fine-Tuning, training LLaMa 2 on a dataset of coding interview conversations (question-answer
pairs, feedback, hints). Preparing the desired Dataset (utiling Leetcode Dataset), and ultimately utilizing
QLoRA (low-rank adapters) for efficient fine-tuning.

4.5 Personalized Learning Path Generation using GNNs

Problems are modeled as nodes and edges are between dependent topics and topics of similar difficulty.
Based on input of user embeddings (past performance, accuracy and time taken), problem embeddings
(Topics, difficulty, acceptance rate), the model recommends topics to learn.

Problem Statements For Intel Unnati Industrial Training 2025
No ratings yet
Problem Statements For Intel Unnati Industrial Training 2025
13 pages
Code Agents
No ratings yet
Code Agents
24 pages
Competition Level Code Generation With Alphacode
No ratings yet
Competition Level Code Generation With Alphacode
74 pages
Code Generation With LLMs
No ratings yet
Code Generation With LLMs
59 pages
Codeforces
No ratings yet
Codeforces
3 pages
Team13 DevRev Report
No ratings yet
Team13 DevRev Report
14 pages
Thrilled by Your Progress! Large Language Models (GPT-4) No Longer Struggle To Pass Assessments in Higher Education Programming Courses
No ratings yet
Thrilled by Your Progress! Large Language Models (GPT-4) No Longer Struggle To Pass Assessments in Higher Education Programming Courses
15 pages
Major Complete Presentation - Major Project Presentation.
No ratings yet
Major Complete Presentation - Major Project Presentation.
28 pages
Sentiment Analysis Using NLP
No ratings yet
Sentiment Analysis Using NLP
42 pages
Automatic Question & Answer Generation Using Generative Large Language Model (LLM)
No ratings yet
Automatic Question & Answer Generation Using Generative Large Language Model (LLM)
52 pages
SEM RESPOSTA - 736496689-Google-Cloud-Professional-Machine-Learning-Engineer-Exam-Questions
No ratings yet
SEM RESPOSTA - 736496689-Google-Cloud-Professional-Machine-Learning-Engineer-Exam-Questions
82 pages
OpenCoder 1731317971
No ratings yet
OpenCoder 1731317971
35 pages
Paper 1
No ratings yet
Paper 1
10 pages
Building LLM Applications For Production
100% (3)
Building LLM Applications For Production
28 pages
Frad Detection Finfinacial Transaction
No ratings yet
Frad Detection Finfinacial Transaction
8 pages
Temporary Report
No ratings yet
Temporary Report
28 pages
Amit PDF Report Train
No ratings yet
Amit PDF Report Train
20 pages
CB SC P2cse23010
No ratings yet
CB SC P2cse23010
30 pages
LLM For QnA Proposal
No ratings yet
LLM For QnA Proposal
12 pages
O C: T O C T - T C L L M: PEN Oder HE PEN Ookbook For OP IER ODE Arge Anguage Odels
No ratings yet
O C: T O C T - T C L L M: PEN Oder HE PEN Ookbook For OP IER ODE Arge Anguage Odels
35 pages
LLM Benchmarks
No ratings yet
LLM Benchmarks
5 pages
Russel Investments Prep JD
No ratings yet
Russel Investments Prep JD
14 pages
Seed Coder
No ratings yet
Seed Coder
46 pages
E1. Code Language Models
No ratings yet
E1. Code Language Models
40 pages
Guiding Large Language Models With Divide-and-Conquer Program For Discerning Problem Solving
No ratings yet
Guiding Large Language Models With Divide-and-Conquer Program For Discerning Problem Solving
18 pages
Problem Statements For KLEOS 2.0
No ratings yet
Problem Statements For KLEOS 2.0
33 pages
Code Generation 2305.10679v1
No ratings yet
Code Generation 2305.10679v1
13 pages
AI Agent UC Berkeley
No ratings yet
AI Agent UC Berkeley
14 pages
Explaining Competitive-Level Programming Solutions Using LLMs
No ratings yet
Explaining Competitive-Level Programming Solutions Using LLMs
14 pages
Lab Session1 25oct2024
No ratings yet
Lab Session1 25oct2024
29 pages
LLaMA Ankit - Rawat
No ratings yet
LLaMA Ankit - Rawat
52 pages
CHATGPT
No ratings yet
CHATGPT
12 pages
代码大模型
No ratings yet
代码大模型
18 pages
AI PRACTICAL PROJECT Parthh
No ratings yet
AI PRACTICAL PROJECT Parthh
13 pages
23CS401 Aiml Lab Manual PDF
No ratings yet
23CS401 Aiml Lab Manual PDF
55 pages
Devansh
No ratings yet
Devansh
1 page
Ali Ahmad and Rameez - Project - Proposal
No ratings yet
Ali Ahmad and Rameez - Project - Proposal
5 pages
Leveraging Large Language Models To Generate Course-Specific Semantically Annotated Learning Objects
No ratings yet
Leveraging Large Language Models To Generate Course-Specific Semantically Annotated Learning Objects
20 pages
AIML Developer - Assignment (Level 1) - 250607 - 120042
No ratings yet
AIML Developer - Assignment (Level 1) - 250607 - 120042
4 pages
INFO 7375 & Prompt Engineering For Generative AI
No ratings yet
INFO 7375 & Prompt Engineering For Generative AI
7 pages
Cluster3 Prompt Engineering Generative AI Practice Summary
No ratings yet
Cluster3 Prompt Engineering Generative AI Practice Summary
6 pages
LLM's For Code Generation
No ratings yet
LLM's For Code Generation
31 pages
Identifing Software Bugs or Not Using SMLT Model
No ratings yet
Identifing Software Bugs or Not Using SMLT Model
34 pages
Angshuman Sengupta 17
No ratings yet
Angshuman Sengupta 17
1 page
Report Sentiment Analysis Marcos Matheus
No ratings yet
Report Sentiment Analysis Marcos Matheus
12 pages
Data Science Lab-KTU
No ratings yet
Data Science Lab-KTU
5 pages
An Effective Query System Using Llms and Langchain IJERTV12IS060161
No ratings yet
An Effective Query System Using Llms and Langchain IJERTV12IS060161
4 pages
Augmenting LLMs Survey
No ratings yet
Augmenting LLMs Survey
33 pages
Updated Project's Draft Paper
No ratings yet
Updated Project's Draft Paper
5 pages
Nidhish Resume NC
No ratings yet
Nidhish Resume NC
1 page
LLM Review
No ratings yet
LLM Review
31 pages
CV Format
No ratings yet
CV Format
1 page
From GPT To BERT:: Benchmarking Large Language Models For Automated Iz Generation
No ratings yet
From GPT To BERT:: Benchmarking Large Language Models For Automated Iz Generation
2 pages
6171675-Ix Std-Artificial Intelligence - Retestpostmidtermqp
No ratings yet
6171675-Ix Std-Artificial Intelligence - Retestpostmidtermqp
6 pages
General Material
No ratings yet
General Material
16 pages
RAI AI Engineer Intern Assignments
No ratings yet
RAI AI Engineer Intern Assignments
3 pages
JNC Navig8 International Map Installation Instructions
0% (1)
JNC Navig8 International Map Installation Instructions
13 pages
Test For Agile E1 Certification TCS
100% (1)
Test For Agile E1 Certification TCS
6 pages
Sandeep Garg BST PDF Free
No ratings yet
Sandeep Garg BST PDF Free
245 pages
Somenzo Regedit
No ratings yet
Somenzo Regedit
3 pages
5069-OW16 Blinking Red Channel Lights
No ratings yet
5069-OW16 Blinking Red Channel Lights
3 pages
Raviteja Resume GD
No ratings yet
Raviteja Resume GD
2 pages
Stetson Blake - Downloaded Shell Samurai - Master The Linux Command Line-Leanpub (2023)
100% (1)
Stetson Blake - Downloaded Shell Samurai - Master The Linux Command Line-Leanpub (2023)
231 pages
02 - System Features
No ratings yet
02 - System Features
74 pages
Software Development With Visual Basic B.com Ca
No ratings yet
Software Development With Visual Basic B.com Ca
122 pages
Digital Signal Processing (ECEG-3171) : Course Description
No ratings yet
Digital Signal Processing (ECEG-3171) : Course Description
7 pages
New Latex PDF
No ratings yet
New Latex PDF
55 pages
4.0: Literature Review: Round Robin CPU Scheduling Algorithm
No ratings yet
4.0: Literature Review: Round Robin CPU Scheduling Algorithm
9 pages
Implementation of Python in Predicting The Physical
No ratings yet
Implementation of Python in Predicting The Physical
4 pages
AKTU Python 4thsem Complete Notes ReportLab
No ratings yet
AKTU Python 4thsem Complete Notes ReportLab
15 pages
Interviewbit String Level 3
No ratings yet
Interviewbit String Level 3
17 pages
Smart BMS 说明书
No ratings yet
Smart BMS 说明书
6 pages
Assembly Tips
No ratings yet
Assembly Tips
18 pages
REN R11an0073eu0112-Synergy-Ssp-Flashloader APN 20181228
No ratings yet
REN R11an0073eu0112-Synergy-Ssp-Flashloader APN 20181228
63 pages
Cover Letter For Data Manager
100% (1)
Cover Letter For Data Manager
6 pages
CAPN File Settlement Guide - v2.0
No ratings yet
CAPN File Settlement Guide - v2.0
13 pages
BCSL 044 Bcaol 2024 25
No ratings yet
BCSL 044 Bcaol 2024 25
2 pages
How To Register - V6
No ratings yet
How To Register - V6
7 pages
Virtualbox 1
No ratings yet
Virtualbox 1
25 pages
Untitled
No ratings yet
Untitled
22 pages
Trace - 2020-05-29 06 - 04 - 19 155
No ratings yet
Trace - 2020-05-29 06 - 04 - 19 155
4 pages
NavyaKulhari Resume
No ratings yet
NavyaKulhari Resume
2 pages
កិច្ចសន្យាខ្ចីប្រាក់ និងដាក់បញ្ចាំ - PDF
100% (1)
កិច្ចសន្យាខ្ចីប្រាក់ និងដាក់បញ្ចាំ - PDF
1 page
The Cyberx Platform:: Protect Your People, Production, and Profits
No ratings yet
The Cyberx Platform:: Protect Your People, Production, and Profits
11 pages
RFID Reader Programming Instruction
No ratings yet
RFID Reader Programming Instruction
3 pages
eCW Interface Sign-Off Production Server - Metropolitan Medical USDA Clinic
No ratings yet
eCW Interface Sign-Off Production Server - Metropolitan Medical USDA Clinic
3 pages
3 Mesin Perontok Padi
No ratings yet
3 Mesin Perontok Padi
1 page

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.

Aryan 2022PH11425

Uploaded by

Aryan 2022PH11425

Uploaded by

AIL721: Deep Learning IIT Delhi

PrepAccelerator: Internship and Placement Interview

Aryan Sudan 2022PH11425

2 Problem Statement and Key Challenges

2.2 Expected Challenges

The dataset(s) used for context aware code completion are:

code The source code snippet

4 Tentative Proposed Methodology

2. DL methods: LSTMs, Transformer models (Time-Series BERT, Temporal Fusion Transformer)

4.2 Problems Suggester

4.3 Context Aware Code Completion

4.4 LLM Based Chatbot

4.5 Personalized Learning Path Generation using GNNs

You might also like

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.