
AI-Driven Multimodal Journey Planning and Document Intelligence: An Industry Internship Experience

(IT 764: Seminar and Presentation Report)
submitted in partial fulfillment of the requirements
for the award of the degree of

MASTER OF COMPUTER APPLICATIONS


(SOFTWARE ENGINEERING)
Submitted by

SAHIL PAHUJA
(01216404523)
Under the supervision of

Dr. SONAM
(ASSISTANT PROFESSOR)

UNIVERSITY SCHOOL OF INFORMATION, COMMUNICATION & TECHNOLOGY
GURU GOBIND SINGH INDRAPRASTHA UNIVERSITY
Dwarka, Sector - 16 C, New Delhi - 110078
May/June 2025
DECLARATION

This is to certify that I, Sahil Pahuja, Enrollment No. 01216404523, MCA (SE) 4th semester, USICT, GGSIPU, Delhi, take full responsibility for the content of this dissertation, the source code, and the relevant modules, and am accountable for the work done by me.

I hereby declare that the dissertation / internship report and any other reports of the course work, the source code, and the relevant modules are not plagiarized from any source, directly or indirectly. If any plagiarism is found, I shall be responsible for the same.

I hereby further certify that:


1)​ The work contained in the dissertation is original and has been done by me.
2)​ The work has not been submitted to any other Institution for any other
degree/diploma/certificate in this university/other university or any organization, etc. of
India or abroad.
3)​ I have followed the standard guidelines in writing the dissertation.
4)​ I hereby declare that I will not upload or publish this dissertation in any of the online or
offline forums.
5)​ Whenever I have used the materials (data, theoretical analysis and text) from other
sources, I have given due credits to them in the text of the reports and given their details
in the references.

Date: 13 / 03 / 2025
New Delhi

Signature:

Name: Sahil Pahuja


Enrollment No: 01216404523
Phone No. 8586800215
Email ID: sahilpahuja123456@gmail.com
MCA 4th Sem

USICT, GGSIPU, New Delhi


CERTIFICATE FROM THE COMPANY

This is to certify that Sahil Pahuja (Enrollment No. 01216404523) has undertaken internship at
Carnot Research Private Limited from 3rd February 2025 to 3rd August 2025 under the
supervision and guidance of Mr. Pranav Kanire, Senior Developer.

I hereby confirm that the performance and conduct of Sahil Pahuja during the internship period
have been satisfactory.

Col (Dr) Amit Oberoi
CEO
Carnot Research Private Limited

(Seal of the Company)
CERTIFICATE FROM THE INSTITUTE

This is to certify that the work embodied in this Internship Report titled “AI-Driven
Multimodal Journey Planning and Document Intelligence: An Industry Internship
Experience” being submitted in partial fulfillment of the requirements for the award of the
degree of Master of Computer Applications (Software Engineering), is original and has been
carried out by Sahil Pahuja (Enrollment No. 01216404523) under my supervision and guidance.

It is further certified that this Internship work has not been submitted in full or in part to this
university or any other university for the award of any other degree or diploma to the best of my
knowledge and belief.

Dr. Sonam​
Assistant Professor​
USICT, GGSIPU
INDEX
ABSTRACT
TOOLS & TECHNOLOGIES
    1. Python Ecosystem
        Core Python and Its Versatility
        Deep Learning and Machine Learning Frameworks
        Specialized Libraries for Speech and Audio Processing
        Integration and Application
    2. SQL Database Management
        Role and Benefits of SQL
        Database Design and Query Optimization
    3. Git & GitHub
        Git – The Distributed Version Control System
        GitHub – Cloud-Based Collaboration Platform
        Integration in the Project Workflow
        Collaborative Best Practices
    4. Large Language Models (LLMs)
        Key Characteristics and Applications
        Fine-Tuning for Domain-Specific Use
    5. Retrieval-Augmented Generation (RAG)
        Concept and Functionality
        Benefits and Use Cases
    6. LangChain
        Concept and Functionality
        Modular Design and Components
        Integration in the Journey Planning System
FUTURE SCOPE
CONCLUSION
REFERENCES
ABSTRACT

This internship report details the comprehensive design, development, and evaluation of
innovative AI-driven systems addressing significant challenges in urban mobility and speech
processing. The first system is an intelligent transport information platform that integrates
real-time data from multiple transport APIs with a robust SQL-based database. By leveraging
advanced techniques such as Retrieval-Augmented Generation (RAG) and fine-tuned Large
Language Models (LLMs), this platform delivers dynamic transit information, accurate route
recommendations, and timely updates on schedules, fares, and disruptions, ultimately enhancing
commuter experience and accessibility.

The second system focuses on advanced speaker diarization aimed at improving transcription
accuracy in multi-speaker environments. Utilizing state-of-the-art speech recognition models
including OpenAI’s Whisper and speaker embedding techniques from SpeechBrain, this solution
employs sophisticated clustering algorithms like K-Means alongside robust audio processing
tools (Librosa, PyDub, and SciPy) to effectively differentiate between overlapping voices.
Detailed processes such as feature extraction, noise reduction, and voice activity detection ensure
the generation of clear and structured transcripts, which are crucial for applications like meeting
transcriptions, interviews, and broadcast media.

Throughout this report, the integration of cutting-edge technologies is emphasized. Python serves
as the backbone for backend development, supported by its rich ecosystem of libraries such as
TensorFlow and PyTorch for machine learning and deep learning applications. Additionally, the
report touches on the importance of modern development tools and practices, including version
control with Git and GitHub, which streamline collaborative software engineering efforts. The
synthesis of these technologies not only enhances operational efficiency and accuracy but also
demonstrates the transformative impact of AI-driven solutions in addressing real-world
challenges in both transport automation and speech processing.

Overall, this comprehensive analysis underscores the significant advancements achieved during
the internship, providing valuable insights into the deployment of scalable, real-time AI
applications that are poised to redefine modern urban mobility and communication systems.
TOOLS & TECHNOLOGIES

1. Python Ecosystem
Python stands as the backbone of modern AI and software development, thanks to its readability,
extensive libraries, and a supportive community. In this project, Python is not only used as a
programming language but also as an ecosystem enriched by various specialized libraries and
frameworks that streamline the development process. Below is an overview of these key
components:

Core Python and Its Versatility

Python’s simplicity and versatility make it the preferred language for rapid prototyping and
production-level applications alike. Its clear syntax and dynamic typing allow developers to
write efficient code quickly while maintaining readability. Moreover, Python's extensive standard
library supports many basic operations without the need for external dependencies, which is
especially beneficial in data processing and integration tasks.

Deep Learning and Machine Learning Frameworks

●​ TensorFlow:
TensorFlow is an open-source deep learning framework developed by Google. It offers
an extensive range of tools for building and training machine learning models. In our
context, TensorFlow is used to design neural networks that underpin complex AI
functionalities such as image recognition and predictive analytics. Its flexible architecture
enables deployment across various platforms, from desktops to mobile devices.

●​ PyTorch:
Developed by Facebook’s AI Research lab, PyTorch is another leading deep learning
framework known for its dynamic computation graph, which is particularly advantageous
during model experimentation and debugging. PyTorch’s intuitive interface and robust
community support have made it a popular choice for research and production alike. In
this project, PyTorch is leveraged for tasks requiring rapid model iteration and seamless
integration with custom machine learning workflows.
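
As an illustration of the dynamic-graph workflow mentioned above, here is a minimal PyTorch sketch of a small classifier and a single training step; the architecture, layer sizes, and data are placeholders, since the report does not specify the actual models used in the project.

    import torch
    import torch.nn as nn

    class SmallClassifier(nn.Module):
        """Toy feed-forward network; layer sizes are illustrative only."""
        def __init__(self, in_dim=40, hidden=64, n_classes=2):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(in_dim, hidden),
                nn.ReLU(),
                nn.Linear(hidden, n_classes),
            )

        def forward(self, x):
            return self.net(x)

    model = SmallClassifier()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    # One training step on random tensors, just to show the define-by-run loop.
    x = torch.randn(8, 40)
    y = torch.randint(0, 2, (8,))
    loss = loss_fn(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"loss after one step: {loss.item():.4f}")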

Specialized Libraries for Speech and Audio Processing

●​ OpenAI Whisper:
OpenAI Whisper is a state-of-the-art automatic speech recognition (ASR) model that
converts spoken language into text with high accuracy. Its robust design allows it to
handle diverse accents and noisy backgrounds, making it ideal for applications like
speaker diarization. Whisper plays a central role in transcribing audio data accurately,
which is critical for downstream natural language processing tasks.

●​ SpeechBrain:
SpeechBrain is an open-source toolkit designed for various speech processing tasks,
including speaker recognition, speech enhancement, and more. It offers pre-trained
models and flexible modules that simplify the process of integrating advanced audio
processing functionalities. In our system, SpeechBrain supports the extraction of speaker
embeddings, which are then used in clustering algorithms to differentiate between voices.

●	Librosa, PyDub, and SciPy:
These libraries collectively empower developers to perform detailed audio signal
processing. Librosa is widely used for music and audio analysis, offering a range of tools
for feature extraction such as Mel-frequency cepstral coefficients (MFCCs). PyDub
simplifies the handling of audio file formats and conversions, while SciPy provides
essential scientific computing tools to perform signal processing tasks like filtering and
Fourier analysis. Together, they create a robust pipeline for preparing audio data for
analysis and model training.
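
A minimal sketch of how these three libraries fit together in the pre-processing stage is shown below; the file names, sampling rate, and filter settings are illustrative assumptions rather than the project's exact configuration.

    import librosa
    from pydub import AudioSegment
    from scipy.signal import butter, filtfilt

    # Convert an arbitrary input recording to 16 kHz mono WAV with PyDub.
    AudioSegment.from_file("meeting.mp3").set_frame_rate(16000).set_channels(1) \
        .export("meeting.wav", format="wav")

    # Load the audio with Librosa.
    y, sr = librosa.load("meeting.wav", sr=16000)

    # Simple SciPy high-pass filter to suppress low-frequency hum.
    b, a = butter(4, 80, btype="highpass", fs=sr)
    y_clean = filtfilt(b, a, y)

    # MFCC features of the cleaned signal, used downstream for speaker
    # embedding extraction and clustering.
    mfccs = librosa.feature.mfcc(y=y_clean, sr=sr, n_mfcc=13)
    print(mfccs.shape)  # (13, number_of_frames)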

Integration and Application

The Python ecosystem's modularity allows for seamless integration of these libraries into a
unified framework. By combining TensorFlow and PyTorch for model training with specialized
audio processing libraries, developers can build sophisticated AI-driven applications. This
ecosystem not only supports rapid prototyping but also ensures that systems are scalable and
maintainable in production environments. The flexibility to switch or integrate additional
libraries as project requirements evolve is one of Python’s strongest assets.

The overall strategy is to utilize Python as the glue that binds various AI components
together—from data ingestion and preprocessing to model training and deployment. This
approach ensures that the system remains agile, allowing for continuous improvements and
updates as new tools and techniques emerge in the field.
2. SQL Database Management
Structured Query Language (SQL) is critical for managing and retrieving organized data
efficiently. In our project, SQL is used to maintain a robust, relational database that stores
historical transit information, real-time updates, and user query logs.

Role and Benefits of SQL

SQL databases provide a systematic and efficient way to store data in structured tables, which
can be easily queried using standardized commands. This structure ensures data integrity,
reduces redundancy, and enables rapid access to large datasets—essential for applications
requiring real-time responses. The SQL-based system supports efficient data retrieval, which is
vital when the AI-driven models need to pull contextual information for tasks such as route
planning and schedule updates.

Database Design and Query Optimization

Proper database design is crucial to handle the large volumes of data generated by transport APIs
and user interactions. The design involves creating well-indexed tables that support fast queries
and ensure scalability. Query optimization techniques, such as caching frequently accessed data
and using stored procedures, are implemented to minimize latency and improve the overall
responsiveness of the system.
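
The sketch below illustrates the idea of an indexed schedule table and a parameterized lookup, using Python's built-in sqlite3 module; the table and column names are hypothetical, as the report does not list the actual schema.

    import sqlite3

    conn = sqlite3.connect("transit.db")
    cur = conn.cursor()

    # Hypothetical schedule table for route/stop departures and fares.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS schedules (
            id INTEGER PRIMARY KEY,
            route_id TEXT NOT NULL,
            stop_name TEXT NOT NULL,
            departure_time TEXT NOT NULL,
            fare REAL
        )
    """)

    # Composite index on the columns used by the most frequent lookup.
    cur.execute("""
        CREATE INDEX IF NOT EXISTS idx_route_stop
        ON schedules (route_id, stop_name)
    """)

    # Parameterized query: next five departures for a given route and stop.
    cur.execute(
        "SELECT departure_time, fare FROM schedules "
        "WHERE route_id = ? AND stop_name = ? "
        "ORDER BY departure_time LIMIT 5",
        ("BLUE", "Dwarka Sector 21"),
    )
    print(cur.fetchall())
    conn.close()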

3. Git & GitHub
Version control is an indispensable part of modern software engineering, and Git combined with
GitHub forms the cornerstone of collaborative development practices in this project.

Git – The Distributed Version Control System

Git is a powerful distributed version control system that allows multiple developers to work on
the same codebase simultaneously. It tracks changes meticulously, enabling developers to revert
to previous versions when necessary, and supports branching and merging to facilitate parallel
development. Git’s ability to handle distributed workflows makes it ideal for both individual
developers and large teams.

GitHub – Cloud-Based Collaboration Platform

GitHub builds on Git’s functionalities by providing a web-based interface for repository hosting,
issue tracking, and collaborative code reviews. It facilitates seamless integration with continuous
integration and deployment (CI/CD) pipelines, ensuring that code changes are automatically
tested and deployed. GitHub’s pull request mechanism allows team members to propose changes
and review code collaboratively, fostering an environment of code quality and collective
ownership.

Integration in the Project Workflow

In this project, Git and GitHub are used extensively to manage the entire lifecycle of code
development—from initial prototyping to final deployment. Each feature or bug fix is developed
in isolated branches, which are then reviewed and merged into the main branch only after
rigorous testing. This practice not only helps in maintaining code integrity but also makes it
easier to track progress and pinpoint issues. The transparency and accountability provided by
GitHub’s issue tracking and commit history play a crucial role in ensuring that the project
remains on schedule and meets quality standards.

Collaborative Best Practices

The use of Git and GitHub promotes several best practices such as code reviews, regular
commits, and detailed documentation of changes. These practices are essential for maintaining
high-quality code, especially in complex projects involving multiple technologies. Additionally,
GitHub’s collaboration tools make it easier for new team members to onboard and understand the
project structure quickly.

4. Large Language Models (LLMs)
Large Language Models (LLMs) are sophisticated AI models designed to understand, generate,
and translate human-like text. In our project, LLMs serve as the backbone for natural language
processing tasks, including query understanding and dynamic response generation.

Key Characteristics and Applications

LLMs are trained on vast amounts of textual data, which enables them to generate coherent and
contextually relevant responses. They are capable of performing a wide range of tasks—from
answering questions and summarizing content to facilitating conversational interactions in
chatbots. Their adaptability makes them ideal for applications in customer service, content
generation, and real-time query processing.

Fine-Tuning for Domain-Specific Use

To enhance performance in specific domains, LLMs can be fine-tuned on specialized datasets. This customization improves their accuracy and contextual understanding, enabling them to handle industry-specific jargon and scenarios. In our project, fine-tuning ensures that the model not only understands the nuances of urban mobility and transit information but also provides precise, context-aware responses that improve user experience.
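
As a rough illustration of what such fine-tuning can look like in code, the sketch below trains a small causal language model on a handful of transit-domain question-answer strings using the Hugging Face transformers and datasets libraries; the choice of libraries, base model, and data format are assumptions made for illustration, not the project's actual fine-tuning setup.

    from datasets import Dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    # Tiny placeholder corpus of transit-domain Q&A strings.
    samples = [
        {"text": "Q: How frequent are Blue Line trains at peak hours? "
                 "A: Roughly every 5 minutes."},
        {"text": "Q: Which line serves Dwarka Sector 21? "
                 "A: The Blue Line and the Airport Express."},
    ]

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    def tokenize(example):
        return tokenizer(example["text"], truncation=True, max_length=128)

    dataset = Dataset.from_list(samples).map(tokenize, remove_columns=["text"])
    collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="transit-finetune",
                               num_train_epochs=1,
                               per_device_train_batch_size=2),
        train_dataset=dataset,
        data_collator=collator,
    )
    trainer.train()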

5. Retrieval-Augmented Generation (RAG)
Retrieval-Augmented Generation (RAG) is an advanced framework that enhances the
performance of LLMs by integrating external knowledge sources during response generation.

Concept and Functionality

RAG combines traditional language modeling with real-time data retrieval techniques. When a
query is made, the model searches through external databases and knowledge repositories to
fetch relevant information before generating a final response. This hybrid approach minimizes
the risk of hallucination (i.e., generating factually incorrect information) and significantly
improves the accuracy and reliability of AI responses.
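
To make the retrieve-then-generate idea concrete, here is a deliberately simplified sketch that uses TF-IDF similarity for retrieval and only assembles the grounded prompt; in the actual platform the retrieval layer is backed by the transit database and the prompt is passed to an LLM. The example documents are invented.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    documents = [
        "Blue Line trains run every 5 minutes during peak hours.",
        "The fare from Dwarka Sector 21 to Rajiv Chowk is Rs 50.",
        "Airport Express services are suspended this Sunday for maintenance.",
    ]

    vectorizer = TfidfVectorizer()
    doc_matrix = vectorizer.fit_transform(documents)

    def retrieve(query, k=2):
        """Return the k documents most similar to the query."""
        scores = cosine_similarity(vectorizer.transform([query]), doc_matrix)[0]
        top = scores.argsort()[::-1][:k]
        return [documents[i] for i in top]

    def build_prompt(query):
        context = "\n".join(retrieve(query))
        # In the real pipeline this prompt is sent to the LLM for generation.
        return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

    print(build_prompt("How much is the fare from Dwarka to Rajiv Chowk?"))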

Benefits and Use Cases

The primary benefit of RAG is its ability to provide contextually rich and factually correct
information by grounding AI-generated content in external data. This is particularly valuable in
applications where precision is crucial, such as real-time transit information and customer
support systems. By leveraging RAG, our system ensures that users receive timely, accurate, and
comprehensive responses to their queries, thereby enhancing overall operational efficiency.

6. LangChain
LangChain is an open-source framework for building applications powered by Large Language Models (LLMs), used in this project to orchestrate retrieval, prompting, memory, and tool invocation.

Concept and Functionality

LangChain is a powerful open-source framework designed to simplify the development of applications powered by Large Language Models (LLMs). It abstracts away the complexity of integrating LLMs with various data sources, tools, and user interfaces by providing modular components for chaining together language model operations, memory, prompts, and retrieval systems. In this project, LangChain is used to create structured workflows that combine document retrieval, context-aware generation, and external tool invocation, streamlining the development of intelligent assistants and planners.

Modular Design and Components

LangChain’s architecture is based on the concept of “chains”—pipelines composed of various interconnected modules:

●​ Prompt Templates: Predefined and dynamically formatted prompts to maintain consistent input
for the LLMs.​

●​ LLM Wrappers: Unified interfaces to interact with models like OpenAI GPT, Cohere, or
Anthropic.​

●​ Retrievers: Components that pull relevant chunks from external documents or vector stores.​

●​ Tools and Agents: Used to execute actions like search, computation, or API calling based on
LLM decisions.​

In our system, LangChain serves as the backbone for orchestrating a complex flow where a user
query is parsed, relevant transport or document data is retrieved, and a response is generated using
an LLM enhanced by contextual information.
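
A condensed sketch of such a chain is shown below, written in the LCEL (pipe) style; exact imports and package names vary between LangChain releases, and the retriever here is a stub standing in for the SQL/API-backed lookup.

    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_openai import ChatOpenAI  # assumes the langchain-openai package

    prompt = ChatPromptTemplate.from_template(
        "Answer using only the context below.\n"
        "Context: {context}\n"
        "Question: {question}"
    )
    llm = ChatOpenAI(model="gpt-4o-mini")  # model name is a placeholder

    def fetch_transit_context(question: str) -> str:
        """Stub retriever; the real system queries the SQL backend and transport APIs."""
        return "The Blue Line runs from Dwarka Sector 21 to Noida Electronic City."

    chain = prompt | llm | StrOutputParser()

    question = "Which metro line serves Dwarka Sector 21?"
    print(chain.invoke({"context": fetch_transit_context(question),
                        "question": question}))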

Integration in the Journey Planning System

LangChain plays a pivotal role in enhancing the retrieval-augmented generation (RAG) pipeline:

●​ It connects the retriever (which fetches route, schedule, and fare data from a SQL-based
backend or external API) with the LLM, ensuring grounded, real-time answers.​

●​ Using memory components, the system retains previous user queries to offer conversational
continuity and context-aware suggestions.​

●​ Through agent-based architectures, the assistant can trigger external APIs or databases
conditionally—e.g., fetching live metro status or travel time predictions based on user input.​

Overall, LangChain acts as a middleware layer that enhances the flexibility, reusability, and
maintainability of LLM-powered systems, making it an indispensable tool in the development of
intelligent transport assistants and document analysis solutions.
FUTURE SCOPE

●	Enhanced Real-Time Data Integration: The current system can be expanded by integrating additional real-time data sources, such as traffic patterns, weather conditions, and public transport schedules. This would enhance the accuracy and reliability of urban mobility solutions.

●	Improved Speech Diarization Models: Future iterations can incorporate more advanced deep learning models to enhance the accuracy of speaker identification and separation, even in noisy environments. This would make the speech processing system more robust and effective for real-world applications.

●	Multilingual Speech Processing: Expanding the system to support multiple languages would make it more inclusive and globally applicable. Incorporating multilingual LLMs and fine-tuning RAG models for language-specific contexts could significantly broaden its usability.

●	Predictive Analytics for Urban Mobility: By incorporating machine learning algorithms, the mobility solution could predict future transit delays, traffic congestion, and optimal routes. This would improve the accuracy of travel recommendations.

●	Automated Error Correction in Speech Transcription: Introducing post-processing techniques using NLP models can help detect and correct transcription errors automatically, enhancing the overall accuracy of the speech processing system.

●	Integration with IoT Devices: The system could be integrated with IoT-based mobility devices (e.g., GPS trackers, smart traffic signals) to provide more accurate and real-time updates, further enhancing urban mobility solutions.
CONCLUSION

The internship provided a unique opportunity to integrate a diverse range of technologies—from the dynamic Python ecosystem and robust SQL databases to collaborative tools like Git & GitHub—with advanced AI methodologies such as Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG). This hands-on experience allowed for the development of systems that deliver real-time, context-aware transit updates and precise, structured audio transcriptions, directly addressing the challenges of urban mobility and speech processing.

Throughout the internship, practical challenges were met with innovative solutions, illustrating
how modern software engineering practices can drive tangible improvements. The integration of
various technologies not only enhanced operational efficiency but also ensured scalability and
adaptability in real-world applications. This period of intensive learning and development
reinforced the value of cross-disciplinary collaboration, where theoretical knowledge was
effectively translated into practical, deployable systems.

The experience gained during this internship lays a robust foundation for future advancements.
The methodologies refined during this project—ranging from fine-tuning deep learning models
to managing complex data workflows—will serve as a cornerstone for further innovation in
AI-driven applications. As urban and communication needs continue to evolve, the skills and
insights developed here will be instrumental in driving continuous improvement and
transformative change in the field.
REFERENCES

1.	Python Documentation | https://docs.python.org/
2.	Large Language Models | https://www.ibm.com/think/topics/large-language-models/
3.	Git Documentation | https://git-scm.com/docs/git
4.	GitHub Docs | About Git and GitHub | https://docs.github.com/en/get-started/start-your-journey/about-github-and-git
5.	Structured Query Language (MySQL Documentation) | https://dev.mysql.com/doc/
6.	LangChain | https://www.langchain.com/
7.	Retrieval-Augmented Generation | https://research.ibm.com/blog/retrieval-augmented-generation-RA/
