Slides On Speech To Text Model

The project aims to enhance automatic speech recognition (ASR) for the Pashto language by fine-tuning the Whisper model with domain-specific vocabulary to achieve a Word Error Rate (WER) below 10%. It involves collecting and annotating over 20 hours of speech data and is aligned with various Sustainable Development Goals. The methodology includes diverse data collection, model training, evaluation, and the potential for scalability to other regional languages.

Uploaded by

Afaq Ali Nagra


Name of Project

Enhancing Automatic Speech Recognition for Low-Resource Languages Using a SOTA Model (Whisper/Wav2Vec)

Supervised by: Dr. Shibli Nisar

Syndicate leader: Zarnab Hassan Malik

Problem Definition

This project aims to improve speech-to-text (STT) accuracy for the Pashto language by fine-tuning the Whisper ASR model with additional vocabulary relevant to agriculture, health, banking, food, and services. The project will focus on reducing the Word Error Rate (WER) to below 10% through data collection, annotation, and model optimization. The expected outcome is a more robust ASR system tailored to regional dialects and specialized vocabulary.

Relevant Sustainable Development Goals (SDGs)

• Industry, innovation and infrastructure.

• Linguistic marginalization and digital exclusion.
• Inclusive digital transformation.
• Reducing inequalities.

Objective of Project

Objective
• Collect and annotate 20+ hours of Pashto speech data.
• Fine-tune Whisper ASR with domain-specific vocabulary.
• Reduce the Pashto ASR Word Error Rate (WER) to below 10%.
• Build a scalable model for other regional languages.
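
Progress toward the 20-hour collection target can be tracked with a small tally over a recording manifest. The sketch below is illustrative only: the manifest fields (`path`, `duration_sec`, `dialect`, `domain`), the file names, and the dialect labels are hypothetical placeholders, not part of the original slides.

```python
from collections import defaultdict

TARGET_HOURS = 20.0  # collection target stated in the objectives


def total_hours(clips):
    """Return the total duration of all clips, in hours."""
    return sum(clip["duration_sec"] for clip in clips) / 3600.0


def hours_by_group(clips, key):
    """Sum clip durations per group (e.g. per dialect or domain), in hours."""
    totals = defaultdict(float)
    for clip in clips:
        totals[clip[key]] += clip["duration_sec"] / 3600.0
    return dict(totals)


# Hypothetical manifest entries; real ones would come from the recording sessions.
clips = [
    {"path": "rec_001.wav", "duration_sec": 5400.0, "dialect": "northern", "domain": "health"},
    {"path": "rec_002.wav", "duration_sec": 1800.0, "dialect": "southern", "domain": "banking"},
    {"path": "rec_003.wav", "duration_sec": 3600.0, "dialect": "northern", "domain": "agriculture"},
]

collected = total_hours(clips)
remaining = max(0.0, TARGET_HOURS - collected)
print(f"collected {collected:.1f} h, {remaining:.1f} h remaining")
print("per dialect:", hours_by_group(clips, "dialect"))
```

Grouping by `dialect` or `domain` makes it easier to notice early if one dialect or vocabulary area is under-represented.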

Scope of Project

Scope
• Focus on Pashto STT improvement using Whisper ASR.
• Include diverse dialects and domain-specific terms.
• Perform model training, evaluation, and benchmarking.
• Extendable to other regional languages based on results.
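
For benchmarking, one common practice (an assumption here, not something the slides specify) is to keep the test speakers disjoint from the training speakers, so the measured WER reflects unseen voices. A minimal sketch, with a hypothetical `speaker` field on each clip:

```python
import random


def split_by_speaker(clips, test_fraction=0.2, seed=0):
    """Split clips into train/test so that no speaker appears in both sets."""
    speakers = sorted({clip["speaker"] for clip in clips})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(speakers)
    n_test = max(1, round(len(speakers) * test_fraction))
    test_speakers = set(speakers[:n_test])
    train = [c for c in clips if c["speaker"] not in test_speakers]
    test = [c for c in clips if c["speaker"] in test_speakers]
    return train, test


# Hypothetical clips: 5 speakers with 2 recordings each.
clips = [
    {"path": f"spk{s}_utt{u}.wav", "speaker": f"spk{s}"}
    for s in range(5)
    for u in range(2)
]

train, test = split_by_speaker(clips)
print(len(train), "training clips,", len(test), "test clips")
```

The same idea extends to stratifying by dialect, if each dialect has enough speakers to hold some out.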

Proposed Methodology

Proposed Methodology
• Collect Pashto audio from diverse speakers and dialects.
• Transcribe and annotate speech data with accuracy.
• Fine-tune Whisper ASR using the annotated dataset.
• Evaluate performance using WER and optimize results.
• Build a scalable pipeline.
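
The WER used in the evaluation step is the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. The sketch below computes it in plain Python. The whitespace tokenization and the placeholder token strings are assumptions for illustration; a real evaluation would first apply a Pashto-appropriate text normalization.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance between two token lists."""
    # prev[j] holds the distance between the reference prefix so far and hyp_words[:j].
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i] + [0] * len(hyp_words)
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing from hypothesis)
                curr[j - 1] + 1,     # insertion (extra word in hypothesis)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1]


def corpus_wer(pairs):
    """Corpus-level WER: total edits divided by total reference words."""
    total_edits = 0
    total_ref_words = 0
    for reference, hypothesis in pairs:
        ref_words = reference.split()
        total_edits += edit_distance(ref_words, hypothesis.split())
        total_ref_words += len(ref_words)
    if total_ref_words == 0:
        raise ValueError("references contain no words")
    return total_edits / total_ref_words


# Placeholder tokens standing in for (reference, model output) transcript pairs.
pairs = [
    ("a b c d", "a b c d"),        # exact match: 0 edits over 4 words
    ("a b c d e f", "a x c d e"),  # 1 substitution + 1 deletion over 6 words
]

wer = corpus_wer(pairs)
print(f"WER = {wer:.1%}, meets <10% target: {wer < 0.10}")
```

Summing edits over the whole test set before dividing weights long utterances appropriately, whereas averaging per-utterance WER values does not.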

Skill Set Involved

Resources Involved/Skill Set

Hardware: High-performance GPUs for training, audio recording equipment.
Software: Python, PyTorch, Hugging Face Transformers, Whisper ASR framework.

Project Timeline

Gantt Chart for 1 Year with Deliverables

Task                            Duration      Deliverable
Literature Review               Month 1       Report on existing ASR models
Data Collection                 Months 2–4    20+ hours of recorded audio
Data Annotation                 Months 5–6    Fully annotated dataset
Model Fine-Tuning               Months 7–9    Fine-tuned Whisper ASR model
Model Evaluation                Month 10      WER analysis report
Deployment & Testing            Month 11      Finalized STT system
Documentation & Report Writing  Month 12      Final project

