Session 14 - Computational Linguistics

Uploaded by

ebonchill7.0.0

Applied Linguistics II

SEYED MOHAMMAD HOSSEINI

DEPT. OF ENGLISH
ARAK UNIVERSITY
Session 14
◦ Computational Linguistics
◦ Corpus Linguistics
Computational Linguistics
Definition
Goals
Methods
Topics in computational linguistics
Computational linguistics: Definition
“A branch of linguistics in which computational techniques
and concepts are applied to the elucidation of linguistic and
phonetic problems. Several research areas have developed,
including natural language processing, speech synthesis,
speech recognition, automatic translation, the making of
concordances, the testing of grammars, and the many areas
where statistical counts and analyses are required (e.g. in
literary textual studies).”
Crystal, D. 2008. A Dictionary of Linguistics and Phonetics. Blackwell Publishing.
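Of the research areas listed in this definition, the making of concordances is the easiest to illustrate in a few lines. The following is a minimal sketch, not a method taken from the cited dictionary: the function name `concordance`, the `width` parameter, and the sample sentence are all invented for illustration, and tokenization is a naive regular expression.

```python
import re

def concordance(text, keyword, width=3):
    """Return keyword-in-context (KWIC) lines for every hit of `keyword`.

    Each line shows up to `width` tokens on either side of the hit.
    Tokenization is deliberately naive: lowercase runs of word characters.
    """
    tokens = re.findall(r"\w+", text.lower())
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines

# Invented sample text, for illustration only.
sample = "The cat sat on the mat. The dog chased the cat around the garden."
for line in concordance(sample, "cat"):
    print(line)
```

Real concordancers add sorting by left or right context and work on indexed corpora, but the core idea is the same: show every occurrence of a word together with its immediate context.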
Computational linguistics: Definition
◦ “Computational linguistics is the scientific and engineering discipline
concerned with understanding written and spoken language from a
computational perspective, and building artifacts that usefully process and
produce language, either in bulk or in a dialogue setting.
◦ To the extent that language is a mirror of mind, a computational
understanding of language also provides insight into thinking and intelligence.
◦ And since language is our most natural and most versatile means of
communication, linguistically competent computers would greatly facilitate
our interaction with machines and software of all sorts, and put at our
fingertips, in ways that truly meet our needs, the vast textual and other
resources of the internet.”
Theoretical goals of computational linguistics
The theoretical goals of computational linguistics include:
◦ formulation of grammatical and semantic frameworks for characterizing
languages;
◦ the discovery of processing techniques and learning principles;
◦ the development of cognitively and neuroscientifically plausible
computational models of how language processing and learning might
occur in the brain.
Practical goals of computational linguistics
The practical goals of the field are broad and varied. Some of the most prominent
are:
◦ efficient text retrieval on some desired topic;
◦ effective machine translation (MT);
◦ question answering (QA);
◦ text summarization;
◦ analysis of texts or spoken language for topic, sentiment, or other psychological
attributes;
◦ dialogue agents for accomplishing particular tasks (purchases, technical
troubleshooting, trip planning, schedule maintenance, medical advising, etc.);
◦ creation of computational systems with human-like competency in dialogue, in
acquiring language, and in gaining knowledge from text.
Methods
The methods employed in theoretical and practical research
in computational linguistics have often drawn upon theories
and findings in
◦ theoretical linguistics
◦ philosophical logic
◦ cognitive science (especially psycholinguistics)
◦ computer science.
Topics in Computational Linguistics:
Syntax and parsing
◦ The structural hierarchy
◦ Syntax
◦ Parsing
◦ Coping with syntactic ambiguity
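Syntactic ambiguity can be made concrete by counting how many parse trees a grammar assigns to a sentence. The sketch below uses the standard CKY chart-parsing algorithm, which the slides do not name; the toy grammar, lexicon, and the function `count_parses` are hypothetical and chosen only so that the classic sentence "I saw the man with the telescope" receives two analyses (the prepositional phrase attaches either to the verb or to the noun).

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form (hypothetical, for illustration only).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP modifies the verb phrase
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # PP modifies the noun phrase
    ("PP", "P", "NP"),
]
LEXICON = {
    "i": ["NP"], "saw": ["V"], "the": ["Det"],
    "man": ["N"], "telescope": ["N"], "with": ["P"],
}

def count_parses(words, start="S"):
    """Count the parse trees rooted in `start` using a CKY chart."""
    n = len(words)
    # chart[i][j][X] = number of ways category X can span words[i:j]
    chart = [[defaultdict(int) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        for category in LEXICON.get(word.lower(), []):
            chart[i][i + 1][category] += 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right in BINARY_RULES:
                    ways = chart[i][k][left] * chart[k][j][right]
                    if ways:
                        chart[i][j][parent] += ways
    return chart[0][n][start]

print(count_parses("I saw the man with the telescope".split()))
```

A parser that only counts trees does not resolve the ambiguity; choosing between the analyses requires additional information, such as probabilities attached to the rules or semantic plausibility.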
Topics in Computational Linguistics:
Semantic representation
◦ Relating language to logic
◦ Thematic/case roles
◦ Expressivity issues
◦ Mapping syntactic trees to logical forms
◦ Coping with semantic ambiguity
Topics in Computational Linguistics:
Making sense of text
◦ Making sense of text
◦ Dealing with reference and various forms of “missing material”
◦ Making connections
◦ Dealing with figurative language
◦ Making sense of, and engaging in, dialogue
Topics in Computational Linguistics:
Acquiring knowledge for language
◦ Acquiring knowledge for language
◦ Knowledge extraction from text
◦ Crowdsourcing (soliciting verbally expressed information, or
annotations of such information, from large numbers of web
users)
Applications
◦ Machine translation
◦ Document retrieval and clustering applications
◦ Knowledge extraction and summarization
◦ Sentiment analysis
◦ Chatbots and companionable dialogue agents
◦ Virtual worlds, games, and interactive fiction
◦ Natural language user interfaces
◦ Text-based question answering
◦ Inferential (knowledge-based) question answering
◦ Voice-based web services and assistants
◦ Collaborative problem solvers and intelligent tutors
◦ Language-enabled robots
Source
https://plato.stanford.edu/entries/computational-linguistics/#DatFroEnd
Corpus (pl. corpora or corpuses)
a collection of naturally occurring samples of language which have
been collected and collated for easy access by researchers and
materials developers who want to know how words and other
linguistic items are actually used. A corpus may vary from a few
sentences to a set of written texts or recordings. In language analysis
corpuses usually consist of a relatively large, planned collection of
texts or parts of texts, stored and accessed by computer.
◦ Richards, J. C. and R. Schmidt, 2010. Longman Dictionary of Language
Teaching and Applied Linguistics. Longman.
Corpus
A collection of linguistic data, either written texts or a transcription of recorded
speech, which can be used as a starting point of linguistic description or as a
means of verifying hypotheses about a language.
Corpora provide the basis for one kind of computational linguistics. A computer
corpus is a large body of machine-readable texts. Increasingly large corpora
(especially of English) have been compiled since the 1980s, and are used both in
the development of natural language processing software and in such
applications as lexicography, speech recognition, and machine translation.
◦ Crystal, D. 2008. A Dictionary of Linguistics and Phonetics. Blackwell Publishing.
Types of Corpora
A corpus is designed to represent different types of language use, e.g. casual conversation,
business letters, ESP texts. A number of different types of corpuses may be distinguished, for
example:
1 specialized corpus: a corpus of texts of a particular type, such as academic articles, student
writing, etc.
2 general corpus or reference corpus: a large collection of many different types of texts, often
used to produce reference materials for language learning (e.g. dictionaries) or used as a
baseline for comparison with specialized corpora
3 comparable corpora: two or more corpora in different languages or language varieties
containing the same kinds and amounts of texts, to enable differences or equivalences to be
compared
4 learner corpus: a collection of texts or language samples produced by language learners
◦ Richards, J. C. and R. Schmidt, 2010. Longman Dictionary of Language Teaching and Applied Linguistics.
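Using a reference corpus as a baseline usually means comparing normalized frequencies, because the two corpora differ in size. The following is a minimal sketch under that assumption; the function `per_million` and both toy texts are invented for illustration and stand in for real corpora.

```python
import re
from collections import Counter

def per_million(text, word):
    """Frequency of `word` per million tokens of `text` (naive tokenization)."""
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return 0.0
    return Counter(tokens)[word.lower()] / len(tokens) * 1_000_000

# Invented stand-ins for a specialized corpus and a reference corpus.
specialized = "the results suggest that the hypothesis is supported by the results"
reference = "we went to the park and then we went home to eat and rest"

for name, text in [("specialized", specialized), ("reference", reference)]:
    print(name, round(per_million(text, "results")))
```

With real corpora the same normalization lets a researcher say that a word is over-represented in, for example, academic articles relative to general usage.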
Corpus Linguistics
“an approach to investigating language structure and use
through the analysis of large databases of real language
examples stored on computer.”
◦ Richards, J. C. and R. Schmidt, 2010. Longman Dictionary of
Language Teaching and Applied Linguistics.
How can corpora help?
“Issues amenable to corpus linguistics include
◦ the meanings of words across registers,
◦ the distribution and function of grammatical forms and categories,
◦ the investigation of lexico-grammatical associations (associations
of specific words with particular grammatical constructions),
◦ the study of discourse characteristics, register variation, and
◦ (when learner corpora are available) issues in language acquisition
and development.”
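Associations between words are often quantified with a co-occurrence statistic. The sketch below uses pointwise mutual information (PMI) over adjacent word pairs, which is one common choice and is not prescribed by the source quoted above; the function `pmi` and the toy token list are invented for illustration.

```python
import math
from collections import Counter

def pmi(tokens, w1, w2):
    """Pointwise mutual information of the adjacent pair (w1, w2), in bits.

    Returns negative infinity when the pair never occurs.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    if bigrams[(w1, w2)] == 0:
        return float("-inf")
    p_pair = bigrams[(w1, w2)] / (len(tokens) - 1)
    p1 = unigrams[w1] / len(tokens)
    p2 = unigrams[w2] / len(tokens)
    return math.log2(p_pair / (p1 * p2))

# Invented toy data, for illustration only.
tokens = "strong tea and strong coffee but powerful computers".split()
print(pmi(tokens, "strong", "tea"))
print(pmi(tokens, "powerful", "tea"))
```

A positive score means the pair occurs together more often than its parts would predict by chance. On a corpus this small the numbers are meaningless; the statistic only becomes informative with large samples.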
Why corpora?
• Objective verification of results
• Corpora show how people really use the language; they do not rely on invented, idealized
examples
• Quantitative data show what occurs frequently and what occurs rarely in the language
• Thanks to information technology, we can conduct fast, complex studies and process far more
material than is possible by hand
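The quantitative point can be sketched as a simple frequency list. This is a minimal illustration, assuming naive regular-expression tokenization; the function `frequency_list` and the sample sentence are invented, and the sketch also reports words that occur only once (hapax legomena) to show the rare end of the distribution.

```python
import re
from collections import Counter

def frequency_list(text, top=5):
    """Return the `top` most frequent tokens and the tokens that occur once."""
    counts = Counter(re.findall(r"\w+", text.lower()))
    frequent = counts.most_common(top)
    rare = sorted(word for word, n in counts.items() if n == 1)
    return frequent, rare

# Invented sample text, for illustration only.
frequent, rare = frequency_list("the cat and the dog and the bird", top=2)
print(frequent)
print(rare)
```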
Criticism
Linguistic descriptions which are ‘corpus restricted’ have been the subject of
criticism, especially by generative grammarians, who point to the limitations of
corpora (e.g. that they are samples of performance only, and that one still needs
a means of projecting beyond the corpus to the language as a whole).
In fieldwork on a new language, or in historical study, it may be very difficult to
get beyond one’s corpus (i.e. it is a ‘closed’ as opposed to an ‘extendable’
corpus), but in languages where linguists have regular access to native-speakers
(and may be native-speakers themselves) their approach will invariably be
‘corpus-based’, rather than corpus-restricted.
◦ Crystal, D. 2008. A Dictionary of Linguistics and Phonetics. Blackwell Publishing.
Some English Corpora
Corpus of Contemporary American English (COCA): The corpus is composed of more than 1 billion
words from 220,225 texts, including 20 million words from each of the years 1990 through 2017.
https://www.english-corpora.org/coca/
The Corpus of Historical American English (COHA) is the largest structured corpus of historical
English. COHA contains more than 475 million words of text from the 1820s-2010s and the corpus
is balanced by genre decade by decade.
https://www.english-corpora.org/coha/
British National Corpus (BNC): The BNC was originally created by Oxford University Press
in the 1980s and early 1990s, and it contains 100 million words of text from a wide
range of genres (e.g. spoken, fiction, magazines, newspapers, and academic).
https://www.english-corpora.org/bnc/
Some Persian Corpora
Persian Text Corpus (Dr. Bijankhan)
https://www.peykaregan.ir/dataset/%D9%BE%DB%8C%DA%A9%D8%B1%D9%87-%D9%85%D8%AA%D9%86%DB%8C-%D8%B2%D8%A8%D8%A7%D9%86-%D9%81%D8%A7%D8%B1%D8%B3%DB%8C
Persian Language Database (Dr. Mostafa Assi)
http://pldb.ihcs.ac.ir/
Academy of Persian Language and Literature
https://dadegan.apll.ir/
