Annotation Guidelines 2021.6.15
Objective
Steps
Transcription Requirements
US Spelling
Casing
Titles/Honorifics
Punctuation
Numbers
Times
Contractions
Truncations
OK vs. okay
Spacing
Spelling Resources
Dictionaries
Pronunciation Assessment Annotation Guidelines
Objective
Correct errors in the reference text and annotate words to highlight mispronunciations.
Steps
Reference text will be presented for each hit in the ‘Transcription’ box. Your job is to:
1. Listen to the audio and correct any word errors in the reference text
3. Tag each mispronounced word with an <M> tag. You should place the tag after the
mispronounced word.
Pronunciation Tags
The M tag that you select (M1 through M5) depends on the severity of the mispronunciation.
The higher the number, the more severe the mispronunciation.
If the spoken word is unintelligible and it is therefore impossible to accurately transcribe the word,
you should retain the word in the reference text (even if clearly unsuitable) and follow it with the
<M5> tag. In this example, the reference has wrongly identified “many” as the unintelligible word:
If the audio is unintelligible but the reference text does not contain a word or words that can be
used, please make a best guess and tag the word(s) with <M5>.
Do not overuse the <M5> tag. It should only be used if a word is unintelligible or has been cut off
(see False Starts/Truncation guidance below).
Mispronounced words should be transcribed using their standard spelling. Do not try to capture
the way a word was spoken by modifying its normal spelling.
You can select <M> tags either by right-clicking to bring up the dropdown menu or
by using the function keys as shortcuts: F1 = <M1>, F2 = <M2>, etc.
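The tagging convention in this section (a tag from <M1> to <M5>, placed after the word it marks) can be checked mechanically. The following is a minimal sketch only, not part of the annotation tool: the function name parse_tagged is hypothetical, and it assumes that tags are separated from words by spaces and that each word carries at most one tag.

```python
import re

# Assumed tag format: <M1> through <M5>, written as a separate token
# immediately after the word it marks.
TAG_RE = re.compile(r"<M([1-5])>")


def parse_tagged(transcription):
    """Return (word, severity) pairs; severity is None for untagged words."""
    pairs = []
    for token in transcription.split():
        match = TAG_RE.fullmatch(token)
        if match:
            # A tag must follow a word that has not already been tagged.
            if not pairs or pairs[-1][1] is not None:
                raise ValueError("tag without a preceding untagged word: " + token)
            word, _ = pairs[-1]
            pairs[-1] = (word, int(match.group(1)))
        elif token.startswith("<"):
            # Anything else that looks like a tag (e.g. <M6>) is rejected.
            raise ValueError("unknown tag: " + token)
        else:
            pairs.append((token, None))
    return pairs
```

For instance, parse_tagged("there are many <M5> apples") pairs the word "many" with severity 5 and leaves the other words untagged, while a transcription that starts with a tag raises an error.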
Transcription Requirements
US Spelling
Casing
Capital letters should be used to begin proper nouns. This includes names, places,
products, days of the week, months, etc. Capitals are also used for the personal pronoun
“I”, and for initialisms/acronyms/spelled-out letters (see guidance below).
Examples:
A capital letter is only needed at the start of a transcription if any of the above applies.
Initialisms/Acronyms/Spelled-out Letters
Common initialisms such as CD, DVD, and PC should be transcribed in upper case
with no spaces between the letters. Plurals take an upper case ‘S’ with no
space, e.g. CDS, DVDS, and PCS.
Acronyms that are pronounced as words should be transcribed in upper case
with no spaces, e.g. NASA and NATO.
Spelled-out words should be transcribed in upper case with a space between each
letter, e.g. “the correct spelling for her name is G R A C I E”.
Titles/Honorifics
Titles should not be abbreviated. “Mr” should be “mister”, “Dr” should be “doctor”, “Mrs” should
be “missus” etc. When used before a name as an honorific, the first letter should be capitalized.
When used without the proper noun, it should not.
Examples:
Punctuation
Examples:
Numbers
All numbers should be spelled out in full, as they are said. For instance, if a speaker says "104" as
"one oh four", this is what should be transcribed. Numbers should not be hyphenated.
Examples:
Times
Times should be spelled out in full. If verbalized, “a.m.” and “p.m.” should be spelled as such.
Examples:
Contractions
Contractions that are common should be written as one word. The following is a list of common
contraction endings. If it is unclear whether the second word is contracted, transcribe each word
fully. For example, if it is unclear whether the speech is "he's" or "he is", transcribe it as the latter.
Symbols/Special Characters
Examples:
• I picked it up for $150 –> I picked it up for one hundred and fifty dollars
• we made a saving of 50% –> we made a saving of fifty percent
• 2 + 2 = 4 –> two plus two equals four
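The examples above, together with the rule that numbers are spelled out as said, suggest a simple self-check before submitting a transcription. The sketch below is illustrative only: the function name find_unspelled is hypothetical, and the set of flagged characters (digits, $, %, + and =) is an assumption drawn from these three examples rather than a complete list. Tokens that are <M1> to <M5> tags are skipped so that the digit inside a tag is not flagged.

```python
import re

# Assumed set of characters that should be written out in words;
# taken from the examples above, not an exhaustive list.
DISALLOWED = re.compile(r"[0-9$%+=]")

# <M1>..<M5> tags legitimately contain a digit, so they are skipped.
TAG = re.compile(r"<M[1-5]>")


def find_unspelled(transcription):
    """Return the tokens that still contain digits or symbols."""
    return [
        token
        for token in transcription.split()
        if not TAG.fullmatch(token) and DISALLOWED.search(token)
    ]
```

An empty result means no digits or listed symbols remain. It does not confirm that the words match what the speaker actually said.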
Informal Words
Examples:
Filler Words
Filler words such as “um” and “uh” should be transcribed if they are obvious and
prominent. Ignore filler words if they are not obvious.
False Starts
If you encounter a false start or multiple false starts, transcribe the false start as a complete word
and use the <M5> tag to highlight that the word was not heard in full.
Example:
Ignore minor noises/vocal stumbles that are not identifiable as a false start.
Truncations
If a word is cut off at the end of the audio but the word that the speaker intended to say is obvious,
transcribe the word using standard spelling and use the <M5> tag to highlight that the word was
not heard in full.
Example:
OK vs. okay
To keep the team’s transcriptions consistent, always use “OK”.
Spacing
There should be a single space between words and before and after <M> tags.
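The spacing rule can be expressed as a small normalization step. This is a sketch under stated assumptions, not a feature of the annotation tool: the function name normalize_spacing is hypothetical, and it assumes that a tag at the very end of a transcription needs no trailing space.

```python
import re


def normalize_spacing(transcription):
    """Put single spaces between words and around <M1>..<M5> tags."""
    # Surround every tag with spaces, even if it was typed against a word.
    spaced = re.sub(r"\s*(<M[1-5]>)\s*", r" \1 ", transcription)
    # Collapse runs of whitespace and trim both ends.
    return " ".join(spaced.split())
```

For example, normalize_spacing("hello<M2>world  again") returns "hello <M2> world again".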
Side Searching
Make use of side searching to help with understanding context or identifying names of places,
people, trademarks, etc.
Transcription Tagging
No transcription tags are used for this project. The <M> tag is the only tag that you will use.
Spelling Resources
Dictionaries
Lexico https://www.lexico.com
Merriam https://www.merriam-webster.com/
Dictionary.com https://www.dictionary.com/