Speech Analysis in Forensic Science Mine
CONTENT
1. Definition
2. History
3. Forensic phonetics
4. Factors affecting individual’s speech
5. Forensic phonetics importance
6. Issues and solutions
7. Phonetic parameters
8. Speaker recognition
High / low
Verification / identification
Naïve / technical
9. Voice identification
Subjective / objective
10. Speech recognition
11. Voice print identification
12. Truth verification
13. Earwitness
14. Application
15. Articles
The voice is the very emblem of the speaker, indelibly woven into the fabric of
speech. In this sense, each of our utterances of spoken language carries not only
its own message; through one's accent, tone of voice and habitual voice
quality, it is also an audible declaration of our membership of a particular
regional group, of our individual physical and psychological identity, and of our
momentary mood. Thus, the voice of an individual is said to have its own
characteristics and a distinct, distinguishable quality.
HISTORY
Early history:
Saslove and Yarmey (1980) quote a translation of writings by the pre-Socratic
philosopher Heraclitus, who writes “Eyes and ears are bad witnesses for men since
their souls lack understanding”. The Roman rhetorician Quintilian (1899), however,
argues that ear witnesses are helpful for speaker identification and writes “The voice
of the speaker is as easily distinguished by the ear as the face is by the eye”. Hoffman
(1940) quotes the view that a good speaker must be a good man. It appears that speaker
identification has existed as a recognized entity since the time people began
writing down their opinions about human behaviors and capabilities.
Semi-modern times:
Aural-perceptual testimony can be traced back as far as the year 1660, when
voice identification was offered in the case of William Hulet in Great Britain. Yarmey
(1995) commented on a quotation from Jeremy Bentham, who said ‘witnesses are the eyes
and ears of justice’, and points out that sometimes these witnesses are accurate,
complete and trustworthy; sometimes they are not. Before the 19th century ended, there
was even talk about whether or not voices could be recognized over the telephone.
20th century: In 1932, the son of Charles Lindbergh, an international hero, was
kidnapped and later found murdered. Lindbergh identified the kidnapper by his voice
after 2 years. During the period of the kidnapping, Lindbergh had heard the
kidnapper's voice twice: once over the telephone and again in person.
In 1944, word of the bomb attempt on Hitler's life crackled around the world, and
people were anxious to know whether Hitler had been killed or was essentially unharmed.
British and/or U.S. agents within Germany gave a contract to Dr Mack Steer to compare
tapped telephone voices, in his laboratory, with speeches given earlier by Hitler.
Using aural-perceptual procedures, Steer and his associates identified a voice similar
to that in Hitler's speeches and concluded that Adolf Hitler was alive; later
intelligence proved them correct. Gray and Kopp (1944) wrote an in-house report
entitled “Voiceprint Identification”. During World War II, the Germans captured huge
numbers of loyal Soviet soldiers and civilians in their drive towards the cities of
Moscow, Leningrad and Stalingrad, and after the war was over many of these prisoners
were returned to the USSR.
At this time, Stalin wanted to know who among this mass of people had been loyal and
who had not, and how he could tell which was which. The returnees were put in jail,
split into groups and assigned to a number of projects. One of the groups was asked to
create a workable speaker identification procedure (Solzhenitsyn, 1968). When they
were assigned to determine which of 5 suspects had committed a crime against the
state, they could eliminate 3 of the 5, but they could not differentiate between the
remaining 2 because of their similar speech characteristics.
During the 20th century, a trend developed in which the police started applying any
procedure they could get their hands on. It was during this period that personnel
employed by many of these agencies began to assume that earwitness line-ups (or voice
parades, in which a person who had heard, but not seen, some individual attempts to
identify them by listening to their voice) would be just as effective as visual
identification. Unfortunately, they did not realize that there are substantial
differences between the 2 approaches. For example, they were not aware that memory for
heard acoustic signals can be quite variable and is not the same as visual memory.
Thus, police departments varied widely in how they developed earwitness identification
and hence its use tended not to be particularly rewarding. These concerns exist even
to the present day.
Forensic phonetics is the use of the principles of phonetics for legal purposes, and the
extension of phonetic research to investigations relevant to legal situations.
Forensic phonetics can be defined as a professional speciality based on the utilization
of current knowledge about the communicative processes, including the development of
specialized techniques and procedures for the purpose of meeting certain needs of
legal groups and law enforcement agencies.
Forensic phonetics consists of 2 general areas:
i. The first area involves the electro-acoustical analysis of speech and voice signals
that have been transmitted and stored. It focuses on problems such as the proper
transmission and storage of spoken exchanges, the authentication of tape
recordings, the enhancement of speech on tape recordings, speech decoding etc.
ii. The second major area involves the analysis (both physical and perceptual) of
communicative behaviors. That is, it involves issues such as the identification of
speakers, the process of obtaining information about the speaker's physical or
psychological state, and the analysis of speech.
Corsi (1979) described a person's voice as a complex signal which encodes various kinds
of information, among them the anatomy and physiology of the speaker.
HOW DOES ONE INDIVIDUAL’S SPEECH DIFFER FROM THAT OF
ANOTHER?
Organic vs learned differences
The human vocal apparatus varies in size and shape, so resonant frequencies and the
rate of vocal fold vibration differ between individuals; an individual's physique
therefore clearly influences how that individual sounds. But speech is more than a
physical event. A child learns more than a language: there is marked variety in
pronunciation, and this learning extends up to early adulthood. The organic/learned
dichotomy fails to show the complexity of speaker individuality. In recorded speech,
the effects of organic differences are convolved with the effects of what the speaker
has learnt, in terms of the linguistic system and the choices made from it at the
given moment. States of ill health that affect the vocal organs, from minor colds
through to cancer of the larynx, change the sound. In the shorter term, stress,
fatigue and intoxication also have an effect. Sociolinguistic variation in speech
cannot be ignored in work on speaker identification.
Individuality in the physical mechanism: speech as anatomy made audible
Speech analysis generally adopts the source-filter theory of speech production, which
means that the larynx is the source of acoustic energy and the supralaryngeal vocal
tract is the filter or resonator which shapes that energy. The mass and length of the
vocal folds determine the range of their rate of vibration and also affect the shape
of the glottal source wave. Anatomical irregularities (nodules, edema) may introduce
irregularities in the acoustic signal.
Due to variation in vocal tract shape and length, clear distinctions between male,
female and child voices can be exhibited. Laryngeal vibration is not the only source of
acoustic energy in speech; the energy for fricatives is generated by air turbulence.
The precise acoustic properties of this source depend upon the shape and size of the
teeth. This has been very useful in work on speaker identification.
The vocal tract filter, consisting of the pharyngeal and oral cavities, can be
augmented by opening the velic port and coupling in the resonance of the nasal cavity.
Nasal sounds have often attracted attention as cues to speaker identity because the
shape and size of the nasal cavity are highly variable between speakers.
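The link between vocal tract length and resonant frequencies can be illustrated with a
rough sketch. It assumes the textbook idealisation of a neutral vocal tract as a uniform
tube closed at the glottis and open at the lips, whose resonances are
F_n = (2n - 1) c / (4L). The speed of sound and the two tract lengths are illustrative
assumptions, not measurements from the sources cited here.

```python
# Quarter-wavelength resonances of a uniform tube closed at one end:
# an idealised neutral vocal tract, not a model of any real speaker.
SPEED_OF_SOUND_CM_S = 35000.0  # assumed speed of sound in warm, moist air


def tube_resonances(length_cm, count=3):
    """Return the first `count` resonances in Hz: F_n = (2n - 1) * c / (4 * L)."""
    return [(2 * n - 1) * SPEED_OF_SOUND_CM_S / (4.0 * length_cm)
            for n in range(1, count + 1)]


# Hypothetical tract lengths chosen only to show the direction of the effect.
for label, length in [("longer tract, 17.5 cm", 17.5), ("shorter tract, 14 cm", 14.0)]:
    print(label, tube_resonances(length))
```

Under these assumptions the shorter tube has uniformly higher resonances, which is the
direction of the difference between adult male, adult female and child voices.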
Individuality in the linguistic mechanism: speech as behavior
An act of speaking usually conveys more than a bare message. What an individual tends
to think of as the message will be accompanied by signals that indicate the attitude of
the speaker, help regulate the flow of the conversation, reinforce a social
relationship between participants in an interaction, and present the speaker's
self-image. To the extent that all of this is conveyed, the linguistic mechanism is
being exploited. Most obviously, an individual has an accent. This means that their
speech allows them to be identified with a group. Accents are differentiated along
parameters which the speaker can control, specifically segmental phonology, prosody,
and aspects of voice quality. Similarly, on the geographical dimension, dialectologists
have traditionally delighted in separating smaller and smaller groups of speakers on
the basis of pronunciation.
Individuality in phonetic implementation
There may be tight constraints imposed by a language on the auditory effects associated
with a consonant and with a following vowel, but the transition between the 2
articulations, and hence the co-articulation of the 2 segments, may allow for
variation. The tongue position for the vowel might be adopted relatively early, making
the consonant's secondary articulation highly dependent on the quality of the vowel, or
relatively late, producing more of a transition in the early part of the vowel. Such
between-speaker variation was demonstrated by Su, Li and Fu (1974) for nasal plus vowel
and by Nolan (1983) for lateral plus vowel sequences.
Speakers outside the normal range
Virtually by definition, the speakers who are most distinctive are those who lie
outside the normal range. These are usually people who have speech problems of varying
degrees of severity or whose command of a language is non-native. Speech phenomena
outside the normal range are, however, highly valuable in speaker identification. If,
for instance, acoustic measurement and phonetic analysis of accent lead to the
conclusion that 2 recordings are of the same male speaker and that there is no more
than a 10% chance that they were from different speakers, the presence of a similar
stutter in both recordings could reduce that to less than 0.2%. This is because
roughly one man in 50 is a stutterer.
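The arithmetic behind the stutter example can be written out as a minimal sketch. It
assumes that stuttering is statistically independent of the accent and acoustic
evidence, an assumption the text does not state explicitly; the variable names are
illustrative.

```python
# Figures quoted in the text.
p_different_speakers = 0.10   # upper bound from accent and acoustic analysis
p_male_stutters = 1 / 50      # rough prevalence of stuttering among men

# Assumption: the stutter is independent of the other evidence, so the chance
# that a *different* speaker also happens to stutter simply multiplies in.
p_after_stutter = p_different_speakers * p_male_stutters
print(f"{p_after_stutter:.1%}")
```

Because the 10% figure is an upper bound ("no more than"), the product of 0.2% is an
upper bound as well.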
Between-speaker & within-speaker variation
Variation between speakers is generally larger than variation within a speaker. The
greater the ratio of between- to within-speaker variation, the easier the identification.
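One common way to put a number on this ratio is an F-ratio: the variance of the speaker
means divided by the average within-speaker variance. The sketch below is only an
illustration; the function name `f_ratio` and the F0 values are hypothetical and do not
come from the studies cited in these notes.

```python
from statistics import mean, pvariance


def f_ratio(samples_by_speaker):
    """Variance of the speaker means divided by the mean within-speaker variance."""
    speaker_means = [mean(values) for values in samples_by_speaker.values()]
    between = pvariance(speaker_means)
    within = mean([pvariance(values) for values in samples_by_speaker.values()])
    return between / within


# Hypothetical mean-F0 measurements (Hz), three recordings per speaker.
f0 = {
    "A": [100.0, 104.0, 108.0],
    "B": [120.0, 124.0, 128.0],
    "C": [140.0, 144.0, 148.0],
}
print(f_ratio(f0))
```

A parameter with a high ratio separates speakers well; a ratio near or below 1 means a
speaker varies as much from occasion to occasion as speakers differ from each other.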
Multidimensionality
Voices are multidimensional objects. A speaker's voice is potentially
characterisable in an exceedingly large number of different dimensions.
Reduction in dimensionality
Since the ability to discriminate voices is clearly a function of the available
dimensions, any reduction in the number of available dimensions constitutes a
limitation. In addition to reduction, distortion is a real-world problem that
occurs commonly in telephone transmission (Rose & Simmons, 1996; Kunzel, 2001).
IMPORTANCE
Provides support just as it does for clinical areas of speech and voice.
Can be used to assist military, industrial and security organizations.
The forensic phonetics interface is primarily with the criminal justice and judicial
systems.
Speaker identification.
Transcription or content identification (phonetic expertise is used to
create a transcript of a recording, to give evidence as to the reliability
of such a transcript, or to determine what was said when recordings are of
bad quality, or when the voice is pathological or has a foreign accent).
Speaker profiling (in the absence of a suspect, saying something about
the regional or socioeconomic accent of the offender's voice(s)).
The construction of voice line-ups and tape authentication (determining
whether a tape has been tampered with).
Language or accent identification (phonetic expertise is used to give
evidence as to the likely place of origin of a particular speaker).
Analysis of forensic evidence is used in the investigation and prosecution
of civil and criminal proceedings. Often, it can help to establish the guilt
or innocence of possible suspects.
Forensic evidence is also used to link crimes that are thought to be related
to one another. Linking crimes helps law enforcement authorities to narrow
the range of possible suspects and to establish patterns of crimes, which
are useful in identifying and prosecuting suspects.
Forensic scientists also work on developing new techniques and
procedures for the collection and analysis of evidence. In this manner, new
technology can be used
Phoneticians and speech scientists have a major role in contributing to these
areas.
It is important for phoneticians and speech scientists to be closely aware of
the practices in their countries' legal systems.
The tape recordings generated for law enforcement purposes are rarely high fidelity;
indeed, they are often of rather limited quality. The two main sources of difficulty
are distortion and noise. Both result primarily from the use of inadequate equipment,
poor recording techniques, or events occurring within the acoustic environment in
which the tape recordings were made.
The degradation of the speech intelligibility by and within equipment can result from
3. REMEDIES
First, it should be acknowledged that a number of scientists and engineers have been
developing fairly sophisticated machine techniques designed to reconstruct degraded
speech. They employ approaches such as bandwidth compression, cross-channel
correlation, mean least squares analysis, all-pole models, group delay functions, linear
adaptive filtering, linear predictive coefficients, cepstrum techniques and
deconvolution.
Linguistic vs non-linguistic:
There are 2 principal methods of short-term spectral analysis in use: filter
bank analysis and LPC (linear predictive coding) analysis.
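As an illustration of the LPC side, the sketch below computes predictor coefficients
from a frame's autocorrelation with the Levinson-Durbin recursion, which is the standard
way of solving the LPC normal equations. It is a teaching sketch, not the procedure of
any study cited here; the function names and the idealised autocorrelation sequence in
the demo are assumptions.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one analysis frame (plain sums)."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns (a, error) with a[0] == 1.

    The predictor is x[n] ~ -(a[1]*x[n-1] + ... + a[order]*x[n-order]).
    """
    a = [1.0] + [0.0] * order
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / error                  # reflection coefficient for stage i
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        error *= 1.0 - k * k              # remaining prediction error energy
    return a, error


# Idealised autocorrelation of a first-order process with its pole at 0.5
# (an assumption chosen so that the answer is easy to check by hand).
coeffs, err = levinson_durbin([1.0, 0.5, 0.25], order=2)
print(coeffs, err)
```

In this toy case the first coefficient recovers the pole (a[1] = -0.5) and the second
is zero, because a first-order model already explains the sequence. On real speech the
order is typically chosen high enough to model several formants.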
Other measurements that are often carried out are correlated with prosody, such
as pitch and energy tracking. A pitch or periodicity measurement is relatively
easy to make but is meaningful only for voiced speech sounds, so it is necessary
to have a detector that can discriminate voiced from unvoiced sounds. This
complication often makes it difficult to obtain reliable pitch tracks over
long-duration utterances. LTAS (long-term average spectrum) and FF (fundamental
frequency) measurements have been used in the past for speaker recognition, but
since these measurements provide feature averages over long durations they are
not capable of resolving detailed individual differences.
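The dependence of pitch measurement on a voiced/unvoiced decision can be sketched with a
simple autocorrelation pitch estimator. This is one of several possible methods and is
not claimed to be the one used in the work cited. The search range of 60 to 400 Hz, the
voicing threshold of 0.3 and the synthetic test signals are all assumptions made for the
illustration.

```python
import math
import random


def estimate_f0(frame, sample_rate, f0_min=60.0, f0_max=400.0,
                voicing_threshold=0.3):
    """Return the estimated F0 in Hz, or None if the frame is judged unvoiced."""
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return None
    lag_min = int(sample_rate / f0_max)
    lag_max = int(sample_rate / f0_min)
    best_lag, best_value = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Autocorrelation at this lag, normalised by the frame energy.
        value = sum(frame[i] * frame[i + lag]
                    for i in range(len(frame) - lag)) / energy
        if value > best_value:
            best_lag, best_value = lag, value
    # Voicing decision: a weak autocorrelation peak means no clear periodicity.
    if best_lag is None or best_value < voicing_threshold:
        return None
    return sample_rate / best_lag


rate = 8000
# Synthetic stand-ins: a 125 Hz sine for a voiced frame, noise for an unvoiced one.
voiced = [math.sin(2 * math.pi * 125 * n / rate) for n in range(400)]
random.seed(0)
unvoiced = [random.uniform(-1.0, 1.0) for _ in range(400)]
print(estimate_f0(voiced, rate), estimate_f0(unvoiced, rate))
```

Real speech is far less tidy than these signals, which is why the voicing decision is
the fragile part of a pitch tracker over long utterances.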
Nolan (1997) explained that there are two classes of speaker recognition task:
identification and verification.
Speaker identification: This includes the usual forensic situation, although
there the circumstances are in many ways rather more difficult than in the
standard use of the term. Speaker identification is often thought of as
differing from speaker verification because the task may be to match an unknown
sample to one of a closed set of suspects, so that all that is needed is to
determine which suspect's speech is nearest to the unknown speaker's sample.
Speaker identification is the task in which an utterance from an unknown speaker
has to be attributed, or not, to one of a population of known speakers for whom
reference samples are available. Speaker identification is mainly divided into
3 types, which are as follows:
Lass (1976) reported that speaker sex identification judgments were 96% correct
for the voiced tape, 91% correct for the filtered tape, and 75% correct for the
whispered tape. These findings indicate that laryngeal fundamental frequency
appears to be a more important acoustic cue in speaker sex identification tasks
than the resonance characteristics of the speaker. Results of the experiment
conducted by Reich (1979) suggest that certain vocal disguises markedly
interfere with speaker identification by listening. The reduction in speaker
identification performance caused by vocal disguise ranged from 22.0% (slow
rate) to 32.9% (nasal) for naïve listeners and from 11.3% (hoarse) to 20.3%
(nasal) for sophisticated listeners. Nasal disguise was the most effective.
Speaker identification by listening only, one of the methods discussed, is far
from being 100% accurate. It is an entirely subjective method; an expert witness
using only this method would be unable to justify his conclusions in a court of law.
2. Speaker identification by machine:
In the automatic method, the computer does all the work and the
participation of the examiner is minimal. For the purpose of automatic
identification, special algorithms are used which differ based on the phonetic
context. This method is used very often in forensic sciences, but factors such as
noise and distortion in the voice and other samples need to be controlled. In
such cases a combination of subjective and objective methods should be used.
Under speaker identification, three types of recognition tests can be
carried out (Tosi, 1979):
- closed tests,
- open tests,
- discrimination tests
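The difference between the three test types can be sketched as three decision rules over
the same distance measure. This is a schematic illustration only: the Euclidean
distance, the two-dimensional features, the thresholds and the speaker labels are all
hypothetical, and real systems use much richer features and statistical models.

```python
def distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def closed_test(unknown, known):
    """Closed test: the unknown speaker is in the set, so pick the nearest."""
    return min(known, key=lambda name: distance(unknown, known[name]))


def open_test(unknown, known, threshold):
    """Open test: the nearest speaker is accepted only if close enough."""
    best = closed_test(unknown, known)
    return best if distance(unknown, known[best]) <= threshold else None


def discrimination_test(sample_a, sample_b, threshold):
    """Discrimination test: decide whether two samples share a speaker."""
    return distance(sample_a, sample_b) <= threshold


# Hypothetical features per speaker, e.g. mean F0 and mean F2 in Hz.
known = {"S1": [110.0, 1400.0], "S2": [125.0, 1550.0], "S3": [140.0, 1300.0]}
unknown = [123.0, 1540.0]
print(closed_test(unknown, known))
print(open_test(unknown, known, threshold=5.0))
print(discrimination_test(unknown, known["S2"], threshold=15.0))
```

The closed test always returns some speaker, which is why it is the easiest of the
three; the open and discrimination tests can also return "none of these" and therefore
depend on where the threshold is set.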
Bolt (1970) reported that speech spectrograms, when used for voice
identification, are not analogous to fingerprints, primarily because of fundamental
differences in the source of the patterns and differences in their interpretation.
Studies to assess the reliability of voice identification under practical
conditions, whether by experts or by explicit procedures, had not yet been made,
and the requirements for such studies were outlined. Hecker (1971) reported that
speaker recognition by visual comparison of spectrograms was coming into use in
criminology, but that the validity of this method was still in question.
Reich (1976) reported that examiners were able to match speakers with a
moderate degree of accuracy (56.67%) when there was no attempt to vocally
disguise either utterance. In spectrographic speaker identification, nasal and slow
rate were the least effective disguises, while free disguise was the most effective.
Naïve speaker recognition recognizes speakers by their voices where “normal
every day abilities” are used (Nolan, 1983). It is performed by untrained
observers in real-life conditions, whether in the course of normal everyday
life, for instance while answering a telephone or hearing a voice in the next
room, or in the more dramatic circumstances of a crime.
Rose (2002) reported that one major difference between automatic speaker
verification/identification and forensic speaker identification is that in
verification and identification the set of speakers that constitutes the reference
sample is known, and therefore the acoustic properties of their speech are
known. In forensic speaker identification, the reference set is not known, and
consequently the acoustic properties of its speakers can only be estimated
(Broeders, 1995). Another difference is in the degree of control that can be
exercised over the samples to be compared. A high degree of control means a
high degree of comparability, which is conducive to efficient recognition (as
in the case of automatic speaker identification and verification).
Itoh (1992) found that the vocal tract spectral envelope had the largest effect
on speaker identification.
Lavner, Gath and Rosenhouse (2000) suggested that, on average, the contribution
of the vocal tract (VT) features to familiar speaker identification for the
vowel /a/ is more important than that of the glottal source features. Their
results also suggest that for each speaker a different group of acoustic
features serves as the cue to vocal identity.
The speech signal carries information from many sources, but not all of it
is relevant or important for speaker recognition. In speaker recognition, the 1st
crucial step is feature extraction, where the speech signal of a given frame is
converted to a set of acoustic features in the hope that these features will
encapsulate the information that is necessary for recognition. Once these
features are computed, a back-end classifier is used to classify the input
speech signal (in speech recognition, into a sequence of words) in light of the
extracted features and pre-trained models.
Reich (1981) reported that people can usually tell when a speaker is
attempting voice disguise. Indeed, it is very difficult to consistently disguise
one's voice over long periods of time. If the sample is short, the problem in
detecting disguise is severe. If the sample is long, there may be ways to
identify which parts are normal and which are not.
VOICE IDENTIFICATION
There are subjective and objective methods of voice identification. The
subjective procedures are based on either aural or visual comparisons of signals,
while in objective procedures a computer usually compares representations of
the audio signal from 1 or more speakers. There are 2 methods of examination:
Aural
Spectrographic
In Aural examination of recorded voice, a listener may use LTM process or the
McGehee (1937) studied memory decay for voices and found that correct
identifications declined over time. She also reported that male auditors
could be expected to perform at levels better than those for women. Bull and
Clifford (1984), in contrast, reported that females performed better than males
in a speaker identification task.
Another important and well-documented fact is that some voices are identified
better than others (Papcun, Kreiman and Davis, 1989; Rose and Duncan, 1995),
and it can therefore be assumed that some voices carry more individual
identifying content than others. It may also be the case that a voice is badly
identified because it has a wide range of variation that takes it into the ranges of
other voices (Rose and Duncan, 1995).
2) Open vs closed tests: in open tests the observer was not told whether the
unknown speaker was represented in the known set, but in closed tests he
was told that the unknown speaker was represented in the set.
3) The context of the speech materials: whether the test words were spoken
in isolation or in sentences.
Three main types of effect arise in telephone speech. They are:
Environmental effects, which arise from the physical setting in which a
telephone call is made, e.g. a high degree of background noise such as traffic.
Recordings of telephone interactions will of course include sounds resulting
from any background activity, which may pose problems for forensic analysis
of speech since the background noise may obscure some of the crucial
information in the speech signal (Hirson and Howard 1994).
Speaker effects, which arise from speakers modifying their behavior as a result
of speaking into a telephone. Behavioral differences may be conscious or
subconscious depending on the individual and/or the interactional setting. It has
been shown experimentally that many individuals speak more loudly when
using a telephone, probably as a subconscious reflex to circumvent
environmental factors such as background noise (Summers et al. 1988).
Sreedevi, Pooja & Mahima (2008) compared the spectral and temporal parameters
of original and disguised voices using mobile phones in 7 normal native
Hindi-speaking adult males. The acoustic analysis included spectral
parameters such as the first three formant frequencies of the vowels /a/, /i/ and
/u/. The temporal parameters included word duration, nasal murmur, frication
duration, burst duration, voice onset time and closure duration. The results
showed lowered formant frequencies. No significant difference was seen for
most of the temporal parameters (except nasal murmur) across the two
conditions.
Neha M. (2008) studied the mean length of utterance needed for speaker
identification (SPID) in the high-pitch disguise condition. The results showed
no significant difference between the normal and high-pitch disguise conditions
in the accuracy of identification, a significant difference between conditions
in the number of syllables required to correctly identify speakers, and that a
minimum of 5-6 syllables is required for SPID in the high-pitch disguise
condition.
Karthikeyan (2008) studied the mean length of utterance needed for SPID in the
hoarse voice disguise condition. The results showed no significant difference
between the normal and hoarse voice disguise conditions in the number of
syllables required to correctly identify speakers, and that an average of 8.73
syllables is required to correctly identify speakers.
SPEECH RECOGNITION:
A speech recognizer is a device that converts an acoustic signal into another form,
such as writing, to be stored or used in some way. A speech recognition device
accepts acoustic signals as input and produces sequences of words as output.
The observers try to perform a recognition task, i.e. they try to match the
spectrograms that represent the same speaker and are instructed to examine
features such as,
They were less successful when the same techniques were applied to make them
independent of task, vocabulary and speakers. This led researchers to look into
the basic problem of speech organization. The basic problem is a paradox: speech
consists of a continuous stream of sounds with no obvious discontinuity at the
boundaries between different sounds, and yet speech is perceived as a set of
discrete symbols.
The problem is how to segment the speech into linguistic units such as words or
sub-word units, and whether segmentation and identification need to be looked
upon as two distinct entities or a single integrated process.
1) The acoustic signal viewpoint: since the speech signal is a waveform (or
vector of numbers), we can simply apply general signal analysis techniques
(e.g. Fourier frequency spectrum analysis, principal component analysis,
statistical decision procedures, and other mathematical schemes) to
establish the identity of the input.
2) The speech production viewpoint: we understand the communication
‘source’ of the speech signal, and capture essential aspects of the way in
which speech was produced by the human vocal system (e.g. rate of vocal fold
vibration, place and manner of articulation, coarticulatory movements, etc.).
3) The sensory reception viewpoint: suggests duplicating the human auditory
reception process, by extracting parameters and classifying patterns as is
done in the ear, auditory nerves, and sensory feature detectors of the
ear-brain system.
4) The speech perception viewpoint: suggests we extract features and make
categorical distinctions that are experimentally established as being important
to human perception of speech (e.g. VOT, formant transitions, etc.).
Testing the performance of different algorithms and systems has become a
major issue in the development of ASR systems (Pallet, 1995), since systems
are costly and are very divergent in quality and task applicability.
- The second factor in determining voice uniqueness lies in the manner in which
the articulators or speech muscles are manipulated during speech. The articulators
include the lips, teeth, tongue, soft palate and jaw muscles, whose controlled
interplay produces intelligible speech. Intelligible speech is developed by the
random learning process of imitating others who are communicating. The
likelihood that 2 people could develop identical use patterns of their articulators
also appears very remote.
To facilitate the visual comparison of voices, a sound spectrograph is used to analyze
the complex speech waveform into a pictorial display referred to as a spectrogram. The
resonance of the speaker's voice is displayed in the form of vertical signal
impressions or markings for consonant sounds, and horizontal bars or formants for
vowel sounds. The spectrograms serve as a permanent record of the words spoken and
facilitate visual comparison of similar words spoken by an unknown and a known speaker.
The investigator should attempt to select a reasonably quiet environment for
controlled activities such as drug or other illegal operations being investigated.
This may require the recording of telephone conversations or face-to-face encounters
under a variety of acoustic conditions in which someone is wearing a body recorder or
transmitting the conversation via radio frequency to a remote location. Unfortunately,
in many cases the investigators cannot control the acoustic environment. Speech samples
obtained from the suspect should contain exactly the same words and phrases as those
in the questioned sample, because only like speech sounds are used for comparison. If
the caller's voice was disguised, the suspect should give both a normal sample and a
disguised one as in the questioned call. Recorded evidence should be wrapped in tinfoil
to protect it from possible contact with a magnetic field if it is submitted by mail.
Both an aural (listening) and a visual examination and comparison are conducted. Visual
comparison of spectrograms involves, in general, the examination of spectrographic
features of like sounds as portrayed in spectrograms in terms of time, frequency, and
amplitude.
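What a spectrogram contains (amplitude as a function of time and frequency) can be shown
with a minimal digital sketch: the signal is cut into overlapping frames, each frame is
windowed, and a discrete Fourier transform gives its magnitude spectrum. This is an
illustration of the principle, not a description of the analogue sound spectrograph
discussed above; the frame length, hop size, Hann window and test tone are assumptions.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Return one magnitude spectrum (bins 0..frame_len//2) per frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]              # Hann window
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):           # direct DFT, bin by bin
            total = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
            spectrum.append(abs(total))
        frames.append(spectrum)
    return frames


rate = 8000
# Synthetic 1000 Hz tone standing in for a steady resonance.
tone = [math.sin(2 * math.pi * 1000 * n / rate) for n in range(256)]
frames = spectrogram(tone)
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in frames]
print(len(frames), [b * rate / 64 for b in peak_bins])
```

Each inner list is one vertical slice of the display: the list index is frequency, the
position in the outer list is time, and the value is amplitude. A steady tone therefore
shows up as a horizontal bar, much as a formant does in a vowel.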
Following are the basic principles of spectrogram reading
6. Identify voicing
7. Identify frication
8. Identify aspiration/affrication
9. Identify phoneme
10. Join phonemes to make a word in a language.
Identify the beginning and end of a sentence: Sentences are usually separated by a
pause. Therefore, we can identify the end and beginning of sentences from pauses.
Identify the beginning and end of a word: It is difficult to find the beginning and
end of a word, although words can be segmented in read speech. A vowel in word-final
position is lengthened and has low intensity, so identifying the beginning and end of
a word starts with finding a low-intensity, lengthened vowel in word-final position.
Identify the probable beginning and end of a phoneme: If the phoneme of concern is a
vowel, semivowel, fricative, diphthong or nasal continuant, the beginning and end of
the phoneme are marked by the onset and offset of resonance bars. If it is a voiced
plosive, the beginning is marked by the onset of voice bars and the end is marked by
the beginning of resonance bars for the following vowel, or by the burst if the
plosive is released. For unvoiced plosives, the markers are silence followed by a
burst; for unvoiced fricatives, the onset and offset of frication.
Identify the manner of articulation: A vowel is characterized by resonance bars, as
vowels are open sounds. Semivowels: the articulatory movement from one vowel to
another is fast, whereas in diphthongs it is slow. Fricatives: energy at high
frequencies. Nasal continuants: damped resonance. Laterals: high F2 and F3. Trills:
burst-like striations are visible on spectrograms; for a tap, a silence varying from
10 ms to 40 ms is seen. A plosive can be identified by a silence followed by a burst,
or voice bars followed by a burst. Affricates are characterized by silence or voice
bars, a burst and affrication.
Identify place of articulation: Bilabial bursts tend to have low-frequency dominance,
between 500 and 1500 Hz. The palatals and velars are characterized by a mid-frequency
burst, between 1500 and 4000 Hz. The alveolars are associated with high-frequency
energy, above 4000 Hz.
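The burst-frequency ranges above amount to a simple lookup, sketched below. The function
name is hypothetical, and two details are assumptions because the text leaves them
open: a burst at exactly 1500 Hz is assigned to the palatal/velar range, and
frequencies below 500 Hz are reported as undetermined. In practice these ranges are
tendencies read off a spectrogram, not hard category boundaries.

```python
def place_from_burst(burst_hz):
    """Map a burst's dominant frequency (Hz) to the place category in the text."""
    if 500 <= burst_hz < 1500:
        return "bilabial"
    if 1500 <= burst_hz <= 4000:      # assumption: 1500 Hz itself counts as mid
        return "palatal/velar"
    if burst_hz > 4000:
        return "alveolar"
    return "undetermined"             # below 500 Hz is not covered by the text


for hz in (900, 2500, 4500):
    print(hz, place_from_burst(hz))
```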
TRUTH VERIFICATION
Truster Pro is a computerized system designed to provide easy access to truth
verification in a professional and discreet manner. Conversations can be analysed in
real time or offline to provide timely data for decision-making. It is based on vocal
stress analysis, calculated by a series of algorithms that detect states of stress and
then measure and grade them accordingly. Truster Pro technology is claimed to pinpoint
the cause of stress and to report back with a determination as to whether a speaker's
stress is caused by a lie, excitement, exaggeration or cognitive conflict.
A lie detector is a tool designed to determine one's level of truthfulness. The basic
idea in all of the machines existing today is to monitor involuntary body reactions in
order to determine and analyse the subject's state of fear, stress and arousal. Other
types of lie detector use vocal stress to determine the level of honesty by measuring
the stress in the voice. These lie detectors use a technology known as microtremor
stress detection and analyse the stress indicative of deception. Because there are
many types of lies, there is no set voice pattern or frequency for deceptive speech.
However, there is a uniform appearance for truthful situations, where the mainstream
thought process is fluent and uninterrupted. The Truster Pro system is a combination
of three different vocal lie detectors, which include:
THE ON LINE MODE
1. This allows you to conduct interrogations where analysis is required in real time,
and allows you to focus better on suspected portions and ask additional questions if
necessary.
2. A telephone conversation or normal discussion can be analysed in this mode.
THE INTERROGATION MODE
This is equivalent to the traditional polygraph system, providing quick, computerized
summaries and reports using all familiar polygraph techniques of interrogation.
THE OFF LINE MODE
This analyses prerecorded material to produce an in-depth view of the psychological
structure. Truster Pro's technology uses these psychological patterns to distinguish
between stress resulting from excitement or any other emotional stress, confusion or
any other cognitive stress, global stress resulting from the circumstances, and
deceptive stress.
Some courts will permit a witness to identify a speaker only if they can satisfy the
presiding jurist that they really know that person. This approach is a reasonable one,
as it is consistent with relevant research: if a witness has been in close contact
with a speaker for a long period of time, he or she can probably recognize the
speaker's voice and be fairly accurate in doing so. Moreover, many courts also permit
a qualified specialist to render an opinion after comparing a sample of the unknown
talker's speech to an appropriate recording. Here the professional conducts an
examination and then decides if two talkers are involved or only one. But before
either of these approaches is described, a third type of aural-perceptual speaker
identification should be considered. It involves earwitness line-ups or voice parades.
Forensic significance:
Vocal cord activity: parameters associated with vocal cord activity,
namely pitch and phonation type, are considered very important in
forensic phonetics because they can be used extralinguistically, as a
characteristic of the speaker.
Phonetic quality & voice quality: the voice quality and phonetic
quality of forensic speech samples can differ in 4 ways: same voice
and phonetic quality, different voice and phonetic quality, same voice
but different phonetic quality, and same phonetic but different voice
quality. In naïve voice discrimination, the voice quality is the
dominant factor. Ex., hello ( // or /a/).