Ikar Lab 3 Brochure
IKAR Lab 3
A new generation
of the forensic audio analysis software and hardware suite
Main components
SIS
Audio forensic software
Sound Cleaner
Noise reduction and audio enhancement software
Caesar
Audio transcription software
STC-H246
Audio hardware
SIS
Audio forensic software
SIS is the core software of the IKAR Lab 3 forensic audio kit. It includes powerful tools for
speech signal research and enhanced speech visualisation and analysis, including speech
segmentation, text transcription, automatic and semi-automatic identification tools and
many others.
SIS | Methods
Audio recording research for speaker identification based on speech samples
(analysis of the formants and pitch of a speech signal)
Methods
• Visualisation
• Editing and processing
• Detecting speech and noises
• Text transcription and speech segmentation
• Separation of speakers in a dialogue/polylogue
• Multi-window interface
• Signal comparison
• Signal analysis
• Managing projects and creating reports
• Identification
• Automatic comparison
• Comparison of formants
• Pitch comparison
• Identification Wizard
• Overall conclusion
• Analysis of an audio recording extracted from a video file
• EdiTracker and the diagnostic module
Visualisation
The algorithms used for the spectral representation of the signal ensure the highest
possible quality and clarity of visible speech. The user selects the optimal display
parameters on the fly or uses presets for various types of spectral analysis.
intelligibility of recorded speech and prepare audio recordings for further analysis.
Additionally, the transcription is accompanied by word-by-word segmentation indicating
the location of spoken words. This functionality allows the expert to work effectively
with large amounts of audio recordings.
In manual mode, selected audio fragments can easily be assigned to particular categories (e.g.,
different speakers, sounds or noises) with text comments while the general text will be exported
to MS Word. If there are two files of transcribed text, the programme can automatically search for
all matching words in the audio recordings compared.
Automatic text transcription with the segmentation of lines spoken by speakers
Using built-in algorithms, the module allows segmentation of the lines spoken
by up to 5 speakers.
Multi-window interface
SIS allows several audio files to be opened in one or several windows at the same time.
The windows can be positioned according to a particular task: vertically for identification
purposes or horizontally to compare copies of audio recordings or the various sound
cleaning options.
Signals can be opened in several layers in one window, and their colours and
transparency can be changed for better visualisation.
Signal comparison
Windows can be connected according to time and spectral domain, which makes measurement
easier using vertical and horizontal cursors. The instant spectrum can be overlaid for better visual
comparison.
Pitch histograms can be compared visually or numerically using values of minimum, maximum,
median, asymmetry and general correlation.
Signal analysis
SIS automatically calculates the following signal characteristics, from which the expert
concludes whether the recording is suitable for identification analysis.
• Frequency response
• Signal-to-noise ratio
• Reverberation time
• Clipping and tonal noises
• Clear speech duration
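As an illustration of two of the characteristics listed above, the sketch below estimates a signal-to-noise ratio and a clipping ratio from raw samples. It is a minimal sketch, not the SIS implementation: the function names, the use of a separate noise-only segment, and the full-scale value of 1.0 are all assumptions made for the example.

```python
import math


def rms(samples):
    """Root-mean-square level of a non-empty sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(speech, noise):
    """Approximate SNR in dB from a speech segment and a noise-only segment.

    Assumption: the expert has already marked which fragment contains
    speech and which contains only background noise.
    """
    return 20.0 * math.log10(rms(speech) / rms(noise))


def clipping_ratio(samples, full_scale=1.0, tolerance=1e-3):
    """Fraction of samples at or near full scale (a crude clipping indicator)."""
    limit = full_scale - tolerance
    clipped = sum(1 for s in samples if abs(s) >= limit)
    return clipped / len(samples)


speech = [0.5, -0.5] * 100   # hypothetical speech fragment
noise = [0.05, -0.05] * 100  # hypothetical noise-only fragment
print(round(snr_db(speech, noise), 1))
print(clipping_ratio([1.0, 0.2, -1.0, 0.0]))
```

A real assessment would also need to account for the fact that the speech fragment itself contains noise, so this estimate is slightly optimistic at low signal levels.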
Working with projects and creating reports
IKAR Lab 3 organises the expert's workflow efficiently. A project opens the files related
to the examination directly from SIS, whether they are audio, text, video or photographic files.
These files and identification results can be saved in a structured way, as can reports created
in MS Word. The report can be supplemented with information on the settings for illustrations
and visible representations of speech, screenshots of the working screen or its area.
Identification
This unique tool based on biometric algorithms and expert modules is made to automate
and formalise the processes involved in audio forensics identification research: searching
for comparable words and sounds, selecting sounds and melodic fragments to be compared,
comparing speakers’ formants and pitches, and performing speech analysis. The results
are presented as numerical indicators to contribute to the overall identification conclusion.
Automatic comparison
The module performs 1:1 voice signal comparison. The method it uses depends on the speech
signal characteristics of the audio recordings studied. All results are based on the extraction
of voice biometric traits and calculations regarding their similarity.
in an audio recording is from 1.5 to 5 seconds). The new functionality offers faster and
more secure identification.
The module’s biometric engine was trained on audio recordings of tens of thousands of speakers
of different genders, ages, ethnicities and languages. The speech material varied in type and was
captured over various channels and in multiple sound recording sessions. The high reliability of
the biometric engine has been confirmed in NIST testing.
Comparison of formants
The process of comparing formants with the module involves two stages.
Additional features:
Pitch comparison
The pitch comparison module compares the specificities of speakers’ melodic patterns.
The module enables melodic fragments to be selected, attributes them to 1 of 18 possible
melodic types and compares them according to 15 parameters, including maximum, average
and minimum pitch values, rate of pitch change, skewness, kurtosis and others.
The algorithm generates results in the form of a match percentage for each parameter and
delivers an overall identification/elimination conclusion or an inconclusive result. All data
can be easily exported as text reports.
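To make the named parameters concrete, the sketch below computes a handful of them (maximum, average and minimum pitch, mean rate of pitch change, skewness and kurtosis) for one melodic fragment. It is only an illustration under stated assumptions: the function name, the 10 ms frame step, and the exact statistical definitions are hypothetical and are not taken from SIS, whose full set of 15 parameters is not specified here.

```python
import math


def pitch_statistics(f0_hz, frame_step_s=0.01):
    """Summary statistics of a pitch contour for one melodic fragment.

    f0_hz is a non-empty list of pitch values in Hz, one per analysis frame.
    frame_step_s is the assumed time between consecutive frames.
    """
    n = len(f0_hz)
    mean = sum(f0_hz) / n
    variance = sum((f - mean) ** 2 for f in f0_hz) / n
    std = math.sqrt(variance)
    if std > 0:
        skewness = sum((f - mean) ** 3 for f in f0_hz) / (n * std ** 3)
        # Excess kurtosis: 0 for a normal distribution.
        kurtosis = sum((f - mean) ** 4 for f in f0_hz) / (n * std ** 4) - 3.0
    else:
        skewness = kurtosis = 0.0
    # Mean absolute rate of pitch change between neighbouring frames, in Hz/s.
    deltas = [abs(b - a) for a, b in zip(f0_hz, f0_hz[1:])]
    rate = (sum(deltas) / len(deltas)) / frame_step_s if deltas else 0.0
    return {
        "max": max(f0_hz),
        "mean": mean,
        "min": min(f0_hz),
        "rate_hz_per_s": rate,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


contour = [100.0, 110.0, 120.0, 110.0, 100.0]  # hypothetical pitch contour
print(pitch_statistics(contour))
```

Two fragments could then be compared parameter by parameter, which is the kind of per-parameter result the text above describes.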
Identification wizard
This plugin offers a step-by-step identification process, displays the stages of research,
and visualises the results for any comparison made.
Overall conclusion
The outcome of each method can be saved in a given project. The programme is designed to take
the results from each module into account when making an overall conclusion. The expert can
adjust the relative weight of each method in the overall conclusion or their significance can be
automatically assigned through a calculation of the qualitative and quantitative characteristics
of the audio recordings being compared. Based on the results, the expert can automatically
generate a detailed report.
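One simple way to picture the weighting described above is a weighted average of per-method match scores. The sketch below assumes each method reports a match percentage and that the weights are supplied by the expert; the function name, the method names and the weighted-average formula are illustrative assumptions, not the formula used by SIS.

```python
def overall_score(method_scores, weights):
    """Weighted average of per-method match percentages.

    method_scores maps a method name to its match percentage (0-100).
    weights maps the same names to non-negative relative weights.
    """
    total_weight = sum(weights[name] for name in method_scores)
    if total_weight <= 0:
        raise ValueError("at least one method must have a positive weight")
    weighted_sum = sum(method_scores[name] * weights[name] for name in method_scores)
    return weighted_sum / total_weight


# Hypothetical results and expert-assigned weights.
scores = {"automatic": 90.0, "formants": 80.0, "pitch": 70.0}
weights = {"automatic": 0.5, "formants": 0.3, "pitch": 0.2}
print(round(overall_score(scores, weights), 1))
```

Normalising by the total weight means the expert can change one weight without having to rebalance the others by hand.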
Analysis of an audio recording extracted from a video file
With the new SIS method, the expert gets immediate access to the audio track of a
video file without requiring any additional editors. Just upload the video file and SIS will
automatically extract the audio track from the video and open it in a separate window.
The module allows work to be simultaneously done on a video in the video player and
an audio track in the editor. The video and the sound are synchronised, and the video is
automatically modified while the audio track is being edited.
EdiTracker
The plugin performs diagnostics of the authenticity of analogue and digital audio recordings
and greatly simplifies expert analysis using SIS by providing the user with manual and
automatic analysis methods.
Authenticity check for the use of digital preprocessing of the audio recording
Specifying the recording device parameters
Every analogue recording device has unique characteristics, such as frequency response,
total harmonic distortion, pitch variation, effective frequency range, tempo deviation, etc.
EdiTracker automatically assesses these characteristics using a test signal. A mismatch
between recording device parameters and characteristics of a signal allegedly recorded
with that unit may be an indication of tampering.

Digital processing of analogue signals always requires a specific sample rate. During the
digitising process, a phenomenon known as aliasing occurs. Aliasing degrades the audio
quality as high-frequency components are superimposed on low-frequency ones.
The vast majority of analogue-to-digital and digital-to-analogue converters use anti-aliasing
filters. EdiTracker automatically detects traces of such filters, the presence of
which may suggest that the audio has been digitised.
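The aliasing effect mentioned above can be shown with a small calculation: a tone above half the sample rate (the Nyquist frequency) is "folded" back into the representable band. The sketch below computes the apparent frequency of such a tone when no anti-aliasing filter is applied. The function name is hypothetical and the example is a textbook illustration, not part of EdiTracker.

```python
def alias_frequency(f_hz, sample_rate_hz):
    """Apparent frequency of a pure tone after sampling without an anti-aliasing filter."""
    nyquist = sample_rate_hz / 2.0
    folded = f_hz % sample_rate_hz
    # Frequencies above Nyquist are mirrored back into the 0..Nyquist band.
    return sample_rate_hz - folded if folded > nyquist else folded


# A 5 kHz tone sampled at 8 kHz shows up at 3 kHz.
print(alias_frequency(5000, 8000))
# A 3 kHz tone is below Nyquist and is left unchanged.
print(alias_frequency(3000, 8000))
```

An anti-aliasing filter removes content above the Nyquist frequency before sampling, which is why a steep spectral roll-off near that frequency is a plausible trace of digitisation.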
Scanning background noise
Background scanning detects dramatic changes in the spectrum that are unnoticeable
on the waveform and which may be signs of audio editing.
EdiTracker also automatically scans the integrity of background noises and marks any
abrupt change in noise level.
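The idea of marking abrupt changes in noise level can be sketched as follows: measure the level of consecutive frames in decibels and flag any frame whose level differs sharply from the previous one. This is a simplified illustration that assumes the input already contains only background noise; the function names, frame length and 6 dB threshold are hypothetical choices, not EdiTracker parameters.

```python
import math


def frame_levels_db(samples, frame_len):
    """Level in dB of each consecutive, non-overlapping frame."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(s * s for s in frame) / frame_len
        # A small constant avoids log10(0) on digital silence.
        levels.append(10.0 * math.log10(power + 1e-12))
    return levels


def abrupt_level_changes(levels_db, threshold_db=6.0):
    """Indices of frames whose level jumps by more than threshold_db."""
    return [
        i for i in range(1, len(levels_db))
        if abs(levels_db[i] - levels_db[i - 1]) > threshold_db
    ]


# Hypothetical noise floor that rises by 20 dB halfway through.
noise = [0.01, -0.01] * 50 + [0.1, -0.1] * 50
levels = frame_levels_db(noise, frame_len=20)
print(abrupt_level_changes(levels))
```

A flagged frame is only a candidate location for closer inspection, since genuine recordings can also contain sudden changes in background noise.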
Authenticity check for hidden editing of an audio recording based on the uniformity
of the background noise
Auditory analysis
During the playback of an original audio recording, the entirety of the audio communication,
including verbal and nonverbal speaker output and additional background interference, comes
together to form a complete and integrated picture of the audio and speech environment.
Auditory analysis of these events, based on the known characteristics of the recording
equipment and methods used, can reveal possible violations in the integrity of the overall
audio picture and identify the location, facts and methods of such violations.
EdiTracker provides an extended list of auditory and linguistic indicators that may
indicate breaches in the authenticity of a recording. These resources can be used to
create a textual report.
Diagnostic module
A new SIS module for a more reliable assessment of the authenticity and examinability
of an audio recording. The module detects various signal features that explain the nature
of its origin or possible processing methods, which may either be unknown or deliberately
hidden. In addition to EdiTracker, it detects the application of certain operations on a
signal using the following methods:
• Spoofing detection
• DC offset analysis
• Analysis of A/μ encoding traces
• Analysis of MP3 encoding traces
Spoofing detection
The spoofing detector searches for traces of spoofing attacks in the audio recording,
such as replays, speech synthesis and voice disguising. This algorithm is based on
a neural network trained on various types of spoofing. As a result, it can conclude
whether or not the audio recording is masquerading as the authentic recording of
a speaker.
DC offset analysis
This module analyses the audio recording to identify any dramatic change in DC offset,
as this may be a sign of integrity violation. If such a violation is detected, the module
highlights the corresponding areas.
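As a rough illustration of this check, the DC offset of a frame can be estimated as the mean of its samples, and a sharp difference between neighbouring frames can be flagged. The sketch below is a simplified example under stated assumptions: the function names, frame length and threshold are hypothetical and do not describe the module's actual algorithm.

```python
def frame_dc_offsets(samples, frame_len):
    """Mean sample value (DC offset) of each consecutive, non-overlapping frame."""
    return [
        sum(samples[start:start + frame_len]) / frame_len
        for start in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def dc_offset_jumps(offsets, threshold=0.01):
    """Indices of frames whose DC offset differs from the previous frame by more than threshold."""
    return [
        i for i in range(1, len(offsets))
        if abs(offsets[i] - offsets[i - 1]) > threshold
    ]


# Hypothetical signal: the second half carries a DC offset of 0.05.
signal = [0.1, -0.1] * 20 + [0.15, -0.05] * 20
offsets = frame_dc_offsets(signal, frame_len=10)
print(dc_offset_jumps(offsets))
```

Frames are kept much longer than one period of the lowest expected frequency in practice, so that ordinary low-frequency content is not mistaken for a change in offset.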
Detection of MP3 coding
This module analyses the audio recording to identify signs of MP3 coding. The recording's file
format does not necessarily show whether the audio has been processed using this codec.
In the event of the detection of such MP3 coding, the module displays a message describing the
signs detected. Additionally, spectrograms, graphs and histograms are displayed, explaining the
decision made by the algorithm.
Sound Cleaner
Noise reduction and audio enhancement software
Usually, the examination of audio recordings requires the creation of a verbatim record, or
transcription, thereof. Since audio recordings obtained in an operational context are often
recorded in difficult conditions and not readily intelligible, the first step is to clean the sound
of noise. To do this, the IKAR Lab 3 suite is optionally equipped with Sound Cleaner. It includes
modern signal processing algorithms that are effective at suppressing broadband noise, tonal
interference and pulses, while performing frequency response correction, equalising the signal,
etc.
spectrograms, including ones in 3D FFT format. This increases the speed and accuracy
of noise reduction.
All filters work in real time — the result can be heard immediately after the filter has been added
to the processing chain so that the user can select the optimal parameters by ear.
music, the sound of a car passing, noise from water or a room, reverberation, etc.).
the perception of a useful reverberated speech signal easier despite the presence of
additional noise.
Sound Cleaner saves its processing results in WAV format and automatically generates
comprehensive textual reports that log the process. The programme is compatible with
any sound editor that uses the VST 3 format.
Caesar
Audio recording transcription module
The module is designed to produce a verbatim transcription of recorded speech. The text is output
into MS Word and automatically synchronised with the audio recording. This process simplifies
the search for the corresponding audio fragment and text editing. The ability to play back the
recording and transcribe it in a single, offline interface makes the expert’s work easier.
STC-H246
Audio hardware
To guarantee the high quality of input and output signals, the IKAR Lab 3 suite is equipped
with a professional STC-H246 audio hardware device.
STC-H246 is perfect for setting up a workstation for digitising analogue audio recordings.
The device is designed to measure parameters and generate electrical signals in the audio
frequency range.
Parameter                                                Value
Sampling rate                                            8–200 kHz
ADC/DAC resolution                                       16-bit, 24-bit
Signal-to-noise ratio in the end-to-end channel,
in the frequency band from 20 Hz to 20 kHz               105 dB
Input/output channel connector types                     XLR, RCA, S/PDIF, TRS 6.3 mm
Number of channels                                       2
Power                                                    110/220 V, 60/50 Hz
Case                                                     Metal
Size                                                     111×166×190 mm
OS                                                       Windows 7, 8, 10
About the company
Speech Technology Center is a global developer of products and solutions based on intelligent
speech technologies and voice and facial biometrics. Over the last 30 years, we have accumulated
highly sought-after expertise in artificial intelligence and machine learning.
We hold a leading position in the NIST, ASVspoof Challenge, VOiCES and CHiME world ratings.
So far, we have implemented projects for more than 5,000 clients in 75 countries.
Many of our solutions have been applied in the public and commercial sectors to great advantage,
from small expert laboratories to complex national security systems all around the world,
including those in the USA, Latin America, the Middle East and Europe.
Images in this document are for example purposes only and may differ from the actual product. Full
product specifications are available on request. Speech Technology Center Limited reserves the right
to modify product characteristics and/or withdraw any product without prior notice.