Ikar Lab 3 Brochure

The new version of IKAR Lab 3 audio forensic software and hardware suite provides reliable tools for assessing audio authenticity, identifying persons from speech samples, and performing speech recognition. It features a unified interface, automated processes, and new methods for sound cleaning. The core software, SIS, includes visualization, editing, speaker separation, text transcription, and signal analysis tools to help experts make prompt assessments.

Uploaded by

santosh sitaula

WWW.SPEECHPRO.COM

IKAR Lab 3
A new generation
of the forensic audio analysis software and hardware suite

Authenticity verification | Identification of persons | Speech recognition* | Sound cleaning


*Available for Russian, English, Spanish, Kazakh, and Arabic
The new generation
of IKAR Lab 3

Meets the world-class requirements of leading forensic experts: it provides for a highly reliable evidence base and automates the process as much as possible, speeding up the work and simplifying the process.

All operations are performed in a unified work environment with a user-friendly interface. This enables specialists to focus on decision making and providing their expert assessments promptly.

• Reliable methods for assessing the authenticity and examinability of an audio recording
• Reliable, high-speed identification of persons, groups, and the lines spoken by each
• New, fast, and precise methods for sound cleaning
• Quick access to the audio tracks of video recordings
• Processing of large volumes of audio recordings

Main components

SIS
Audio forensic software

Sound Cleaner
Noise reduction and audio enhancement software

Caesar
Audio transcription software

STC-H246
Audio hardware

SIS
Audio forensic software
SIS is the core software of the IKAR Lab 3 forensic audio kit. It includes powerful tools for
speech signal research and enhanced speech visualisation and analysis, including speech
segmentation, text transcription, automatic and semi-automatic identification tools and
many others.

SIS | Methods
Audio recording research for speaker identification based on speech samples
(formants and pitch of a speech signal analysis)

Methods
• Visualisation
• Editing and processing
• Detecting speech and noises
• Text transcription and speech segmentation
• Separation of speakers in a dialogue/polylogue
• Multi-window interface
• Signal comparison
• Signal analysis
• Managing projects and creating reports
• Identification
• Automatic comparison
• Comparison of formants
• Pitch comparison
• Identification Wizard
• Overall conclusion
• Analysis of an audio recording extracted from a video file
• EdiTracker and the diagnostic module
Visualisation
The algorithms used for the spectral representation of the signal ensure the highest
possible quality and clarity of visible speech. The user selects the optimal display
parameters on the fly or uses presets for various types of spectral analysis.

• Waveform
• FFT and LPC spectrograms
• Medium and instant spectrum
• Cepstrogram
• Autocorrelation
• Pitch extractor
• Formants extractor
• Signal energy
• Histogram and histogram correlation

Editing and processing


SIS provides a wide variety of expert editing and signal processing tools that improve the intelligibility of recorded speech and prepare audio recordings for further analysis.

• Amplitude normalisation
• Linear transformation
• DC offset suppression
• Mixing
• Modulation
• Tempo correction*
• Resampling
• Bit depth conversion
• Stereo separation and merging two mono signals to stereo
• Phase change
• Adaptive inverse filter
• Adaptive tone suppressor
• Adaptive broadband noise filter

Detecting speech and noises


The speech detector automatically marks speech fragments in the audio signal that are
suitable for identification. The module can also be configured to detect noisy areas: dial
tones, clipped fragments and clicks.

*Without pitch distortion


Text transcription and speech segmentation

The speech-to-text plugin automatically obtains the text content of the speech in an audio recording in Russian, English, Spanish, Kazakh, and Arabic.

NEW: Additionally, the transcription is accompanied by word-by-word segmentation indicating the location of spoken words. This functionality allows the expert to work effectively with large amounts of audio recordings.

In manual mode, selected audio fragments can easily be assigned to particular categories (e.g., different speakers, sounds or noises) and annotated with text comments, and the full text can be exported to MS Word. If there are two files of transcribed text, the programme can automatically search for all matching words in the audio recordings being compared.

Automatic text transcription with the segmentation of lines spoken by speakers

Separating speakers in a dialogue/polylogue


The module automatically marks lines according to speakers. Its reliability is up to 95% with
a signal-to-noise ratio of at least 20 dB and the duration of each speaker's speech of at least
16 seconds.

NEW: Using built-in algorithms, the module can segment the lines spoken by up to 5 speakers.

Multi-window interface
SIS allows several audio files to be opened in one or several windows at the same time.
The windows can be positioned according to a particular task: vertically for identification
purposes or horizontally to compare copies of audio recordings or the various sound
cleaning options.
Signals can be opened in several layers in one window, and their colours and
transparency can be changed for better visualisation.
Working with audio recordings in a multi-window interface

Signal comparison
Windows can be connected according to time and spectral domain, which makes measurement
easier using vertical and horizontal cursors. The instant spectrum can be overlaid for better visual
comparison.
Pitch histograms can be compared visually or numerically using values of minimum, maximum,
median, asymmetry and general correlation.
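
The numerical pitch histogram comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the SIS implementation: the function names, the 50–400 Hz range, the 10 Hz bin width, the use of skewness for "asymmetry" and the use of Pearson correlation for "general correlation" are all assumptions made for the example.

```python
import math
import statistics


def pitch_summary(pitch_hz):
    """Minimum, maximum, median and skewness (asymmetry) of pitch values in Hz."""
    mean = statistics.fmean(pitch_hz)
    sd = statistics.pstdev(pitch_hz)
    if sd == 0:
        skew = 0.0
    else:
        skew = sum((p - mean) ** 3 for p in pitch_hz) / (len(pitch_hz) * sd ** 3)
    return {
        "min": min(pitch_hz),
        "max": max(pitch_hz),
        "median": statistics.median(pitch_hz),
        "asymmetry": skew,
    }


def histogram(pitch_hz, lo=50.0, hi=400.0, bin_hz=10.0):
    """Count pitch values per bin; the range and bin width are assumed values."""
    counts = [0] * int((hi - lo) / bin_hz)
    for p in pitch_hz:
        if lo <= p < hi:
            counts[int((p - lo) / bin_hz)] += 1
    return counts


def histogram_correlation(a, b):
    """Pearson correlation between two equally binned histograms."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)
```

Two recordings of the same speaker would be expected to give similar summary values and a histogram correlation close to 1, although any decision threshold is a matter for the expert.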
Signal analysis
SIS automatically calculates the signal characteristics that the expert uses to decide whether the recording is suitable for identification analysis.

• Frequency response
• Signal-to-noise ratio
• Reverberation time
• Clipping and tonal noises
• Clear speech duration

Signal characteristics assessment
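
Two of the characteristics listed above, signal-to-noise ratio and clipping, can be illustrated with a short sketch. It is not the SIS algorithm: it assumes samples normalised to the range -1 to 1, and it assumes that a noise-only fragment is available for estimating the noise power. The function names and the tolerance value are hypothetical.

```python
import math


def power(samples):
    """Mean square value of a list of samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(speech_plus_noise, noise_only):
    """Estimate SNR in dB, assuming the noise power can be measured separately."""
    p_noise = power(noise_only)
    if p_noise == 0:
        return math.inf
    # Subtract the noise power from the total to approximate the speech power.
    p_signal = max(power(speech_plus_noise) - p_noise, 1e-12)
    return 10.0 * math.log10(p_signal / p_noise)


def clipped_fraction(samples, full_scale=1.0, tolerance=1e-3):
    """Fraction of samples at (or within a small tolerance of) full scale."""
    limit = full_scale * (1.0 - tolerance)
    return sum(1 for s in samples if abs(s) >= limit) / len(samples)
```

A recording with a low estimated SNR or a high clipped fraction would be a weaker candidate for identification analysis.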

Working with projects and creating reports
IKAR Lab 3 organises the expert's workflow efficiently. The project opens files that are related
to examination directly from SIS, whether they are audio, text, video or photographic files.
These files and identification results can be saved in a structured way, as can reports created
in MS Word. The report can be supplemented with information on the settings for illustrations
and visible representations of speech, screenshots of the working screen or its area.

Identification
This unique tool, based on biometric algorithms and expert modules, automates and formalises the processes involved in forensic audio identification research: searching for comparable words and sounds, selecting sounds and melodic fragments to be compared, comparing speakers' formants and pitches, and performing speech analysis. The results are presented as numerical indicators that contribute to the overall identification conclusion.
Automatic comparison
The module performs 1:1 voice signal comparison. The method it uses depends on the speech
signal characteristics of the audio recordings studied. All results are based on the extraction
of voice biometric traits and calculations regarding their similarity.

NEW: More methods of comparison. cxvector (a development of xvector) is used as the main method; in addition, smart-speaker and gen6-v3 are used when an audio recording contains from 1.5 to 5 seconds of clear speech. The new functionality offers faster and more secure identification.

The biometric engine was trained on audio recordings of tens of thousands of speakers of different genders, ages, ethnicities, and languages. The varied types of speech material were captured in various channels and in multiple sound recording sessions. The high reliability of the biometric engine has been confirmed in NIST testing.
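
As a rough illustration of 1:1 comparison based on extracted voice traits, the sketch below scores two fixed-length speaker embeddings. The brochure does not say how cxvector, smart-speaker or gen6-v3 compute similarity; cosine similarity is a technique swapped in for the example, and the function names and the 0.7 threshold are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(a, b, threshold=0.7):
    """Assumed decision rule: accept when similarity reaches the threshold."""
    return cosine_similarity(a, b) >= threshold
```

In practice a threshold would be calibrated on recordings with known speakers rather than fixed in advance.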
Automatic identification results

Comparison of formants
The process of comparing formants with the module involves two stages.

1. Search and selection of reference sound fragments for known and unknown speakers:
• using the scatter plot with vowel triangle and highlighting the searching area
• specifying the frequency range of formants search
• by the position of horizontal marks indicating the limits in hertz and percentage
• using a graphical vowel chart

2. Expert comparison. The module automatically calculates FR, FA and LR for the sounds selected and decides whether the outcome of identification is positive, negative or undefined.

Speaker identification using the expert method of formant comparison

Additional features:

• Visual comparison of selected sounds on a vowel chart
• Comparison of the average formant values for selected sounds of two speakers
• Specifying words or triads as textual comments on reference fragments
• Exporting tables of reference fragments and results to MS Word

Pitch comparison
The pitch comparison module compares the specificities of speakers' melodic patterns. The module enables melodic fragments to be selected, attributes them to 1 of 18 possible melodic types and compares them according to 15 parameters, including maximum, average and minimum pitch values, rate of pitch change, skewness, kurtosis and others.

The algorithm generates results in the form of a match percentage for each parameter and delivers an overall identification/elimination conclusion or an inconclusive result. All data can be easily exported as text reports.
Speaker identification using pitch comparison

Identification wizard
This plugin offers a step-by-step identification process, displays the stages of research,
and visualises the results for any comparison made.

Overall conclusion
The outcome of each method can be saved in a given project. The programme takes the results from each module into account when forming an overall conclusion. The expert can adjust the relative weight of each method in the overall conclusion, or the weights can be assigned automatically from the qualitative and quantitative characteristics of the audio recordings being compared. Based on the results, the expert can automatically generate a detailed report.
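
The idea of weighting each method in the overall conclusion can be sketched as a weighted average. The brochure does not give the actual formula, so this is only an assumed form: the method names, the 0 to 1 score scale and the weights are hypothetical.

```python
def overall_score(scores, weights):
    """Weighted average of per-method scores (assumed to lie between 0 and 1)."""
    total_weight = sum(weights[method] for method in scores)
    if total_weight <= 0:
        raise ValueError("the weights must sum to a positive value")
    return sum(scores[method] * weights[method] for method in scores) / total_weight


# Hypothetical per-method results for one pair of recordings.
scores = {"automatic": 0.9, "formants": 0.6, "pitch": 0.3}
equal = {"automatic": 1.0, "formants": 1.0, "pitch": 1.0}
expert = {"automatic": 2.0, "formants": 1.0, "pitch": 1.0}
```

With equal weights the three methods contribute equally; the `expert` weights show how raising the weight of one method pulls the overall score towards its result.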

Analysis of an audio track extracted from a video

NEW: With the new SIS method, the expert gets immediate access to the audio track of a video file without requiring any additional editors. Just upload the video file, and SIS will automatically extract the audio track from the video and open it in a separate window.

The module allows work to be done simultaneously on a video in the video player and an audio track in the editor. The video and the sound are synchronised, and the video is automatically modified while the audio track is being edited.

Extraction and analysis of the audio track from a video

EdiTracker
The plugin performs diagnostics of the authenticity of analogue and digital audio recordings
and greatly simplifies expert analysis using SIS by providing the user with manual and
automatic analysis methods.
Authenticity check for the use of digital preprocessing of the audio recording

EdiTracker analysis methods

• Specifying the recording device parameters
• Identifying traces of previous digital signal processing
• Auditory analysis
• Detecting traces of tampering through phase shifts in the harmonics and phase scanning
• Scanning background noise

Specifying the recording device parameters

Every analogue recording device has unique characteristics, such as frequency response, total harmonic distortion, pitch variation, effective frequency range, tempo deviation, etc. EdiTracker automatically assesses these characteristics using a test signal. A mismatch between recording device parameters and characteristics of a signal allegedly recorded with that unit may be an indication of tampering.

Identifying traces of digital preprocessing

Digital processing of analogue signals always requires a specific sample rate. During the digitising process, a phenomenon known as aliasing occurs. Aliasing degrades the audio quality as high-frequency components are superimposed on low-frequency ones.

The vast majority of analogue-to-digital and digital-to-analogue converters use anti-aliasing filters. EdiTracker automatically detects traces of such filters, the presence of which may suggest that the audio has been digitised.
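
The aliasing effect described above can be shown with a small worked example: a tone above half the sample rate produces exactly the same samples as a lower-frequency tone. This is a generic signal processing sketch, not part of EdiTracker, and the function names are hypothetical.

```python
import math


def alias_frequency(f_hz, fs_hz):
    """Frequency at which a tone of f_hz appears after sampling at fs_hz."""
    return abs(f_hz - fs_hz * round(f_hz / fs_hz))


def sampled_cosine(f_hz, fs_hz, n):
    """First n samples of a cosine of frequency f_hz sampled at fs_hz."""
    return [math.cos(2.0 * math.pi * f_hz * k / fs_hz) for k in range(n)]


# A 5 kHz tone sampled at 8 kHz is indistinguishable from a 3 kHz tone.
high = sampled_cosine(5000, 8000, 64)
low = sampled_cosine(alias_frequency(5000, 8000), 8000, 64)
```

An anti-aliasing filter removes components above half the sample rate before sampling, which is why its steep roll-off can remain visible in the spectrum of a digitised recording.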

Detecting traces of tampering through phase shifts in the harmonics

EdiTracker automatically scans audio for technical narrow-band signals which normally come from an electrical network (ENF), batteries, nearby electrical appliances, etc., and estimates their phase continuity. An unjustified phase break can be interpreted as potential evidence of audio editing.
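
The phase continuity check can be sketched for a single narrow-band component. This is an illustrative approach under stated assumptions, not EdiTracker's method: it assumes a stable 50 Hz hum, an 8 kHz sample rate, windows that contain a whole number of hum cycles, and an arbitrary 0.3 rad threshold. All names are hypothetical.

```python
import math


def window_phase(samples, start, length, f_hz, fs_hz):
    """Phase of the f_hz component in one window, referenced to sample 0."""
    re = im = 0.0
    for k in range(start, start + length):
        angle = 2.0 * math.pi * f_hz * k / fs_hz
        re += samples[k] * math.cos(angle)
        im -= samples[k] * math.sin(angle)
    return math.atan2(im, re)


def phase_breaks(samples, f_hz=50.0, fs_hz=8000.0, window=800, threshold_rad=0.3):
    """Return start indices of windows whose phase jumps relative to the previous one."""
    starts = range(0, len(samples) - window + 1, window)
    phases = [window_phase(samples, s, window, f_hz, fs_hz) for s in starts]
    breaks = []
    for i in range(1, len(phases)):
        diff = phases[i] - phases[i - 1]
        diff = (diff + math.pi) % (2.0 * math.pi) - math.pi  # wrap to [-pi, pi)
        if abs(diff) > threshold_rad:
            breaks.append(i * window)
    return breaks


# Continuous 50 Hz hum, and a copy with 40 samples (a quarter cycle) cut out.
hum = [math.sin(2.0 * math.pi * 50.0 * n / 8000.0) for n in range(4000)]
edited = hum[:1600] + hum[1640:]
```

For the continuous hum the phase is the same in every window; removing a fragment shifts the phase of everything after the cut, which the sketch reports at sample 1600.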

Scanning background noise
Background scanning detects dramatic changes in the spectrum that are unnoticeable
on the waveform and which may be signs of audio editing.
EdiTracker also automatically scans the integrity of background noises and marks any
abrupt change in noise level.
Authenticity check for hidden editing of an audio recording based on the uniformity
of the background noise

Auditory analysis
During the playback of an original audio recording, the entirety of the audio communication (including verbal and nonverbal speaker output and additional background interference) comes together to form a complete and integrated picture of the audio and speech environment. Auditory analysis of these events, based on the known characteristics of the recording equipment and methods used, can reveal possible violations in the integrity of the overall audio picture and identify the location, facts and methods of such violations.

EdiTracker provides an extended list of auditory and linguistic indicators that may indicate breaches in the authenticity of a recording. These resources can be used to create a textual report.

Diagnostic module

NEW: A new SIS module for a more reliable assessment of the authenticity and examinability of an audio recording. The module detects various signal features that explain the nature of its origin or possible processing methods, which may either be unknown or deliberately hidden. In addition to EdiTracker, it detects the application of certain operations on a signal using the following methods:
• Spoofing detection
• DC offset analysis
• Analysis of A/μ encoding traces
• Analysis of MP3 encoding traces

Spoofing detection
The spoofing detector searches for traces of spoofing attacks in the audio recording, such as replays, speech synthesis and voice disguising. This algorithm is based on a neural network trained on various types of spoofing. As a result, it can conclude whether or not the audio recording is masquerading as the authentic recording of a speaker.

Expert spoofing detection analysis

DC offset analysis
This module analyses the audio recording to identify any dramatic change in DC offset,
as this may be a sign of integrity violation. If such a violation is detected, the module
highlights the corresponding areas.
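
The kind of check described above can be sketched by tracking the mean sample value over consecutive windows and flagging sudden jumps. This is a simplified illustration, not the module's algorithm; the window length, the threshold and the function names are assumptions.

```python
def window_means(samples, window):
    """Mean sample value (DC offset estimate) of each consecutive window."""
    return [
        sum(samples[i:i + window]) / window
        for i in range(0, len(samples) - window + 1, window)
    ]


def dc_offset_jumps(samples, window=1000, threshold=0.01):
    """Return the start index of every window whose DC offset jumps abruptly."""
    means = window_means(samples, window)
    return [
        k * window
        for k in range(1, len(means))
        if abs(means[k] - means[k - 1]) > threshold
    ]


# Hypothetical recording spliced from two sources with different DC offsets.
spliced = [0.0] * 2000 + [0.05] * 2000
```

For the spliced example the sketch flags sample 2000, where the second source begins; a recording with a uniform offset produces no flags.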

Detecting the disturbance of DC offset uniformity in two areas in the audio recording

Detection of A/μ coding


This module analyses the audio recording to detect areas with signs of A/μ encoding. The
possibility that an audio recording has been processed using these codecs is not indicated by
the recording format. In the event of the detection of such coding, the module highlights the
corresponding areas or the entire audio recording.
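
One reason such traces can survive a format change is that 8-bit companding leaves only a small set of possible sample values. The sketch below uses the continuous μ-law formula with μ = 255 to show this; it is not the module's detector, telephony codecs such as G.711 use a piecewise-linear approximation of this curve, and the function names and quantisation step are assumptions.

```python
import math

MU = 255.0  # standard mu-law parameter


def mu_compress(x):
    """Continuous mu-law compression of a sample in the range -1 to 1."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_expand(y):
    """Inverse of mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def mu_law_round_trip(samples):
    """Compress, quantise to 8-bit style levels, and expand again."""
    decoded = []
    for x in samples:
        y = mu_compress(max(-1.0, min(1.0, x)))
        q = round(y * 127) / 127  # assumed uniform quantisation of the companded value
        decoded.append(mu_expand(q))
    return decoded


def distinct_levels(samples):
    """Number of different sample values that occur in the signal."""
    return len(set(samples))


# A finely stepped ramp has thousands of distinct values before coding.
ramp = [i / 5000.0 - 1.0 for i in range(10001)]
decoded = mu_law_round_trip(ramp)
```

After the round trip the ramp contains at most 255 distinct values, spaced more densely near zero, even if the result is later saved in a 16-bit format.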

Detecting A/μ coding areas

Detection of MP3 coding
This module analyses the audio recording to identify signs of MP3 coding. The possibility that
an audio recording has been processed using this codec is not indicated by the recording format.
In the event of the detection of such MP3 coding, the module displays a message describing the
signs detected. Additionally, spectrograms, graphs and histograms are displayed, explaining the
decision made by the algorithm.

Detecting MP3 coding

Sound Cleaner
Noise reduction and audio enhancement software
Usually, the examination of audio recordings requires the creation of a verbatim record, or
transcription, thereof. Since audio recordings obtained in an operational context are often
recorded in difficult conditions and not readily intelligible, the first step is to clean the sound
of noise. To do this, the IKAR Lab 3 suite is optionally equipped with Sound Cleaner. It includes
modern signal processing algorithms that are effective at suppressing broadband noise, tonal
interference and pulses, while performing frequency response correction, equalising the signal,
etc.
A waveform and stereo signal processing scheme with filters and parameters pre-set by an expert

Simultaneous display of the spectrogram and waveform for quick assessment of signal and frequency processing needs

NEW: To determine the characteristics of noise and interferences, it is possible to build spectrograms, including ones in 3D FFT format. This increases the speed and accuracy of noise reduction.

All filters work in real time — the result can be heard immediately after the filter has been added
to the processing chain so that the user can select the optimal parameters by ear.

STC Auto Filter
Significantly reduces the level of the most common types of noise using a single controller.

Cell Phone Noise Filter
Reduces interference from the characteristic intermittent sounds of incoming mobile phone calls.

Broadband Noise Filter
Reduces the noise level from rooms and streets and interference from communication channels or recording equipment. Such noises take the form of a hiss and cannot be suppressed by other methods, since the interference spectra intersect or coincide with the useful signal spectra.

Inverse Filter
Equalises the frequency response of the communication channel in which the recording was created. The filter has two settings: amplification of the weak and suppression of the strong spectral components of the signal (flattening the average spectra).

Tone Suppressor
Suppresses the stationary narrow-band and regular interferences (vibrations, network pickup, noises from home appliances, slow music, the sound of a car passing, noise from water or a room, reverberation, etc.).

Reverb Suppressor
Increases the intelligibility of speech, decreases the level of reverberation in recordings and reduces user fatigue by making the perception of a useful reverberated speech signal easier despite the presence of additional noise.

Click Suppressor
Automatically restores the speech or music signals distorted by pulse noises (clicks, radio noises, knocks, crackles, etc.).

Dynamic Range Control
Improves intelligibility when large drops occur in signal level, for example by amplifying a weak signal and suppressing a strong signal to balance the amplitude of the output signal.

Equaliser
A 4096-band graphic equaliser with a built-in spectrograph for detailed spectrum correction in distorted recordings.

Clip Restorer
Restores overloaded recording fragments by reconstructing their waveforms.

Reference Noise Suppressor
Suppresses any noise from the main channel present in the reference channel (for example, TV or radio broadcasts, music, etc.).

DTMF Suppressor
Processes phone dialling signals, which are sequences of short rectangular pulses with dual-frequency filling.
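
The "dual-frequency filling" of phone dialling signals can be illustrated with a short sketch that builds a DTMF tone from its two standard frequencies and recognises it with the Goertzel algorithm. It shows the structure of the signal only; it is not Sound Cleaner's suppression method, and the function names, amplitude and 50 ms duration are assumptions.

```python
import math

ROWS = (697, 770, 852, 941)        # standard DTMF low-group frequencies, Hz
COLS = (1209, 1336, 1477, 1633)    # standard DTMF high-group frequencies, Hz
KEYS = ("123A", "456B", "789C", "*0#D")


def dtmf_tone(key, fs_hz=8000, duration_s=0.05):
    """Sum of the two sinusoids that encode one keypad symbol."""
    for r, row in enumerate(KEYS):
        if len(key) == 1 and key in row:
            f_low, f_high = ROWS[r], COLS[row.index(key)]
            break
    else:
        raise ValueError("unknown DTMF key: %r" % (key,))
    n = int(fs_hz * duration_s)
    return [
        0.5 * math.sin(2.0 * math.pi * f_low * k / fs_hz)
        + 0.5 * math.sin(2.0 * math.pi * f_high * k / fs_hz)
        for k in range(n)
    ]


def goertzel_power(samples, f_hz, fs_hz=8000):
    """Signal power near f_hz, computed with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * f_hz / fs_hz)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def detect_key(samples, fs_hz=8000):
    """Pick the strongest low-group and high-group frequency and map to a key."""
    r = max(range(4), key=lambda i: goertzel_power(samples, ROWS[i], fs_hz))
    c = max(range(4), key=lambda i: goertzel_power(samples, COLS[i], fs_hz))
    return KEYS[r][c]
```

Because each symbol occupies two known narrow bands, a suppressor can target those bands for the duration of the pulse rather than filtering the whole recording.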

Sound Cleaner saves its processing results in WAV format and automatically generates
comprehensive textual reports that log the process. The programme is compatible with
any sound editors using VST 3 format.

Caesar
Audio recording transcription module
The module is designed to produce a verbatim transcription of recorded speech. The text is output into MS Word and automatically synchronised with the audio recording, which simplifies both text editing and the search for the corresponding audio fragment. The ability to play back the recording and transcribe it in a single, offline interface makes the expert's work easier.
Audio recording text transcription

STC-H246
Audio hardware
To guarantee the high quality of input and output signals, the IKAR Lab 3 suite is equipped
with a professional STC-H246 audio hardware device.

STC-H246 is perfect for setting up a workstation for digitising analogue audio recordings. The device is designed to measure parameters and generate electrical signals in the audio frequency range.

Parameter                                         Value
Sampling rate                                     8–200 kHz
ADC/DAC resolution                                16-bit, 24-bit
Signal-to-noise ratio in the end-to-end channel,  105 dB
in the frequency band from 20 Hz to 20 kHz
Input/output channel connector types              XLR, RCA, S/PDIF, TRS 6.3 mm
Number of channels                                2
Power                                             110/220 V, 60/50 Hz
Case                                              Metal
Size                                              111×166×190 mm
OS                                                Windows 7, 8, 10

About the company
Speech Technology Center is a global developer of products and solutions based on intelligent
speech technologies and voice and facial biometrics. Over the last 30 years, we have accumulated
highly sought-after expertise in artificial intelligence and machine learning.

We hold a leading position in the NIST, ASVspoof Challenge, VOiCES and CHiME world ratings.
So far, we have implemented projects for more than 5,000 clients in 75 countries.

Many of our solutions have been applied in the public and commercial sectors to great advantage, from small expert laboratories to complex national security systems all around the world, including those in the USA, Latin America, the Middle East and Europe.

Images in this document are for example purposes only and may differ from the actual product. Full product specifications are available on request. Speech Technology Center Limited reserves the right to modify product characteristics and/or withdraw any product without prior notice.

St. Petersburg
Vyborgskaya Embankment 45, Bldg. E, Unit 1-Н, Off. 133, 194044
+7 812 325 8848
stc-int@speechpro.com

Moscow
3 Marksistskaya St., Bldg. 2, 109147
+7 495 669 7440
stc-int@speechpro.com