5ARB0 Lecture 20240910

5ARB0 Data acquisition and analysis 2024 – 2025

Technical aspects of data collection


5ARB0: DATA ACQUISITION & ANALYSIS (2024 – 2025)

Uzay Kaymak, Jheronimus Academy of Data Science, u.kaymak@tue.nl

Master: Artificial Intelligence & Engineering Systems

U. Kaymak 1

Outline

• Aspects of data collection


• Sources of data
• Signal conditioning
• Data storage
• Experiment design
• System excitation
• Sampling

• Data conditioning


Recap

• Data has multiple facets


• Distinction between primary data collection and secondary data collection
• In engineering systems, the focus is on primary data collection
• Data science also deals with secondary data collection

• Capturing context information is very relevant

Data collection: process of gathering data for use in (business) decision-making, strategic planning, research and other purposes


Categories of data

• Attribute – value pairs


• Matrix/table representation

• Unstructured data
• text

• Sequence data
• Time series
• Event logs

• Graph data
• Non-linear data represented as nodes and edges


Attribute – value pairs (fields in a record)

Diabetes Patients Medication Status
Source: Centers for Disease Control and Prevention (CDC) https://www.cdc.gov/

Columns: Year | Pills Only | Insulin Only | Insulin and Pills | Any Medication | No Medication


Unstructured data

“ … From 1997 to 2011, the number of adults aged 18 years or older with diagnosed diabetes who reported taking diabetes medication increased for those taking either insulin, pills, or both. The number of adults with diagnosed diabetes who did not report taking diabetes medication also increased during the period. For those taking insulin only, trends showed little or no change until 2007 and increased afterwards. …”

Diabetes Patients Medication Status
Source: Centers for Disease Control and Prevention (CDC) https://www.cdc.gov/


Sequence data

• Time series
• Activity logs
• Gene sequence


Graph data

Information represented
as nodes and edges


Aspects of data collection

Illustration: iStock Royal Geographical Society


Data sources

• Sensors
• Surveying and monitoring
• Audio
• Video

• Surveys and questionnaires


• Interviews
• Etc.


Examples of common data collection methods

• automated data collection from business applications, websites and mobile apps
• sensors that collect operational data from equipment, vehicles, etc.
• data from external data sources (e.g. data streams)
• tracking online channels (e.g. social media, discussion forums, Twitter)
• surveys, questionnaires and forms (online, in person, by phone, etc.)
• focus groups and one-on-one interviews
• direct observation of participants in a research study


Observing the analogue world – sensors

www.techbriefs.com


Data acquisition system (simplified)

ADC: analogue-to-digital converter
DAC: digital-to-analogue converter


Transducers

Convert a physical quantity into another

Properties
• Sensitivity
• Stability
• Noise
• Dynamic range
• Linearity

Examples
• Temperature sensors
• Force and pressure transducers
• Magnetic field sensors
• Ionizing radiation sensors
• Displacement (position) sensors
• Fiber optic sensors
• MEMS (micro electromechanical systems)


Signal conditioning

Modifying the (analogue) signal before being processed (e.g. digitizing)
• Amplification
• Attenuation
• Input coupling
• Excitation
• Linearization
• Filtering: remove noise, isolate relevant/interesting signal, anti-aliasing, smoothing
• Surge protection
• (Electrical) isolation
• Etc.


Common filter types


Data storage

• Direct writing to hardware (embedded systems, low-level OS access)
• Text files
• CSV (comma-separated values) file (flat file)
• Markup file (e.g. XML, JSON)
• Spreadsheets
• Relational databases
• NoSQL databases (e.g. for graph data)


Access to disk


Text files

CSV file XML JSON
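
To make the three text formats concrete, the sketch below writes the same records as CSV, JSON and XML using only the Python standard library. The records and field names (`sensor`, `time_s`, `value`) are invented for illustration and are not taken from the slides.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sensor records, invented for illustration only.
records = [
    {"sensor": "temp1", "time_s": 0, "value": 21.5},
    {"sensor": "temp1", "time_s": 1, "value": 21.7},
]


def to_csv(rows):
    """Flat file: one header line, then one comma-separated line per record."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Nested text format: a list of objects with named fields."""
    return json.dumps(rows, indent=2)


def to_xml(rows):
    """Markup: each record becomes an element, each field a child element."""
    root = ET.Element("records")
    for row in rows:
        record = ET.SubElement(root, "record")
        for key, value in row.items():
            ET.SubElement(record, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


csv_text = to_csv(records)
json_text = to_json(records)
xml_text = to_xml(records)
print(csv_text)
print(json_text)
print(xml_text)
```

CSV is the most compact but carries no type or nesting information; JSON and XML repeat the field names for every record, which costs space but makes each record self-describing.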


Spreadsheets


Relational database


NoSQL database


Raw data

• Store as much raw data as possible (actual “measurements”)
• Raw data is often atomic or transactional
• For analysis, derive the required features from raw data

• Consider: a system to analyze characteristics of successful gamers by studying the time they spend in various components
• Time spent on results
• Time spent on configuration, etc.

Question: Which data to store?
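
One possible reading of the gamer example, as a sketch: store the atomic enter/leave events as raw data and derive "time spent per component" from them afterwards. The event layout and all values below are hypothetical.

```python
from collections import defaultdict

# Hypothetical raw event log: (player, component, enter_time_s, leave_time_s).
raw_events = [
    ("p1", "results", 0, 30),
    ("p1", "configuration", 30, 90),
    ("p1", "results", 90, 100),
    ("p2", "configuration", 0, 20),
]


def time_per_component(events):
    """Derived feature: total seconds each player spent in each component."""
    totals = defaultdict(float)
    for player, component, start, end in events:
        totals[(player, component)] += end - start
    return dict(totals)


features = time_per_component(raw_events)
print(features)
```

Keeping the events rather than only the totals means other features (number of visits, order of visits, longest visit) can still be derived later without collecting the data again.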


Data warehouses and data marts

[Figure: multi-tier data warehouse architecture]
• Data Sources: operational DBs, other sources
• Extract / Transform / Load into Data Storage, with monitor & integrator
• Data Storage: data warehouse, data marts, metadata
• OLAP Engine: OLAP server, served from the data storage
• Front-End Tools: analysis, query, reports, data mining

Source: Jiawei Han


Reducing storage size

Compression
• Lossless (e.g. ZIP files): no information is lost
• Lossy (e.g. MPEG-4 compression): some details/information may be lost

Reduced resolution
• Small bit resolution (smaller number of bits)
• Maximizing dynamic range
• Delta encoding
• Huffman encoding
• Run‐length encoding
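
As an illustration of the last item, here is a minimal run-length encoding sketch: each run of equal values is stored once together with its length. The example signal is made up; the scheme pays off only when the data contains long runs.

```python
from itertools import groupby


def rle_encode(values):
    """Replace each run of equal values by a (value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out


signal = [0, 0, 0, 0, 1, 1, 0, 0, 0]
encoded = rle_encode(signal)
print(encoded)  # [(0, 4), (1, 2), (0, 3)]
```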


Delta encoding
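
A minimal sketch of delta encoding, assuming integer samples: the first value is stored as is and every later value is stored as the difference from its predecessor. For slowly varying signals the differences are small and fit in fewer bits. The readings are invented for illustration.

```python
def delta_encode(samples):
    """Keep the first sample, then store differences between neighbours."""
    if not samples:
        return []
    deltas = [samples[0]]
    for previous, current in zip(samples, samples[1:]):
        deltas.append(current - previous)
    return deltas


def delta_decode(deltas):
    """Rebuild the samples by accumulating the differences."""
    out = []
    total = 0
    for delta in deltas:
        total += delta
        out.append(total)
    return out


readings = [1000, 1002, 1003, 1003, 1001]
encoded = delta_encode(readings)
print(encoded)  # [1000, 2, 1, 0, -2]
```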


Huffman encoding
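
A compact sketch of the standard Huffman construction: repeatedly merge the two least frequent groups of symbols, prefixing one group's codes with 0 and the other's with 1, so frequent symbols end up with short codes. The function names are illustrative, and ties between equal weights are broken arbitrarily here.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Return a dict mapping each symbol to its prefix-free bit string."""
    counts = Counter(symbols)
    if not counts:
        return {}
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(weight, i, {sym: ""}) for i, (sym, weight) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, codes1 = heapq.heappop(heap)
        w2, _, codes2 = heapq.heappop(heap)
        merged = {sym: "0" + code for sym, code in codes1.items()}
        merged.update({sym: "1" + code for sym, code in codes2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def huffman_encode(symbols, code):
    return "".join(code[sym] for sym in symbols)


def huffman_decode(bits, code):
    inverse = {bit_string: sym for sym, bit_string in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix-free, so the first match is the symbol
            out.append(inverse[current])
            current = ""
    return out


message = "aaaabbc"
code = huffman_code(message)
bits = huffman_encode(message, code)
print(code, bits)
```

For this message the most frequent symbol gets a 1-bit code and the whole message takes 10 bits, compared with 14 bits for a fixed 2-bit code per symbol.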


Experiment design (which data and how to collect?)

Cross-sectional data (static)
• Population sampling
• Active learning

Longitudinal data (dynamic)
• System excitation
• Sampling frequency

Whenever possible, primary data collection, where there is control over collected data, should be preferred.


System excitation

• System should see sufficient relevant examples
• Rich in types/regions of input
• Linear systems (open loop):
• Pseudo-random binary sequence (all frequencies present in signal)
• More complex for nonlinear systems

Sampling frequency

• If the system is bandwidth limited, sample above the Nyquist rate so that the signal can be reconstructed from its samples
Nyquist rate = 2 * bandwidth
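
The sketch below illustrates why the Nyquist rate matters: a sine sampled too slowly produces exactly the same samples (up to sign) as a lower-frequency sine, so the two cannot be told apart afterwards. The helper names and the 9 Hz / 12 Hz numbers are chosen for illustration only.

```python
import math


def nyquist_rate(bandwidth_hz):
    """Nyquist rate = 2 * bandwidth."""
    return 2.0 * bandwidth_hz


def sample_sine(freq_hz, rate_hz, n_samples):
    """Samples of sin(2*pi*f*t) taken at t = n / rate_hz."""
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz) for n in range(n_samples)]


def alias_frequency(freq_hz, rate_hz):
    """Apparent frequency of a sine at freq_hz when sampled at rate_hz."""
    folded = freq_hz % rate_hz
    return min(folded, rate_hz - folded)


# A 9 Hz sine needs more than 18 samples per second.
print(nyquist_rate(9))           # 18.0
print(alias_frequency(9, 100))   # 9: sampled fast enough, frequency preserved
print(alias_frequency(9, 12))    # 3: undersampled, shows up as 3 Hz

undersampled = sample_sine(9, 12, 24)
low_frequency = sample_sine(3, 12, 24)
```

Here `undersampled` equals `low_frequency` with the sign flipped, which is why an anti-aliasing filter has to remove content above half the sampling rate before digitizing.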


Data conditioning

Modifying the (digitized) data for analysis, easy storage and/or further processing
• Decomposition (e.g. frequency analysis)
• Aggregation
• Smoothing
• Interpolation
• Normalization
• Synchronizing
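
The slides list normalization without fixing a method. As a sketch, two common choices are shown below: min-max scaling to the range [0, 1] and z-score scaling to zero mean and unit standard deviation. Both are assumptions about what is meant here, not the lecture's definition.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        raise ValueError("all values are equal; min-max scaling is undefined")
    return [(v - lowest) / (highest - lowest) for v in values]


def z_score_normalize(values):
    """Shift and scale values to zero mean and unit (population) std. deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("all values are equal; z-score scaling is undefined")
    return [(v - mu) / sigma for v in values]


values = [2.0, 4.0, 6.0, 8.0]
scaled = min_max_normalize(values)
standardized = z_score_normalize(values)
print(scaled)
print(standardized)
```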


Decomposition

Represent the signal as a superposition of constituent parts


• Fourier transform
• Z‐transform
• Wavelets
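
To illustrate decomposition with the Fourier transform, the sketch below computes a naive discrete Fourier transform of a synthetic signal and reads off its dominant frequency. The signal (5 Hz plus a weaker 20 Hz component, sampled at 100 Hz) is made up; in practice an FFT routine from a numerical library would replace the O(N²) loop.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitude of each frequency bin from 0 up to half the sampling rate."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        coefficient = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples)
        )
        magnitudes.append(abs(coefficient) / n)
    return magnitudes


def dominant_frequency(samples, rate_hz):
    """Frequency (Hz) of the strongest non-DC bin."""
    magnitudes = dft_magnitudes(samples)
    k = max(range(1, len(magnitudes)), key=lambda i: magnitudes[i])
    return k * rate_hz / len(samples)


rate_hz = 100
signal = [
    math.sin(2 * math.pi * 5 * i / rate_hz) + 0.3 * math.sin(2 * math.pi * 20 * i / rate_hz)
    for i in range(200)
]
print(dominant_frequency(signal, rate_hz))
```

With 200 samples at 100 Hz the bins are 0.5 Hz apart, so both components fall exactly on a bin and the stronger 5 Hz component is reported.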


Aggregation

Replace multiple values by a single value


• Mean
• Median
• Minimum
• Maximum
• Etc.
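
A small sketch of aggregation over fixed-size windows, with the aggregate function passed in as a parameter. The readings are invented; the value 100 is a deliberate outlier to show how the mean and the median react differently.

```python
from statistics import mean, median


def aggregate(values, window, func):
    """Replace each block of `window` values by func(block).

    The last block may be shorter if len(values) is not a multiple of window.
    """
    return [func(values[i:i + window]) for i in range(0, len(values), window)]


readings = [2, 4, 6, 100, 1, 3, 5, 7]
print(aggregate(readings, 4, mean))
print(aggregate(readings, 4, median))
print(aggregate(readings, 4, min))
print(aggregate(readings, 4, max))
```

In the first window the outlier pulls the mean to 28 while the median stays at 5, which is one reason to choose the aggregate with the later analysis in mind.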


Smoothing

Moving average filters
• p-period simple moving average (mean of last p samples)
• p-period weighted moving average (weighted mean of last p samples)
• A common smoothing filter is the exponentially weighted moving average
y’(t) = a y(t) + (1 – a) y’(t-1); 0 < a < 1

Smoothing introduces a phase shift (lag)
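
The two filters can be sketched as follows. Two details are assumptions not fixed by the slide: the simple moving average uses however many samples are available at the start of the series, and the exponentially weighted moving average is initialized with the first sample.

```python
def simple_moving_average(y, p):
    """Mean of the last p samples (fewer at the start of the series)."""
    out = []
    for t in range(len(y)):
        window = y[max(0, t - p + 1): t + 1]
        out.append(sum(window) / len(window))
    return out


def ewma(y, a):
    """y'(t) = a*y(t) + (1 - a)*y'(t-1), with y'(0) = y(0) as an assumption."""
    if not 0 < a < 1:
        raise ValueError("a must satisfy 0 < a < 1")
    out = [y[0]]
    for t in range(1, len(y)):
        out.append(a * y[t] + (1 - a) * out[-1])
    return out


step = [0, 0, 0, 1, 1, 1, 1]
print(simple_moving_average(step, 3))
print(ewma(step, 0.5))
```

Applied to a step input, both outputs reach the new level only some samples after the input does, which is the lag mentioned on the slide; a smaller `a` (or larger `p`) smooths more and lags more.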


Interpolation

Linear table lookup

Polynomial interpolation

Splines

Etc…
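
A minimal sketch of linear interpolation by table lookup: find the two tabulated points that bracket the query and interpolate on the straight line between them. It assumes the x values are sorted in strictly increasing order; the table itself is invented.

```python
from bisect import bisect_right


def linear_interpolate(xs, ys, x):
    """Linearly interpolate y at x from a table of (xs, ys); xs strictly increasing."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the table (no extrapolation)")
    i = bisect_right(xs, x) - 1  # index of the last table point with xs[i] <= x
    if i == len(xs) - 1:
        return ys[-1]
    fraction = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + fraction * (ys[i + 1] - ys[i])


# Hypothetical table with a missing sample at x = 3.
xs = [0, 1, 2, 4]
ys = [0.0, 10.0, 20.0, 40.0]
print(linear_interpolate(xs, ys, 3))
```

The same routine can be used to resample irregularly timed measurements onto a regular time grid, for example before synchronizing two data streams.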


Data synchronization


Missing values
MCAR: Missing Completely at Random
No systematic differences between missing and non-missing data

MAR: Missing at Random
Differences between missing and non-missing data, which can be explained entirely by non-missing features

MNAR: Missing Not at Random
Missing with bias (e.g. value of variable explains missing data)
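
The three mechanisms can be illustrated by simulating them on synthetic data. In this sketch the variables (`age`, `income`), the thresholds and the drop rates are all hypothetical; only `income` is ever made missing.

```python
import random


def make_missing(rows, mechanism, rng, rate=0.3):
    """Return a copy of rows in which some 'income' values are set to None."""
    out = []
    for row in rows:
        row = dict(row)
        if mechanism == "MCAR":
            # Independent of every variable.
            drop = rng.random() < rate
        elif mechanism == "MAR":
            # Depends only on an observed variable (age).
            drop = row["age"] > 60 and rng.random() < 2 * rate
        elif mechanism == "MNAR":
            # Depends on the value that goes missing (income itself).
            drop = row["income"] > 50 and rng.random() < 2 * rate
        else:
            raise ValueError("mechanism must be 'MCAR', 'MAR' or 'MNAR'")
        if drop:
            row["income"] = None
        out.append(row)
    return out


rng = random.Random(42)
rows = [{"age": rng.randint(20, 80), "income": rng.randint(10, 100)} for _ in range(200)]
mcar = make_missing(rows, "MCAR", rng)
mar = make_missing(rows, "MAR", rng)
mnar = make_missing(rows, "MNAR", rng)
```

In the MNAR case the remaining incomes are biased towards low values, and nothing in the observed data reveals this, which is why MNAR is the hardest case to handle.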


Handling Missing Values

https://hrngok.github.io/posts/missing%20values/


Example


Tremor assessment

• Tremor: rhythmic and involuntary movements of any body part, resulting from a neurological disorder

• Most common neurodegenerative diseases with tremor:
‐ Parkinson’s disease – 1 million in US, 75% with tremor
‐ Essential tremor – 10 million in US


Tremor symptoms

                     Parkinson’s disease    Essential tremor
Tremor types         Resting                Postural, kinetic
Frequency            4-6 Hz                 7-12 Hz
Presence in hands    More than 70%          95%

• Essential tremor is not life threatening, but disabling


• Treatment: Medication, or in severe cases Deep Brain Stimulation


Tremor assessment

• Rating scales – most used, biased


• QUality of life in ESsential Tremor (QUEST)
‐ Filled in by patient
‐ e.g. “I have lost interest in my hobbies
because of tremor”
• Essential Tremor Rating Scale (ETRS)
‐ Filled in by clinician
‐ e.g. “Rest tremor severity of the head”
• Computerized analyses – objective


TREMOR12
• Open-source smartphone application
• Created at Maastricht University Medical Center, The Netherlands
• Sensors:
• Accelerometer: Acceleration (in g)
• Gyroscope: Rotation speed (in radians/s)
• Sampling rate: 100 Hz


Data Collection

• Five tests, both right and left wrist
• Rest: 1 test, 1 minute
• Postural: 2 tests, 1 minute each
• Kinetic: Glass & finger-nose test, each repeated 3 times

• 20 ET patients, ETRS and QUEST scores known


Noise filtering

• Noise from tapping start and stop button
• First and last 50 samples omitted

• Voluntary movement
• Equiripple Finite Impulse Response (FIR) filter, 7 to 12 Hz


Feature extraction and selection (more on this later)


• Extracted features for each test:
• Signal strength
‐ Root‐mean‐square
• Signal period
• Dominant magnitude
‐ Welch method
‐ Daubechies 8 wavelet method
• Dominant frequency
‐ Welch method
‐ Daubechies 8 wavelet method

• Sequential forward feature selection
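
As a sketch of the simplest feature in the list, the code below computes the root-mean-square signal strength after dropping the first and last 50 samples, as described in the noise filtering step. The 8 Hz test sine is synthetic and stands in for one accelerometer axis sampled at 100 Hz; it is not data from the study.

```python
import math


def trim_edges(samples, n_edge=50):
    """Drop the first and last n_edge samples (start/stop button tapping)."""
    return samples[n_edge:len(samples) - n_edge]


def root_mean_square(samples):
    """Signal strength: square root of the mean of the squared samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


rate_hz = 100
recording = [math.sin(2 * math.pi * 8 * n / rate_hz) for n in range(600)]
trimmed = trim_edges(recording)
print(len(trimmed), root_mean_square(trimmed))
```

For a pure sine of amplitude 1 measured over whole periods the RMS is 1/√2, about 0.707, which gives a quick sanity check for the implementation.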
