A Simple PPT With Data Concepts 1703019257

Uploaded by

g0bithaasan
Copyright
© All Rights Reserved

Data analytics

From data to knowledge


• Data: Symbols (numbers, letters, etc.) that represent something in the
universe
• Information: Data classified into a form useful for answering questions,
making decisions, etc.
• Knowledge: Information applied to make decisions
• Wisdom: Judgment about which decisions are best, built by analyzing the
decisions already made
Types of analysis
• Descriptive: What happened?
  • EDA: Exploratory Data Analysis
  • BI: Business Intelligence
• Diagnostic: Why did X occur?
  • Data mining
• Predictive: What will happen?
  • Machine learning
• Prescriptive: What should be done?
  • Optimization
  • Rule-based approach
  • Reinforcement learning
Exploratory Data Analysis (EDA)
• First approach to data analysis
• Statistical sampling and data quality
• Histogram, quartiles, percentiles
• Nulls, zeros, duplicates
• Charts are usually created and metrics computed for each entity or
data set.
• Relationships within the data are analyzed, e.g. correlations.
• If there is a lot of data, profiling can be run on a sample.
• Automated tools are available: e.g. ydata-profiling, sweetviz, etc.
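As a rough illustration of the per-column metrics listed above (quartiles, a histogram, nulls, zeros, duplicates), here is a minimal sketch using only the Python standard library. The `profile` function and the `ages` column are hypothetical, invented for this example; automated profiling tools report far more than this.

```python
from collections import Counter
from statistics import quantiles

def profile(values, bins=4):
    """Return basic EDA metrics for one numeric column (None marks a null)."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    q1, q2, q3 = quantiles(present, n=4, method="inclusive")
    lo, hi = min(present), max(present)
    width = (hi - lo) / bins or 1  # fall back to 1 for a constant column
    hist = [0] * bins
    for v in present:
        # The maximum value is clamped into the last bin.
        hist[min(int((v - lo) / width), bins - 1)] += 1
    return {
        "rows": len(values),
        "nulls": len(values) - len(present),
        "zeros": counts[0],
        "duplicates": sum(c - 1 for c in counts.values()),
        "quartiles": (q1, q2, q3),
        "histogram": hist,
    }

# Hypothetical sample column, invented for illustration.
ages = [23, 35, 0, None, 35, 41, 58, 23, None, 62]
report = profile(ages)
print(report)
```

The zero and the two nulls are reported separately on purpose: a zero age is more likely a data quality problem than a real value, which is the kind of finding this first pass is meant to surface.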
Business Intelligence (BI)
• Tools for building charts, dashboards, and presentations from data
• They usually work on a multidimensional cube data structure, but can also
take tabular data, in which case the tool builds the cubes internally.
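One way to picture "building the cube internally" is pre-aggregating a measure for every combination of dimensions. The sketch below assumes that interpretation; `build_cube` and the `sales` rows are hypothetical, and real BI tools use much more efficient storage.

```python
from collections import defaultdict
from itertools import combinations

def build_cube(rows, dimensions, measure):
    """Pre-aggregate `measure` for every combination of dimension values."""
    cube = defaultdict(float)
    for row in rows:
        for size in range(len(dimensions) + 1):
            for dims in combinations(dimensions, size):
                # The key records which dimensions are fixed and to what value.
                key = tuple((d, row[d]) for d in dims)
                cube[key] += row[measure]
    return dict(cube)

# Hypothetical tabular data, invented for illustration.
sales = [
    {"region": "North", "product": "A", "amount": 100.0},
    {"region": "North", "product": "B", "amount": 50.0},
    {"region": "South", "product": "A", "amount": 70.0},
]
cube = build_cube(sales, ["region", "product"], "amount")

print(cube[()])                                         # grand total
print(cube[(("region", "North"),)])                     # one slice
print(cube[(("region", "North"), ("product", "A"))])    # one cell
```

Once the cube exists, a dashboard filter such as "region = North" becomes a dictionary lookup instead of a scan over the table.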
Data mining
• Tools for finding structural patterns in data
• Clustering tools
• Pattern search
• Sequence patterns
• Other patterns within the data
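As an example of a clustering tool, the following is a minimal sketch of k-means on one-dimensional data. The choice of k-means, the `kmeans_1d` function, the starting centers, and the data points are all assumptions made for illustration; the slides do not name a specific algorithm.

```python
def kmeans_1d(points, centers, iterations=10):
    """Alternate between assigning points to the nearest center and
    moving each center to the mean of its assigned points."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical measurements with two visible groups.
points = [1.0, 1.5, 2.0, 10.0, 11.0, 12.0]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
print(centers, clusters)
```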
Machine Learning
• Techniques based on a mathematical model with "gaps" or free spaces
(variables or parameters) that are filled in through a process of
optimization and probability.
• There are two main types:
  • Supervised: The objective is to predict one or more categories or numbers,
  e.g. predict a price or a grade.
  • Unsupervised: The objective is to learn the structure that generates the
  data, i.e. a mathematical model that "explains" the data, e.g. clustering
  (grouping) or generative AI.
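To make the idea of "gaps" concrete, here is a minimal supervised sketch: the model is a line, price = w * size + b, and the two free parameters w and b are filled in by least squares. The model choice, the `fit_line` function, and the sizes and prices are hypothetical, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Fill in the parameters w and b of y = w * x + b by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    w = covariance / variance
    b = mean_y - w * mean_x
    return w, b

# Hypothetical training data: size in square meters, price in thousands.
sizes = [50.0, 80.0, 100.0, 120.0]
prices = [110.0, 170.0, 210.0, 250.0]

w, b = fit_line(sizes, prices)
predicted = w * 90.0 + b  # predict the price of an unseen 90 m2 home
print(w, b, predicted)
```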
Deep Learning

• A type of Machine Learning that uses models based on "Neural Networks",
which are nodes connected to each other in different ways. Each node
represents a mathematical operation, such as a sum of input values multiplied
by a weight, and has an output.
• Depending on the structure and form of the connections, they are divided
into different types (convolutional, recurrent, etc.).
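A single node can be sketched in a few lines. The weighted sum is what the slide describes; the bias term and the ReLU activation are assumptions added here because most networks use something like them, and the input and weight values are invented for the example.

```python
def neuron(inputs, weights, bias=0.0):
    """One node: a weighted sum of the inputs followed by an activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    # ReLU activation (an assumption): negative sums are clipped to zero.
    return max(0.0, total)

out_positive = neuron([1.0, 2.0], [0.5, 0.25])  # 1.0 * 0.5 + 2.0 * 0.25
out_clipped = neuron([1.0, 2.0], [0.5, -1.0])   # the sum is negative
print(out_positive, out_clipped)
```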
Optimization
• Algorithms that, given a cost function, try to find the combination of
variables that makes the cost as low (or, for a reward, as high) as possible.
• They can be analytical or iterative, and some are modeled on mathematical or
biological systems.
• The best-known algorithms are:
  • Gradient descent
  • Brute force
  • Ant colony optimization
  • Particle swarm optimization

https://youtu.be/M39fO13CALY
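The first two algorithms in the list can be compared on a toy problem. This is a minimal sketch; the cost function (x - 3)^2, the learning rate, the step count, and the search grid are all assumptions chosen for illustration.

```python
def cost(x):
    """Hypothetical cost function with its minimum at x = 3."""
    return (x - 3) ** 2

def cost_gradient(x):
    """Derivative of the cost function above."""
    return 2 * (x - 3)

def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Iterative: repeatedly step against the gradient."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x

def brute_force(cost_fn, candidates):
    """Exhaustive: evaluate every candidate and keep the cheapest."""
    return min(candidates, key=cost_fn)

x_gd = gradient_descent(cost_gradient, x0=0.0)
x_bf = brute_force(cost, range(-10, 11))
print(x_gd, x_bf)
```

Brute force only finds the answer if it lies on the grid that is searched, while gradient descent needs a differentiable cost and a sensible learning rate.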
Reinforcement learning
• Technique that draws on machine learning / deep learning and
optimization.
• The objective is for an "actor" (agent) to execute the best "action"
given a "scenario" (state).
• It is used in robotics, to teach robots to perform tasks
• Also used in autonomous cars
• Famous algorithms: Q-Learning
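A minimal tabular Q-Learning sketch follows. The environment is hypothetical: a corridor of four cells where the actor moves left or right and is rewarded for reaching the last cell. To keep the run deterministic, every (scenario, action) pair is tried in fixed sweeps rather than through random exploration; that is a simplification for the example, made possible because the Q-Learning update does not depend on how actions are chosen.

```python
def q_learning(n_states=4, gamma=0.9, alpha=0.5, sweeps=50):
    """Learn action values for a corridor whose last cell is the goal."""
    actions = (-1, +1)  # move left, move right
    goal = n_states - 1
    q = {(s, a): 0.0 for s in range(n_states) for a in actions}
    for _ in range(sweeps):
        for s in range(goal):  # the goal is terminal, so it is never left
            for a in actions:
                nxt = min(max(s + a, 0), goal)  # walls at both ends
                reward = 1.0 if nxt == goal else 0.0
                best_next = 0.0 if nxt == goal else max(q[(nxt, b)] for b in actions)
                # Q-Learning update: move the estimate toward
                # reward + discounted value of the best next action.
                q[(s, a)] += alpha * (reward + gamma * best_next - q[(s, a)])
    return q

q = q_learning()
# Greedy policy: in each cell, pick the action with the highest value.
policy = {s: max((-1, +1), key=lambda a: q[(s, a)]) for s in range(3)}
print(policy)
```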
Types of Deep Learning networks
• Neural networks are classified according to the connection scheme
between their nodes. Topologies and "architectures" are built from these
schemes.
• DNN: Dense (fully connected) neural network, the most basic type: every
node in one layer is connected to every node in the next layer.
• CNN: Convolutional. Each node is connected only to a local region of its
input, mimicking a local field of view, through convolutional layers and filters.
• RNN: Recurrent. Nodes feed their earlier output back in, storing memory of
previous data; useful for time series.
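The three connection schemes can be contrasted with toy layers. This is a simplified sketch: it omits biases and activations, the recurrent layer uses a fixed `decay` factor instead of learned weights, and all numbers are invented for illustration. The `conv1d` function slides the filter without flipping it, which is the usual deep learning convention.

```python
def dense(inputs, weights):
    """Dense: every output node sees every input node."""
    return [sum(x * w for x, w in zip(inputs, row)) for row in weights]

def conv1d(signal, kernel):
    """Convolutional: each output sees only a local window of the input."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def recurrent(sequence, decay=0.5):
    """Recurrent: the state h carries memory of earlier inputs."""
    h, states = 0.0, []
    for x in sequence:
        h = decay * h + x
        states.append(h)
    return states

dense_out = dense([1.0, 2.0, 3.0], [[1.0, 0.0, 0.0], [0.5, 0.5, 0.5]])
conv_out = conv1d([1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 0.0, -1.0])
rnn_out = recurrent([1.0, 1.0, 1.0])
print(dense_out, conv_out, rnn_out)
```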
Most famous architectures in Deep Learning
• Transformers: Networks that compute an attention weight between every pair
of elements in a sequence. They are used for "sequence-to-sequence" tasks
such as text generation and translation, music generation, etc.
• U-Net: Convolutional network with an encoder and a decoder, used for
image segmentation or image generation (diffusion models usually
use a U-Net).
• Autoencoder: Trained to reproduce its input at its output; used to compress
information and discard noisy data / outliers.
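The attention weights mentioned for Transformers can be sketched as scaled dot-product self-attention. This is a deliberately reduced version: queries, keys, and values are all the raw input vectors, with none of the learned projections a real Transformer has, and the three-element sequence is invented for the example.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Return the attention weight matrix and the mixed output vectors."""
    d = len(vectors[0])
    weights = []
    for query in vectors:
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(d)
                  for key in vectors]
        weights.append(softmax(scores))
    # Each output is a weighted average of all the input vectors.
    outputs = [[sum(w * v[j] for w, v in zip(row, vectors)) for j in range(d)]
               for row in weights]
    return weights, outputs

# Hypothetical sequence: elements 0 and 2 are identical.
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
weights, outputs = self_attention(sequence)
print(weights[0])
```

Element 0 attends equally to itself and to element 2, and less to element 1, because the weights depend on similarity rather than on position.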
Where to store data
Data models
• Conceptual
  • Also known as domain models, they provide a general picture of what the
  system will contain, how it will be organized, and what business rules apply.
  Conceptual models are typically created as part of the initial requirements
  gathering process for the project. They typically include entity classes (which
  define the types of things that are important to represent the business in the
  data model), their characteristics and constraints, the relationships between
  them, and the relevant data security and integrity requirements.
• Logical
  • They are less abstract and provide more details about the concepts and
  relationships in the domain under consideration. They follow one of the
  formal data modeling notation systems. They indicate data attributes, such as
  data types and their corresponding lengths, and show the relationships
  between entities. Logical data models do not specify any technical system
  requirements.
• Physical
  • They are the least abstract of the three. They describe how the data will be
  physically stored in a specific database system: tables, columns and their
  data types, keys, indexes, and other engine-specific details.
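The step from a logical to a physical model can be sketched in code. In this hypothetical example the logical model is written as Python dataclasses (entities, attributes, types, relationships, with no database engine named), and the physical model is the same design expressed as SQLite tables, keys, and an index. The `Customer` and `Order` entities and all column names are invented for illustration.

```python
import sqlite3
from dataclasses import dataclass

# Logical model: entities, attributes, and relationships only.
@dataclass
class Customer:
    customer_id: int
    name: str

@dataclass
class Order:
    order_id: int
    customer_id: int  # relationship: each order belongs to one customer
    total: float

# Physical model: the same entities for one concrete engine (SQLite),
# including storage types, keys, and an index.
DDL = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    total       REAL NOT NULL
);
CREATE INDEX idx_order_customer ON customer_order(customer_id);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
tables = sorted(row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"))
print(tables)
```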
Frameworks
• TOGAF
  • Business architecture, which defines the organizational structure, business
  strategy and processes of the company.
  • Data architecture, which describes the physical, logical and conceptual data
  assets and how they are stored and managed throughout the lifecycle.
  • Application architecture, which represents the application systems, and how
  they relate to the main business processes and to each other.
  • Technical architecture, which describes the technological infrastructure
  (hardware, software and networks) required to support the most important
  applications.
  • Iterate through the different phases
• DAMA
  • DAMA International, originally founded as Data Management Association
  International, is a non-profit organization dedicated to the advancement of
  data and information management. Its data management body of knowledge,
  DAMA-DMBOK 2, covers data architecture, as well as data governance and
  ethics, data modeling and design, storage, security, and integration.
  • Prioritize dimensions
  • Evaluate each dimension by each data process
• Zachman
Data management systems
• Data warehouse: Structured data, e.g. relational, with a star or snowflake schema
• Data lake: Unstructured data + catalog
• Data lakehouse: Unstructured + structured data + catalog + governance
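A star model places one fact table at the center and joins it to small dimension tables. The sketch below builds a tiny one in an in-memory SQLite database; the table names, columns, and sales figures are hypothetical, and a real warehouse would run on a dedicated engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER);
-- Fact table: measures plus one foreign key per dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    amount     REAL
);
INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Games');
INSERT INTO dim_date VALUES (1, 2023), (2, 2024);
INSERT INTO fact_sales VALUES (1, 1, 10.0), (1, 2, 15.0), (2, 2, 30.0);
""")

# A typical analytical query: aggregate the facts by dimension attributes.
rows = conn.execute("""
    SELECT p.category, d.year, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_date AS d ON d.date_id = f.date_id
    GROUP BY p.category, d.year
    ORDER BY p.category, d.year
""").fetchall()
print(rows)
```

In a snowflake model the dimension tables would themselves be split further, for example a separate category table referenced by `dim_product`.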
Data architectures
• Data fabric
  • Centralized architecture
  • The data are on a "path" from the origins to the consumers of analytics
• Data mesh
Advantages of data management
• Reduction of redundancy
• Improved data quality
• Enabling integration
• Data lifecycle management
• Others
