Glossary
Welcome! This alphabetized glossary contains many terms used in this course. Understanding these
terms is essential when working in the industry, participating in user groups, and pursuing other
certificate programs.
Apache Airflow An open-source workflow management platform for data engineering pipelines.
Apache Beam An open-source, unified programming model for batch and streaming data
processing pipelines.
Apache HBase A non-relational database that runs on Hadoop, providing real-time access to large
data sets.
Apache Kafka An open-source software platform used to handle real-time data feeds.
Apache Storm A framework for distributed stream processing computation primarily written in the
Clojure programming language.
Apache Spark Streaming An extension of the core Spark API that allows for fault-tolerant
stream processing of live data streams with high throughput and scalability.
BeautifulSoup A Python library to get data out of HTML, XML, and other markup languages.
Big data stores A larger, more complex data set, especially from new data sources.
Big data A dynamic, large, and disparate volume of data being created by people, tools, and
machines.
Cloudant A fully managed, distributed database optimized for heavy workloads and fast-
growing web and mobile apps.
Comma-separated values (CSV) A text-formatted file that uses commas to separate values.
Conceptual data model The model, created by business stakeholders and data architects, defines the
system's scope, concepts, and rules.
CouchDB An open-source NoSQL document database that collects and stores data in JSON-
based document formats. Unlike relational databases, CouchDB uses a schema-free data model,
which simplifies record management across various computing devices, mobile phones, and web
browsers.
Customer relationship management (CRM) software Software that helps companies measure and
control their lead generation and sales pipelines.
Data abstraction The process of simplifying a set of data to represent the whole.
Data analyst A data professional who first gathers and understands the data, then analyzes and
interprets it before visualizing it and, finally, weaving it into a story.
Data analytics Focuses on extracting valuable information from data using various tools,
techniques, processes, and algorithms. It includes data analysis and the interpretation of the results,
keeping in mind specific business objectives.
Data fabric An architecture that facilitates the end-to-end integration of various data pipelines
and cloud environments through intelligent and automated systems.
Data integration The combination of technical and business processes that are used to
combine data from disparate sources into meaningful and valuable information.
Data lakes A centralized repository designed to store, process, and secure large amounts of
structured, semistructured, and unstructured data. It can store data in its native format and process
any variety of it, regardless of size limits.
Data marts Data warehouses are segmented into smaller subsets, known as data marts. These
data marts are designed to manage specific business functions, departments, or subject areas. By
doing so, data marts make it easier for a defined group of users to access specific data, enabling
them to quickly find crucial insights without wasting time searching through an entire data
warehouse.
Data modeling Creating a visual representation of either a whole information system or parts of it to
communicate connections between data points and structures.
Data repository Data sets isolated to be mined for reporting and analysis. It is also known as a data
archive or library.
Data science Process that focuses on understanding the data. This involves data analysis,
beginning with data loading, exploring, and cleaning. It creatively explores data, coming up with new
solutions and inventions.
Data source The physical or digital location where the data is held in a data table, object, or other
storage format.
Data streams The process of transmitting continuous data and feeding it into stream processing
software to derive valuable insights.
Data visualization The graphical representation of information and data. It helps users understand
trends, outliers, and patterns in data.
Data warehouses A storage architecture that pulls data from many sources into a single data
repository for sophisticated analytics and decision support.
Database as a service A cloud-computing service that allows users to access and use a cloud
database system without purchasing and setting up their own hardware, installing their own
database software, or managing the database themselves.
Database Management System (DBMS) Software to store and retrieve users' data by considering the
security of their information.
Denodo A unified virtual data layer that allows enterprise users to access data across formats,
protocols, and locations using techniques like search.
DocumentDB A NoSQL database service that supports document data structures with some
MongoDB 3.6 and 4.0 compatibility.
Enterprise resource planning (ERP) systems A type of software system that enables businesses
to automate and efficiently manage their key business processes to gain optimal performance.
Entity-relationship model (E-R model) A high-level data model created to define the data elements
and their relationships for a specific system. It provides a conceptual design for the database and
presents a simple, easy-to-understand view of the data.
Extract, transform, load (ETL) process A process that extracts, transforms, and loads data from
multiple sources into a data warehouse or other unified data repository.
Flat files A collection of data stored in a two-dimensional database. It usually contains a series of
records (or lines), where each record is a sequence of fields.
Global Positioning Systems (GPS) A radio navigation system that accurately determines
location, time, and velocity regardless of weather conditions.
Hadoop A collection of tools that provides distributed storage and processing of big data.
Hadoop Distributed File System (HDFS) A storage system for big data that runs on multiple
commodity hardware devices connected through a network. HDFS provides scalable and reliable big
data storage by partitioning files over multiple nodes.
Hierarchical model A data model in which the data are organized into a tree-like structure.
Hive A data warehouse for data query and analysis built on top of Hadoop.
Java A programming language known for its platform independence, which allows Java programs
to run on different operating systems without modification.
JavaScript object notation (JSON) An open standard file format that uses readable text to store
and transmit data objects consisting of attributes.
Logical data model Provides detailed descriptions of data elements and is utilized to create
visual representations of data entities, attributes, keys, and relationships.
Network model A database model conceived as a flexible way of representing objects and their
relationships.
NoSQL database A non-tabular database that stores data with different data storage tables
than relational tables.
Online analytical processing (OLAP) Software that is used to conduct multidimensional analysis
on large volumes of data from a data warehouse, data mart, or other centralized data store.
Online Transaction Processing (OLTP) A computerized system that allows real-time data processing
and immediate response to users' queries.
Oracle Cloud A cloud platform that offers complete cloud application suites across software as a
service (SaaS), platform as a service (PaaS), and infrastructure as a service (IaaS).
Oracle database A multi-model database management system generally used for online transaction
processing (OLTP), data warehousing, and mixed workloads.
Physical data model A database-specific model that represents relational data objects (for
example, tables, columns, primary and foreign keys) and their relationships.
Platform as a service A cloud computing model that provides customers with a complete cloud
platform (hardware, software, and infrastructure) for developing, running, and managing applications
without the cost, complexity, and inflexibility that often come with building and maintaining that
platform on-premises.
PostgreSQL An open-source database that has a strong reputation for its reliability, flexibility, and
support of open technical standards.
PowerShell A cross-platform command-line shell and scripting language designed for automating
tasks and managing configurations.
Python An agile, dynamically typed, expressive, open-source programming language that supports
multiple programming philosophies, including procedural, object-oriented, and functional. Python is
a popular high-level programming language that is easily extensible through the use of third-party
packages and often allows powerful functions to be written with a few lines of code.
Radio Frequency Identification (RFID) tags Tags that use radio-frequency signals to identify and track goods.
Relational Database Service (RDS) Organizes data into rows and columns, which collectively
form a table. Data is typically structured across multiple tables, which can be joined together via a
primary key or a foreign key.
Relational model An approach to managing data using a structure and language consistent
with first-order predicate logic.
Scala A programming language designed for concise, elegant, and type-safe expression of
programming patterns. This language seamlessly integrates object-oriented and functional features.
Scrapy A free and open-source web-crawling framework written in Python and developed in
Cambuslang.
Spark A distributed data analytics framework designed to perform complex data analytics in real-
time.
Statistical Analysis System (SAS) A programming language that provides all the tools necessary to
read, write, and create system files, SAS databases, and reports.
Structured data Data that conforms to a defined structure, follows a consistent order, and is easily
accessible to people or computer programs.
Talend Open Studio A free, open-source ETL tool for data integration and big data.
Unstructured data Typically categorized as qualitative data that cannot be processed and
analyzed via conventional data tools and methods.
Velocity A tool to provide insights to the business about how well software delivery is working and
where to focus new processes, resources, or more automation.
Veracity The term "Veracity" was coined by IBM to describe the challenges of managing data from
disparate sources, which can be inconsistent and unreliable.
Web scraping A technique used to collect online content and data, which is generally saved in a
local file so it can be manipulated and analyzed as needed (see the short sketch after this glossary).
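As referenced in the web scraping entry above, a minimal sketch of pulling data out of an HTML page with the BeautifulSoup library might look like the following. The HTML snippet, heading, and link are placeholders chosen purely for illustration.

```python
from bs4 import BeautifulSoup

# Placeholder HTML; in practice the page would be fetched first (e.g. with requests)
html = "<html><body><h1>Quarterly report</h1><a href='/q1.csv'>Q1 data</a></body></html>"
soup = BeautifulSoup(html, "html.parser")

print(soup.h1.text)               # heading text: "Quarterly report"
for link in soup.find_all("a"):   # every anchor tag on the page
    print(link.get("href"))       # link target: "/q1.csv"
```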
Learning Objective:
Define and distinguish between descriptive, diagnostic, predictive, and prescriptive analytics.
You have learned about the four types of analytics: descriptive analytics, diagnostic analytics,
predictive analytics, and prescriptive analytics. Now, let's look at the differences between them.
Descriptive analytics
Descriptive analytics is the first and most basic step in business intelligence. It involves summarizing
raw data using data mining and aggregation techniques, such as measures of distribution (frequency
or count), measures of central tendency (mean, median, mode), and measures of variability
(variance and standard deviation) to reveal trends. For instance, it helps you identify the customer
segment responsible for generating the highest revenue for your product.
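As a rough illustration, the summary measures named above can be computed with Python's standard library; the revenue figures and segment labels in this sketch are invented.

```python
from collections import Counter
import statistics

# Hypothetical order-level revenue figures and customer-segment labels
revenue_by_order = [120, 95, 120, 240, 180, 95, 310, 120]
segments = ["retail", "retail", "online", "online", "online", "retail", "online", "retail"]

# Measure of distribution: frequency/count of orders per segment
print(Counter(segments))

# Measures of central tendency
print(statistics.mean(revenue_by_order))
print(statistics.median(revenue_by_order))
print(statistics.mode(revenue_by_order))

# Measures of variability
print(statistics.variance(revenue_by_order))
print(statistics.stdev(revenue_by_order))
```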
Diagnostic analytics
Moving on, diagnostic analytics delves into the "why." It analyzes the data and correlates it with
other data sets to uncover the underlying reasons for trends identified through descriptive analytics.
Some techniques include probability theory, time-series analysis, filtering, and regression analysis.
For example, diagnostic analytics identifies specific features of the customer segment that contribute
to their product purchases.
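A minimal sketch of that kind of correlation and regression check, using invented numbers for a hypothetical second data set (marketing emails sent to the segment):

```python
import numpy as np

# Hypothetical figures: marketing emails sent vs. purchases by a segment
emails_sent = np.array([10, 20, 30, 40, 50, 60])
purchases = np.array([4, 9, 11, 18, 21, 26])

# Correlation between the two series (close to 1 suggests a strong relationship)
correlation = np.corrcoef(emails_sent, purchases)[0, 1]

# Simple linear regression: purchases ~= slope * emails_sent + intercept
slope, intercept = np.polyfit(emails_sent, purchases, deg=1)

print(f"correlation={correlation:.2f}, slope={slope:.2f}, intercept={intercept:.2f}")
```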
Predictive analytics
Predictive analytics uses statistical modeling, data mining, and machine learning to analyze large
volumes of data to forecast what will happen in the future. It uses historical data to calculate
upcoming trends. For example, predictive analytics can project the expected demand each customer
segment will generate in the upcoming quarter.
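As a simple illustration, the sketch below fits a linear trend to invented historical demand figures and extrapolates it one quarter ahead; real predictive models are usually far richer than this.

```python
import numpy as np

# Hypothetical demand (units sold) over the past eight quarters
quarters = np.array([1, 2, 3, 4, 5, 6, 7, 8])
demand = np.array([200, 210, 230, 236, 255, 270, 284, 300])

# Fit a simple linear trend to the historical data
slope, intercept = np.polyfit(quarters, demand, deg=1)

# Project the trend one quarter ahead (quarter 9)
forecast_next_quarter = slope * 9 + intercept
print(f"Forecasted demand for the upcoming quarter: {forecast_next_quarter:.0f} units")
```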
Prescriptive analytics
Lastly, prescriptive analytics goes beyond descriptive, diagnostic, and predictive analytics to
recommend the next course of action based on predictions. It harnesses data optimization,
simulation, and decision analysis methods using artificial intelligence or machine learning techniques
to determine the best solution considering several data points. Prescriptive analytics can recommend
the optimal price and marketing strategies based on the forecasted demand and other contributing
factors.
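A toy sketch of that idea, simulating revenue across candidate prices under an assumed (entirely made-up) price-sensitivity model and recommending the best option:

```python
# Forecasted baseline demand, e.g. taken from the predictive step above
forecast_base_demand = 300

# Candidate prices to evaluate
candidate_prices = [10, 12, 14, 16, 18, 20]

def simulate_demand(price, base=forecast_base_demand):
    """Made-up price-sensitivity model: each dollar above $10 cuts demand by 5%."""
    return base * (1 - 0.05 * (price - 10))

# Simulate revenue for each option and recommend the price that maximizes it
expected_revenue = {price: price * simulate_demand(price) for price in candidate_prices}
best_price = max(expected_revenue, key=expected_revenue.get)
print(f"Recommended price: ${best_price} "
      f"(expected revenue: {expected_revenue[best_price]:.0f})")
```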
The objectives of the four types of analytics, the techniques used in each case, and examples of their
applications are summarized in the table below.
Key objectives
Descriptive: To present what happened in an actionable format
Diagnostic: To delve into why something happened
Predictive: To forecast what will happen
Prescriptive: To recommend the next course of action
Techniques
Descriptive: Data mining and aggregation, such as measures of distribution (frequency or count), measures of central tendency (mean, median, mode), and measures of variability (variance and standard deviation)
Diagnostic: Correlation with other data sets using techniques like probability theory, time-series analysis, filtering, and regression analysis
Predictive: Statistical modeling, data mining, and machine learning
Prescriptive: Data optimization, simulation, and decision analysis using artificial intelligence or machine learning techniques
Examples
Descriptive: Identifying the customer segment that generates the highest revenue for a product
Diagnostic: Identifying the specific features of that customer segment that contribute to their purchases
Predictive: Projecting the demand each customer segment is expected to generate in the upcoming quarter
Prescriptive: Recommending the optimal price and marketing strategies based on the forecasted demand and other contributing factors
In this reading, you have learned the differences between descriptive analytics, diagnostic analytics,
predictive analytics, and prescriptive analytics.
Diagnostic analytics correlates the data with other data sets to find the reason for the trends.
Prescriptive analytics goes beyond descriptive, diagnostic, and predictive analytics to recommend the
next course of action based on various data points.
Learning Objective:
Describe the key business intelligence or BI components that make up its process.
You have learned about the key components of a business intelligence (BI) system, including data
sources and integration, data warehousing, data analysis and mining, reporting and visualization
systems, software technologies, and advanced analytics. While these form the essential architecture
of the BI system, there are other factors that contribute to the success of a BI system. Let's delve into
what comprises the BI ecosystem.
The BI ecosystem or BI environment comprises four key elements: data, people, processes, and
technologies. While creating a BI strategy, one should consider all four elements. Let's deep dive into
each of these elements and understand their role in the BI ecosystem.
Data
Data is the most important element of the ecosystem, as it is the raw material for analytics and
reporting. Data can originate from both internal and external sources. Internal data includes human
resources data, customer data, financial data, and website-related data. External data includes
publicly available sources such as market trends, customer demographics, and financial trends.
Collecting the right data is crucial to getting actionable insights that help decision-making.
People
People involved in BI include data analysts responsible for sourcing and processing data and users
who query the data to generate the required reports to make informed decisions. Each individual
involved must have the skills required for their role.
Processes
The processes in the BI system must be designed to fulfill the requirements of the business and
produce accurate results. The BI architecture typically consists of data collection, data integration
and management, data analysis, and data reporting and visualization processes.
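As a loose illustration only, those four process stages could be chained into a single pipeline as sketched below; every function name, field, and figure is a hypothetical placeholder rather than part of any specific BI product.

```python
# A minimal sketch of the four BI process stages chained into one pipeline.
# Function names, fields, and figures are hypothetical placeholders.

def collect():
    """Data collection: pull raw records from internal and external sources."""
    return [{"segment": "Online ", "revenue": 120}, {"segment": "retail", "revenue": 95}]

def integrate(records):
    """Data integration and management: clean and standardize the raw records."""
    return [{**r, "segment": r["segment"].strip().lower()} for r in records]

def analyze(records):
    """Data analysis: aggregate revenue by customer segment."""
    totals = {}
    for r in records:
        totals[r["segment"]] = totals.get(r["segment"], 0) + r["revenue"]
    return totals

def report(totals):
    """Data reporting and visualization: here, just a plain-text summary."""
    for segment, revenue in sorted(totals.items()):
        print(f"{segment}: {revenue}")

report(analyze(integrate(collect())))
```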
Technologies
Technologies run the BI processes. The technology stack for BI must be carefully chosen to align with
the purpose of the BI system and always be kept up to date. It should be capable of handling high
volumes of complex data. Components such as data mining, extract, transform, and load (ETL),
analytics, and reporting software should be seamlessly integrated to produce the intended
outcomes.
Summary
The BI ecosystem or environment comprises four key elements: data, people, processes, and
technologies.
Data from both internal and external sources form the raw material for analytics and reporting.
All the people involved, both those responsible for creating and maintaining the system and those
using it, are key to making the BI system work.
Processes need to be designed to cater to the requirements of the business and yield accurate
results.
The technologies selected must align with the BI system's purpose and must be kept up to date.
Learning Objective:
Evaluate and compare different business intelligence tools and technologies used in the BI analyst
ecosystem.
You have learned about the categories of tools used in a BI system. One of the most popular types of
tools is a dashboard and visualization tool. These tools seamlessly integrate with the broader BI
architecture to enable advanced analytics, data visualization, and reporting functions.
We'll now delve into the features, pros, and cons of four of these tools: IBM Cognos Analytics,
Tableau, Power BI, and Looker.
IBM Cognos Analytics
Features:
Advanced analytics, customization, scalable distribution, and scheduling abilities to meet business
goals
Pros:
Cons:
Tableau
Features:
User-friendly interface that allows users to create intuitive visualizations and interactive dashboards
Drag-and-drop functionality that makes it easy for users to explore data and gain insights
Advanced analytics capabilities that support complex calculations, statistical analysis, and forecasting
Built-in functions and integration with R and Python for advanced analytics
Seamless integration capabilities with various data sources, including databases, spreadsheets, cloud
services, and web connectors
Pros:
Cons:
Steep learning curve for navigating the advanced features and complex data models
Power BI
Features:
Seamless integration with Microsoft products such as Excel, SharePoint, Teams, Power Automate,
and Azure
Ability to leverage existing data sources and collaborate within the Microsoft ecosystem
Question-and-answer feature enabling interaction using natural language to get instant answers
through visualizations
Access to reports and dashboards on the go through Power BI Mobile apps for iOS and Android
devices
Collection of pre-trained machine learning models enhancing your data preparation efforts
Pros:
Cons:
Looker
Looker is a cloud-based self-service data visualization and exploration tool from Google.
Features:
Cloud-based tool
Powerful data modeling layer allowing users to define relationships between different datasets and
create reusable metrics and dimensions
Data visualization integrations into applications or websites
Granular access controls, auditing capabilities, and integration with single sign-on providers, thus
securing sensitive data and making it accessible only to authorized users
Pros:
Model customization and complex calculations through LookML, or Looker Modeling language
Detailed user-interaction history through granular access control and auditing capabilities
Cons:
Summary
In this reading, you have learned about the features, pros, and cons of four dashboard and
visualization tools.
IBM Cognos Analytics is a cloud-based AI-powered tool with self-service and geospatial capabilities.
Tableau's user-friendly interface allows users to create intuitive visualizations and interactive
dashboards.
Power BI seamlessly integrates with the Microsoft ecosystem and is popular among MS Office users.
Looker is a cloud-based tool that allows robust customization and the addition of complex
calculations.