27/02/2024, 18:58 about:blank
Data Literacy for Data Science Lesson Glossary
Welcome! This alphabetized glossary contains many of the terms in this course. These terms are important for you to recognize when working in the industry, participating in user groups, and participating in other certificate programs.
| Term | Definition | Video where the term is introduced |
|---|---|---|
| ACID-compliance | Ensuring data accuracy and consistency through Atomicity, Consistency, Isolation, and Durability (ACID) in database transactions. | Relational Database Management System |
| Cloud-based Integration Platform as a Service (iPaaS) | Cloud-hosted integration platforms that offer integration services through virtual private clouds or hybrid cloud models, providing scalability and flexibility. | Data Integration Platforms |
| Column-based | A type of NoSQL database that organizes data in cells grouped as columns, often used for systems requiring high write request volume and storage of time-series or IoT data. | NoSQL Database |
| Data at rest | Data that is stored and not actively in motion, typically residing in a database or storage system for various purposes, including backup. | Considerations for Choice of Data Repository |
| Data integration | A discipline involving practices, architectural techniques, and tools that enable organizations to ingest, transform, combine, and provision data across various data types, used for purposes such as data consistency, master data management, data sharing, and data migration. | Data Integration Platforms |
| Data lake | A data repository for storing large volumes of structured, semi-structured, and unstructured data in its native format, facilitating agile data exploration and analysis. | Data Marts, Data Lakes, ETL, and Data Pipelines |
| Data mart | A subset of a data warehouse designed for specific business functions or user communities, providing isolated security and performance for focused analytics. | Data Marts, Data Lakes, ETL, and Data Pipelines |
| Data pipeline | A comprehensive data movement process that covers the entire journey of data from source systems to destination systems, which includes data integration as a key component. | Data Integration Platforms |
| Data repository | A general term referring to data that has been collected, organized, and isolated for business operations or data analysis. It can include databases, data warehouses, and big data stores. | Data Collection and Organization |
| Data warehouse | A central repository that consolidates data from various sources through the Extract, Transform, and Load (ETL) process, making it accessible for analytics and business intelligence. | Data Collection and Organization |
| Document-based | A type of NoSQL database that stores each record and its associated data within a single document, allowing flexible indexing, ad hoc queries, and analytics over collections of documents. | NoSQL Database |
| ETL process | The Extract, Transform, and Load process for data integration involves extracting data from various sources, transforming it into a usable format, and loading it into a repository. | Data Marts, Data Lakes, ETL, and Data Pipelines |
| Graph-based | A type of NoSQL database that uses a graphical model to represent and store data, ideal for visualizing, analyzing, and discovering connections between interconnected data points. | NoSQL Database |
| Key-value store | A type of NoSQL database where data is stored as key-value pairs, with the key serving as a unique identifier and the value containing data, which can be simple or complex. | NoSQL |
| Portability | The capability of data integration tools to be used in various environments, including single-cloud, multi-cloud, or hybrid-cloud scenarios, providing flexibility in deployment options. | Data Integration Platforms |
| Pre-built connectors | Cataloged connectors and adapters that simplify connecting and building integration flows with diverse data sources like databases, flat files, social media, APIs, CRM, and ERP applications. | Data Integration Platforms |
| Relational databases (RDBMSes) | Databases that organize data into a tabular format with rows and columns, following a well-defined structure and schema. | Data Collection and Organization |
| Scalability | The ability of a data repository to grow and expand its capacity to handle increasing data volumes and workload demands over time. | Considerations for Choice of Data Repository |
| Schema | The predefined structure that describes the organization and format of data within a database, indicating the types of data allowed and their relationships. | Considerations for Choice of Data Repository |
| Streaming data | Data that is continuously generated and transmitted in real time, requiring specialized handling and processing to capture and analyze. | Considerations for Choice of Data Repository |
| Use cases for relational databases | Applications such as Online Transaction Processing (OLTP), Data Warehouses (OLAP), and IoT solutions where relational databases excel. | Relational Database Management System |
| Vendor lock-in | A situation where a user becomes dependent on a specific vendor’s technologies and solutions, making it challenging to switch to other platforms. | Data Integration Platforms |
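
The ETL process and ACID-compliance entries can be illustrated with a minimal sketch. It assumes Python's built-in `csv` and `sqlite3` modules stand in for a source system and a relational repository. The `orders` table, the sample rows, and the helper names (`make_repository`, `extract`, `transform`, `load`, `count_orders`) are hypothetical and do not come from the course.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for an export from a source system.
RAW_CSV = "order_id,customer,amount\n1, Alice ,10.50\n2,BOB,20\n"


def make_repository():
    """Create an in-memory SQLite database to act as the target repository."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    return conn


def extract(text):
    """Extract: read raw rows from the source (here, a CSV string)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim and lowercase names, convert ids and amounts to numbers."""
    return [
        (int(row["order_id"]), row["customer"].strip().lower(), float(row["amount"]))
        for row in rows
    ]


def load(conn, records):
    """Load: insert every record in a single transaction.

    Using the connection as a context manager commits on success and rolls
    back if any insert fails, so a batch is stored completely or not at all.
    """
    with conn:
        conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)


def count_orders(conn):
    """Return the number of rows currently stored in the orders table."""
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]


if __name__ == "__main__":
    repo = make_repository()
    load(repo, transform(extract(RAW_CSV)))
    print(count_orders(repo))

    # This batch reuses order_id 1, which violates the primary key.
    try:
        load(repo, [(3, "carol", 5.0), (1, "duplicate", 1.0)])
    except sqlite3.IntegrityError:
        print("batch rejected")
    print(count_orders(repo))
```

In this sketch the second batch fails on its duplicate key, and the rollback also discards the valid row that was inserted just before the failure. That all-or-nothing behavior is the atomicity part of ACID; the primary key constraint is one small example of the consistency part.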