Data in Enterprise End Term Cheat Sheet
Compiled by: Tanisha Khandelwal, Aryan Patel, Suzy Paladiya, Yashvi Patel, Neha Bhansali & Vanshika Modi
Module 1
1. What is data? Explain structured and unstructured data types with examples
Answer - Data refers to information in the form of facts, statistics, or raw observations. There are
two types of data: structured and unstructured. Structured data is formatted to a predefined
structure before it is placed in data storage. Unstructured data is stored in its native format and
is not processed until it is used. Examples of structured data include tables in relational
databases; examples of unstructured data include media and entertainment data.
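The difference can be illustrated with a minimal Python sketch. The records, the note text, and the helper names (customers_in_city, mentions) are hypothetical and chosen only for illustration; the point is that structured data can be queried by field, while unstructured data has to be processed before it yields anything.

```python
import re

# Structured: every record follows the same predefined fields (like a table row).
customers = [
    {"id": 1, "name": "Asha", "city": "Pune"},
    {"id": 2, "name": "Ravi", "city": "Surat"},
]

# Unstructured: free text kept in its native form, with no fixed fields.
support_note = "Customer Ravi from Surat called about a late delivery."


def customers_in_city(rows, city):
    """Query structured data directly by field name."""
    return [row["name"] for row in rows if row["city"] == city]


def mentions(text, word):
    """Unstructured text must be processed (here, a whole-word search) before use."""
    return re.search(rf"\b{re.escape(word)}\b", text) is not None


print(customers_in_city(customers, "Surat"))
print(mentions(support_note, "Surat"))
```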
Unsupervised Learning:
Deals with unlabeled data, aiming to find patterns, structures, or groupings within the data.
Clustering and dimensionality reduction are common techniques.
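As a rough illustration of clustering on unlabeled data, here is a minimal one-dimensional k-means sketch in plain Python. The function name kmeans_1d, the evenly spaced starting centers, and the sample spend values are assumptions made for this example, not something taken from the course material.

```python
def kmeans_1d(values, k, iterations=20):
    """Group unlabeled numbers into k clusters (a plain k-means sketch)."""
    # Assumption: start from k evenly spaced points of the sorted data.
    ordered = sorted(values)
    step = max(1, len(ordered) // k)
    centers = [ordered[min(i * step, len(ordered) - 1)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        # Assign every value to its nearest center.
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return clusters


# Hypothetical monthly spend values with no labels attached.
spend = [10, 12, 11, 95, 100, 98]
groups = kmeans_1d(spend, k=2)
print(groups)
```

No labels are supplied; the two groups emerge only from how close the values are to each other.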
Discretization divides the range of a continuous attribute into intervals.
Interval labels can then be used to replace the actual data values.
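A minimal sketch of discretization, assuming equal-width intervals (one of several possible binning schemes). The function name discretize and the sample ages are hypothetical.

```python
def discretize(values, bins):
    """Replace each value with the label of the equal-width interval it falls in."""
    low, high = min(values), max(values)
    width = (high - low) / bins
    if width == 0:
        # All values are identical, so a single interval covers them.
        return [f"[{low:g}, {high:g}]"] * len(values)
    labels = []
    for v in values:
        # The maximum value belongs to the last interval, not a new one.
        index = min(int((v - low) / width), bins - 1)
        start = low + index * width
        closing = "]" if index == bins - 1 else ")"
        labels.append(f"[{start:g}, {start + width:g}{closing}")
    return labels


ages = [20, 25, 38, 41, 50]
age_labels = discretize(ages, 3)
print(age_labels)
```

Each age is replaced by an interval label such as "[20, 30)", which is the substitution described above.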
● Cleaning Data:
Example: Replacing missing values with a default value or removing rows with missing
data.
● Transforming Data:
Example: Converting data types, such as converting a string date to a datetime object.
● Handling Missing Values:
Identify and handle missing values, which can be represented as NaN, NULL, or other
placeholders.
Options include removing rows with missing data, imputing missing values with means,
medians, or modes, or using more advanced imputation techniques.
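● Removing Duplicates:
Identify duplicate records and remove them so that each record appears only once.
● Handling Outliers:
Detect and handle outliers in the data, either by removing them or transforming them.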
● Standardizing Data:
Data transformation typically involves converting a raw data source into a cleansed, validated, and
ready-to-use format. It is crucial to data management processes that include
data integration, data migration, data warehousing, and data preparation, and it is a critical
component for any organization seeking to leverage its data to generate timely business insights.
As the volume of data has grown, organizations need an efficient way to harness data
and put it to business use. Data transformation is one element of harnessing this data
because, when done properly, it ensures that data is easy to access, consistent, secure, and
ultimately trusted by the intended business users.
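The cleaning and transformation steps listed above can be sketched in plain Python. Everything here is illustrative: the records, the function name clean, the choice of median imputation, and the outlier rule (more than 5 median absolute deviations from the median) are assumptions, not a prescribed method.

```python
from datetime import datetime
from statistics import median

# Hypothetical raw records: string dates, a missing amount, a duplicate, an outlier.
raw = [
    {"date": "2024-01-05", "amount": 100.0},
    {"date": "2024-01-06", "amount": None},
    {"date": "2024-01-06", "amount": None},    # duplicate row
    {"date": "2024-01-07", "amount": 110.0},
    {"date": "2024-01-08", "amount": 90.0},
    {"date": "2024-01-09", "amount": 5000.0},  # suspicious value
]


def clean(rows, threshold=5):
    """Return (kept_rows, outlier_rows) without modifying the input rows."""
    # 1. Remove duplicates, keeping the first occurrence of each row.
    seen, unique = set(), []
    for row in rows:
        key = (row["date"], row["amount"])
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))
    # 2. Impute missing amounts with the median of the known amounts.
    fill = median(r["amount"] for r in unique if r["amount"] is not None)
    for r in unique:
        if r["amount"] is None:
            r["amount"] = fill
    # 3. Transform the string date into a datetime object.
    for r in unique:
        r["date"] = datetime.strptime(r["date"], "%Y-%m-%d")
    # 4. Separate outliers: values far from the median, measured in units of
    #    the median absolute deviation (the threshold is an assumption).
    center = median(r["amount"] for r in unique)
    mad = median(abs(r["amount"] - center) for r in unique)
    kept = [r for r in unique if abs(r["amount"] - center) <= threshold * mad]
    outliers = [r for r in unique if abs(r["amount"] - center) > threshold * mad]
    return kept, outliers


kept, outliers = clean(raw)
print(len(kept), "rows kept,", len(outliers), "outlier(s) set aside")
```

Removing the outlier is only one option; it could instead be transformed (for example, capped at a limit), depending on what the data is used for.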
7. Draw the block diagram to show the data hierarchy in data governance
[Block diagram not recovered. Surviving labels: Data Governance, Data Management]
Module 4
1. What is data ethics in the context of data in an enterprise?
Ans) In an enterprise, data ethics covers the following principles:
1. Responsible Data Use
- Adhering to ethical guidelines and principles governing the responsible collection, processing,
and use of data within the organization.
2. Privacy and Confidentiality
- Ensuring the protection of individuals' privacy rights and sensitive information through proper
data anonymization, encryption, and access controls.
3. Transparency and Accountability
- Promoting transparency by clearly communicating data practices, policies, and purposes to
stakeholders.
- Holding individuals and departments accountable for ethical data handling and
decision-making.
4. Fairness and Impartiality
- Avoiding biases in data collection, analysis, and decision-making to ensure fairness and
impartiality in outcomes.
- Mitigating algorithmic biases to prevent discrimination in automated decision systems.
5. Informed Consent and Control
- Obtaining informed consent from individuals regarding data collection, use, and sharing, and
providing individuals with control over their data.
6. Compliance with Regulations
- Abiding by legal and regulatory frameworks related to data protection, privacy, and security
(e.g., GDPR, HIPAA, CCPA).
7. Ethical AI and Machine Learning
- Integrating ethical considerations into the development and deployment of AI and machine
learning models, addressing issues of bias, fairness, and interpretability.
8. Continuous Education and Improvement
- Providing ongoing training and education to employees regarding ethical data practices and
evolving ethical standards.
- Regularly reviewing and updating policies and procedures to align with emerging ethical
challenges and changes in regulations.
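As one small illustration of the privacy principle above, the sketch below replaces a sensitive field with a keyed hash before the record is shared for analysis. This is pseudonymization, which is weaker than full anonymization because anyone holding the key can re-link the records. The function names, the key, and the sample record are hypothetical.

```python
import hashlib
import hmac


def pseudonymize(value, secret_key):
    """Replace an identifier with a keyed SHA-256 hash so records can still be linked."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def mask_record(record, sensitive_fields, secret_key):
    """Return a copy of the record with the sensitive fields pseudonymized."""
    return {
        field: pseudonymize(str(value), secret_key) if field in sensitive_fields else value
        for field, value in record.items()
    }


# Hypothetical key and record; a real key would be stored securely, not in code.
key = b"example-key-not-for-production"
patient = {"email": "a@example.com", "age": 34}
masked = mask_record(patient, {"email"}, key)
print(masked["age"], len(masked["email"]))
```

The same input always maps to the same hash under a given key, so analysts can still count or join records without seeing the original email address.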