AI Full Notes
Introduction to AI on Azure
Artificial intelligence is a branch of computer science that emphasizes creating intelligent machines that
can mimic human behavior.
Here, intelligent machines can be defined as machines that can behave like a human, think like a
human, and are also capable of decision making.
The term is made up of two words, "Artificial" and "Intelligence," which together mean "man-made thinking
ability."
Azure AI provides a wide range of tools and services that developers and data scientists can use to build
intelligent applications, automate business processes, and gain insights from data.
Azure AI includes various services, each of which is designed to solve specific business problems,
such as:
• Azure Cognitive Services: A collection of pre-built APIs (application programming interfaces) that
developers can use to add intelligent features to their applications, such as natural language
processing, computer vision, and speech recognition.
• Azure Bot Services: A service for building, deploying, and managing intelligent bots that can
communicate with users through a variety of channels, including web, mobile, and messaging apps.
• Azure Machine Learning: It provides a complete set of tools and services for building, training, and
deploying machine learning models, for example through Azure Machine Learning studio.
• Azure Databricks: This is a fast, easy, and collaborative Apache Spark-based analytics platform that
provides tools for data engineering, data science, and machine learning. It supports a range of
languages, including Python, R, and SQL, and offers built-in integrations with Azure services.
• Azure Speech Services: This is a suite of cloud-based speech-to-text, text-to-speech, and speech
translation APIs that enable developers to add intelligent voice capabilities to their applications.
Why Azure For AI
• Integrated Services: Azure offers a comprehensive suite of services for AI development, including pre-
built AI models, tools for building and training custom models, and services for deploying and scaling
AI solutions.
• Scale and Flexibility: Azure’s cloud-based infrastructure provides the ability to scale AI solutions as
needed, from small-scale experiments to large-scale production deployments. It also offers a flexible
range of compute options, allowing developers to choose the best infrastructure for their needs.
• Integration with other Microsoft tools: Azure integrates seamlessly with other Microsoft tools such as
Visual Studio, Power BI, and Office 365, making it easy to incorporate AI into existing workflows.
• Security and Compliance: Azure has industry-leading security and compliance features, including
robust data protection, encryption, and compliance certifications such as HIPAA and GDPR.
Azure Cognitive Services APIs
• Language APIs: These APIs allow developers to perform natural language processing (NLP) tasks,
such as text analysis, sentiment analysis, and language detection.
• Vision APIs: These APIs provide computer vision capabilities, including image analysis, face
detection and recognition, and object detection.
• Speech APIs: These APIs enable speech recognition and speech synthesis (text-to-speech), as well as
language translation for both text and speech.
• Decision APIs: These APIs enable developers to build recommendation systems and personalized
experiences using machine learning.
• Search APIs: These APIs provide advanced search capabilities, including natural language search and
semantic search (e.g., asking "What is the largest mammal?" and then following up with "How
big is it?", where "it" refers to the largest mammal).
• Ease of Use: Azure Cognitive Services APIs are easy to use, and developers can get started quickly
with minimal AI expertise.
• Customization: Developers can customize the pre-built models provided by Azure Cognitive Services,
allowing them to fine-tune the models for specific use cases.
• Scalability: Azure Cognitive Services can scale to meet the needs of any application, from small-scale
prototypes to large-scale production deployments.
• Integration: Azure Cognitive Services integrates easily with other Azure services, as well as with third-
party tools and platforms.
Machine Learning
Machine learning is a branch of artificial intelligence that involves teaching computers to learn
from data and improve their performance on a specific task over time, without being explicitly programmed. It
is based on the idea that machines can learn patterns in data, and use those patterns to make predictions or
decisions.
Types of machine learning:
• Supervised Learning: This is a type of machine learning where the model is trained on labeled data,
meaning that the input data has already been categorized or labeled with the correct output. The goal
of supervised learning is to predict the correct output for new, unseen input data.
• Unsupervised Learning: This is a type of machine learning where the model is trained on unlabeled
data, meaning that the input data has not been categorized or labeled with the correct output. The goal of
unsupervised learning is to discover patterns in the data and group similar data points together.
• Reinforcement Learning: This is a type of machine learning where the model learns through trial and
error. It is based on the idea of an agent taking actions in an environment to maximize a reward, and
adjusting its behavior based on feedback from the environment.
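To make the supervised case concrete, here is a minimal sketch of supervised learning: a one-nearest-neighbour classifier "trained" on labelled examples, then used to predict labels for unseen inputs. The exam-score data and labels are invented purely for illustration.

```python
# A minimal supervised-learning sketch: 1-nearest-neighbour classification.
# The labelled training data below is made up for illustration.

def predict_1nn(training_data, x):
    """Return the label of the training point closest to x."""
    # training_data: list of (feature, label) pairs with numeric features
    nearest = min(training_data, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

# Labelled examples: exam score -> outcome
train = [(35, "fail"), (42, "fail"), (65, "pass"), (80, "pass")]

print(predict_1nn(train, 38))  # closest to the "fail" examples
print(predict_1nn(train, 72))  # closest to the "pass" examples
```

The "training" here is simply memorizing the labelled points; real models generalize by fitting parameters, but the labelled-input/predicted-output structure is the same.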
Azure also offers a range of machine learning capabilities and services:
• Automated Machine Learning: Azure provides automated machine learning capabilities that enable
developers to quickly and easily build high-quality machine learning models with minimal manual
effort.
• Pre-built Models: Azure Cognitive Services provides a suite of pre-built machine learning models for
common scenarios such as image recognition, natural language processing, and sentiment analysis.
• Wide range of tools and frameworks: Azure supports a wide range of open-source tools and
frameworks for machine learning, including Python, R, and TensorFlow.
• Seamless integration: Azure integrates seamlessly with other Microsoft products, including Visual
Studio and Power BI, as well as popular third-party tools and frameworks.
• Scalability: Azure provides the ability to scale machine learning workloads up or down as needed,
allowing developers to handle large volumes of data and build models that can handle high traffic
applications.
• Model management and deployment: Azure provides tools for managing and deploying machine
learning models, including model versioning, model sharing, and automated model deployment.
• Monitoring and logging: Azure provides built-in tools for monitoring and logging machine learning
workloads, allowing developers to detect and diagnose issues quickly.
• Security and compliance: Azure provides robust security and compliance capabilities to protect data
and meet regulatory requirements.
• Azure Machine Learning Studio: This is a visual drag-and-drop tool that allows you to create, train,
and deploy machine learning models without writing any code. It also provides a range of pre-built
templates and algorithms for common use cases.
• Azure Machine Learning SDK for Python: This is a set of Python packages that allow you to create,
train, and deploy machine learning models using your own Python code. It also provides integration
with popular Python tools and frameworks like TensorFlow, PyTorch, and scikit-learn.
• Azure Stream Analytics: This is a real-time data streaming and analytics service that enables
developers to build and deploy event-driven applications that incorporate machine learning
capabilities.
• Azure Synapse Analytics: This is an analytics service that provides a unified experience for big data
processing and analytics. It includes built-in machine learning capabilities for data preparation, model
training, and model deployment.
Computer Vision
Computer vision applications use input from sensing devices, artificial intelligence, machine
learning, and deep learning to replicate the way the human vision system works.
Computer vision applications run on algorithms that are trained on massive amounts of visual
data or images in the cloud.
• Facial Recognition: Ride-sharing companies can use the Face service to enhance the security of riders
with selfies. They can periodically prompt drivers to take a selfie before they accept ride requests. This
helps ensure in real-time that the person behind the wheel matches the same person the company
screened and approved in their profile account.
• Recommendation: E-commerce or entertainment industries can improve user engagement by using
the Personalizer service to understand their customers' preferences and make better recommendations
to them.
For gaming apps, the possible user options might be: "play a game," "watch a movie," or "join
a clan." Based on that user's history and other contextual information (such as user location, time of
day, device type, or day of the week), the Personalizer service helps businesses suggest the best option to
promote or recommend to the user.
• Conversational language: Finally, businesses can use the Language Understanding service to create
conversational bots or digital agents that allow users to interact with the bot applications using natural
language understanding. This custom natural language understanding service helps streamline work
processes and integrate with existing data to provide better customer service at scale to build brand
loyalty and a competitive advantage for businesses.
Natural Language Processing
NLP is an important research area because it enables computers to understand and process human
language, which is essential for many applications in business, healthcare, education, and more.
Advances in NLP have led to significant improvements in areas such as speech recognition, machine
translation, and automated customer service.
• Sentiment Analysis: Analyzing the emotional tone of written text or spoken language to understand
whether it is positive, negative, or neutral.
• Text Summarization: Creating a condensed version of a longer text, while preserving its key
Information.
• Named Entity Recognition: Identifying and categorizing entities in a text, such as people,
organizations, and locations.
• Chatbots and Conversational Interfaces: Using natural language to interact with a computer program
or device, such as a virtual assistant or messaging app.
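As a toy illustration of the sentiment analysis task above, the sketch below scores text by counting words from small hand-made positive and negative lexicons. Real services such as Azure Text Analytics use trained models; the word lists here are invented for the sketch.

```python
# Naive lexicon-based sentiment analysis. The word lists are invented
# for illustration; production services use trained language models.

POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "sad"}

def sentiment(text):
    words = text.lower().split()
    # Positive words add to the score, negative words subtract
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("I love this great product"))       # positive
print(sentiment("terrible support poor quality"))   # negative
```

This only captures word presence; real sentiment models also handle negation, context, and tone.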
• Azure Cognitive Services Language APIs: This is a suite of pre-built NLP APIs that enable developers
to easily add intelligent features to their applications. These APIs include text analysis, sentiment
analysis, language detection, and entity recognition.
• Azure Text Analytics: This is a service that provides text analysis capabilities, including sentiment
analysis, key phrase extraction, and language detection. It can be used for a variety of applications,
such as social media monitoring, customer feedback analysis, and market research.
• Azure Translator: This is a service that provides real-time translation capabilities for text and speech in
over 60 languages. It can be used for applications such as multilingual customer support, global
collaboration, and international e-commerce.
• Azure Speech Services: This is a service that provides speech recognition and text-to-speech
capabilities. It can be used for applications such as voice assistants, interactive voice response systems,
and audio transcription.
Knowledge Mining
Knowledge mining, also known as knowledge discovery or data mining, is the process of
extracting useful information and insights from large datasets. It involves using a combination of
machine learning algorithms, statistical models, and data analysis techniques to identify patterns,
trends, and relationships within the data.
In the context of Microsoft Azure, knowledge mining refers to the process of extracting
insights from unstructured data, such as text, images, and videos. Azure provides a range of services
and tools that enable organizations to perform knowledge mining at scale, including:
• Azure Cognitive Search: This is a cloud-based search service that enables organizations to index and
search unstructured data, such as documents, images, and videos. It can be used for applications such
as enterprise search, e-commerce, and content management.
• Azure Cognitive Services: This is a suite of pre-built AI services that enable organizations to easily add
intelligent features to their applications, including speech recognition, image analysis, and natural
language processing.
• Text analytics: Azure provides text analytics capabilities, including sentiment analysis, entity
recognition, and key phrase extraction. These services can be used to analyze large volumes of
unstructured text data, such as customer feedback, social media posts, and product reviews.
• Image and Video Analysis: Azure provides computer vision capabilities for analyzing images and
videos, including object recognition, facial recognition, and optical character recognition (OCR).
These services can be used for applications such as content moderation, surveillance, and visual
search.
• Azure Cognitive Search: This is a cloud-based search service that enables organizations to index and
search unstructured data, such as documents, images, and videos. It includes features such as faceted
navigation, synonym mapping, and natural language query processing.
• Azure Machine Learning: This is a cloud-based service that provides a range of tools and frameworks
for building, training, and deploying machine learning models. It includes features such as automated
machine learning, model interpretability, and model management.
• Integration with other Azure Services: Azure provides integration with other Azure services, such as
Azure Synapse Analytics, Azure Databricks, and Azure Stream Analytics. This enables organizations
to build end-to-end data pipelines for knowledge mining and analysis.
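One knowledge-mining task mentioned above, key phrase extraction, can be sketched very simply: surface the most frequent non-trivial words in unstructured text. The stop-word list and sample text are invented for illustration; services such as Azure Text Analytics use far richer linguistic models.

```python
# A minimal key-phrase extraction sketch: rank words by frequency,
# ignoring a small invented stop-word list.

from collections import Counter
import re

STOP_WORDS = {"the", "a", "an", "and", "of", "to", "is", "in", "for", "on"}

def key_phrases(text, top_n=3):
    # Lowercase and keep alphabetic tokens only
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_n)]

doc = ("The battery life of the phone is excellent. "
       "Battery charging is fast and the battery lasts all day.")
print(key_phrases(doc))  # "battery" ranks first
```

Frequency alone is crude, but it shows the core idea: turning unstructured text into a small set of structured, searchable signals.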
Tune Model Hyperparameters
In machine learning, model hyperparameters are values that are set before training the model,
and they determine the behavior of the training algorithm. Tuning the hyperparameters involves
finding the optimal values for these parameters that result in the best performance of the model on the
validation data.
Microsoft Azure provides a range of tools and services for tuning model hyperparameters,
from automated machine learning to manual hyperparameter tuning using techniques such as grid
search and random search. With these tools, users can optimize the performance of their machine
learning models and achieve better results on their validation data.
• Azure Machine Learning: This is a cloud-based service that provides a range of tools and frameworks
for building, training, and deploying machine learning models. It includes automated machine
learning, which is a feature that enables users to automatically tune hyperparameters for their models
using a range of algorithms.
• Hyperparameter tuning in Azure Machine Learning: Azure Machine Learning also provides a range of
tools and techniques for manual hyperparameter tuning. Users can create experiments that test
different combinations of hyperparameters, and track the performance of the model on the validation
data.
• Azure Databricks: This is a cloud-based platform for data engineering, machine learning, and
analytics. It includes hyperparameter tuning capabilities that allow users to tune model
hyperparameters using techniques such as grid search, random search, and Bayesian optimization.
• Azure Batch AI: This is a service that enables users to train machine learning models at scale using a
range of compute resources. It includes hyperparameter tuning capabilities that enable users to
automatically tune hyperparameters using techniques such as grid search and random search.
Configuring hyperparameter tuning in Microsoft Azure depends on the specific service or tool
that you are using. In general, it involves the following steps:
• Defining the hyperparameters to be tuned: Start by defining the hyperparameters that you want to
tune for your machine learning model. This will depend on the specific model that you are building
and the algorithm that you are using.
• Choosing the hyperparameter tuning method: Depending on the Azure service or tool that you are
using, there may be different methods for hyperparameter tuning available. Some of the common
methods include grid search, random search, and Bayesian optimization. Choose the method that best
fits your requirements.
• Setting the search space: The search space defines the range of values that the hyperparameters can
take. This can be defined manually or automatically, depending on the hyperparameter tuning method
that you have chosen.
• Defining the performance metric: The performance metric is the measure that you will use to evaluate
the performance of the model for a given set of hyperparameters. This can be defined based on the
specific problem that you are trying to solve.
• Configuring the hyperparameter tuning experiment: Depending on the Azure service or tool that you
are using, you may need to configure an experiment that defines the hyperparameters to be tuned, the
search space, the performance metric, and other parameters.
• Starting the hyperparameter tuning experiment: Once you have configured the experiment, you can
start it and monitor the progress of the tuning process. The experiment will test different combinations
of hyperparameters and evaluate their performance based on the performance metric that you have
defined.
• Evaluating the results and choosing the best hyperparameters: Once the hyperparameter tuning
experiment is complete, you can evaluate the results and choose the set of hyperparameters that result
in the best performance on the validation data.
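The steps above can be sketched as a tiny grid search: define the search space, evaluate a performance metric for every combination, and keep the best one. The "model" here is a toy scoring function invented for illustration; in Azure Machine Learning the metric would come from evaluating a real model on validation data.

```python
# A minimal grid-search sketch over an invented search space.
from itertools import product

# Hyperparameters and their search space (steps 1-3)
search_space = {
    "learning_rate": [0.01, 0.1, 1.0],
    "batch_size": [16, 32],
}

# Performance metric, higher is better (step 4). This toy function
# stands in for validation accuracy and peaks at lr=0.1, batch_size=32.
def evaluate(learning_rate, batch_size):
    return 1.0 - abs(learning_rate - 0.1) - abs(batch_size - 32) / 100

# Run the "experiment" and pick the best combination (steps 5-7)
best_params, best_score = None, float("-inf")
for lr, bs in product(search_space["learning_rate"], search_space["batch_size"]):
    score = evaluate(lr, bs)
    if score > best_score:
        best_params, best_score = {"learning_rate": lr, "batch_size": bs}, score

print(best_params)  # {'learning_rate': 0.1, 'batch_size': 32}
```

Grid search evaluates every combination exhaustively; random search and Bayesian optimization trade that completeness for fewer, smarter evaluations.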
By following these steps, you can optimize the performance of your machine learning models and
achieve better results on your validation data.
Introduction to Machine Learning and Deep Learning
Machine learning is a discipline of computer science that uses computer algorithms and
analytics to build predictive models that can solve business problems.
Machine learning accesses vast amounts of data (both structured and unstructured) and learns
from it to predict the future. It learns from the data by using multiple algorithms and techniques.
Deep Learning is a subset of machine learning that deals with algorithms inspired by the
structure and function of the human brain. Deep learning algorithms can work with an enormous
amount of both structured and unstructured data. Deep learning’s core concept lies in artificial neural
networks, which enable machines to make decisions.
The major difference between deep learning vs. machine learning is the way data is presented
to the machine. Machine learning algorithms usually require structured data, whereas deep learning
networks work on multiple layers of artificial neural networks.
An example is a neural network trained on large sets of unlabeled images of eye retinas.
• Artificial Intelligence in Data Analysis: The scope of AI in data analytics is rising rapidly. One of the
ways Artificial Intelligence improves business is in the field of data analysis. AI can perceive
patterns in data that humans cannot easily detect, and it can help data analysts with
handling and processing large data sets.
• Artificial Intelligence in Cyber Security: Cyber security is another field that's benefitting from AI. As
organizations transfer their data to IT networks and the cloud, the threat of hackers is becoming
more significant. Another field is fraud detection: AI can help detect fraud and help
organizations and people avoid scams. Credit card fraud is one of the most common cybercrimes.
• Artificial Intelligence in Science and Research: AI is making lots of progress in the scientific sector.
Artificial Intelligence can handle large quantities of data and process it quicker than human minds.
This makes it perfect for research where the sources contain high data volumes.
• Artificial Intelligence in the Home: AI has found a special place in people's homes in the form of smart
home assistants. Amazon Echo and Google Home are popular smart home devices that let you
perform various tasks with just voice commands. One can use these smart assistants for various tasks
such as playing a song, asking a question, buying something online, opening an app, etc. Technology
has also advanced in terms of emotional quotient: virtual assistants like Siri, Cortana, and Alexa show
the extent to which Artificial Intelligence comprehends human language. They are capable of
understanding meaning from context and making smart judgments.
• Artificial Intelligence in Advertising: With the help of Artificial Intelligence, sales and marketing
organizations can increase their efficiency. The main focus is on improving conversion rates and sales:
personalized advertising, built on knowledge of customers and their behavior (for example through
facial recognition), can generate more revenue.
• Artificial Intelligence in Marketing: AI-enabled platforms can easily help in managing marketing
operations across various channels like Google AdWords, Facebook, and Bing.
• Artificial Intelligence in Agriculture: Farmers can use Artificial Intelligence to determine the optimal
date to sow crops and precisely allocate resources such as water and fertilizer. With the help of Artificial
Intelligence, farmers can identify crop diseases for swifter treatment and detect and destroy weeds. It
can also help farmers forecast the year ahead by using historical production data, long-term weather
forecasts, genetically modified seed information, and commodity pricing predictions, among other
inputs, to recommend how much seed to sow.
• Artificial Intelligence in Healthcare: The medical sector is also using this technology for its
advantages. AI is helping medical researchers and professionals in numerous ways. For example, the
Knight Cancer Institute and Intel have built a collaborative cancer cloud. This cloud takes data from
the medical history of cancer (and similar) patients to help doctors make better diagnoses.
Preventing cancer from moving to higher stages is its most effective treatment at this time.
• Artificial Intelligence in CRMs: Customer relationship management (CRM) platforms that are
embedded with AI functionality can do real-time data analysis in order to provide predictions as well
as recommendations based on the company's unique business processes and customer data.
Part – II Data Analyst Associate
Businesses need data analysis more than ever. In this learning path, you will learn about the life and
journey of a data analyst, the skills, tasks, and processes they go through in order to tell a story with data so
trusted business decisions can be made. You will learn how the suite of Power BI tools and services are used
by a data analyst to tell a compelling story through reports and dashboards, and the need for true BI in the
enterprise.
As a data analyst, you are on a journey. Think about all the data that is being generated each day and
that is available in an organization, from transactional data in a traditional database, telemetry data from
services that you use, to signals that you get from different areas like social media.
Before data can be used to tell a story, it must be run through a process that makes it usable in the
story. Data analysis is the process of identifying, cleaning, transforming, and modeling data to discover
meaningful and useful information. The data is then crafted into a story through reports for analysis to support
the critical decision-making process.
While the process of data analysis focuses on the tasks of cleaning, modeling, and visualizing data, the
concept of data analysis and its importance to business should not be understated. To analyze data, core
components of analytics are divided into the following categories:
Descriptive: Descriptive analytics help answer questions about what has happened based on historical
data. Descriptive analytics techniques summarize large datasets to describe outcomes to stakeholders.
By developing key performance indicators (KPIs), these strategies can help track the success or
failure of key objectives. Metrics such as return on investment (ROI) are used in many industries, and
specialized metrics are developed to track performance in specific industries.
An example of descriptive analytics is generating reports to provide a view of an organization's
sales and financial data.
Diagnostic: Diagnostic analytics help answer questions about why events happened. Diagnostic
analytics techniques supplement basic descriptive analytics, and they use the findings from descriptive
analytics to discover the cause of these events. Then, performance indicators are further investigated to
discover why these events improved or became worse. Generally, this process occurs in three steps:
Identify anomalies in the data. These anomalies might be unexpected changes in a metric or a
particular market.
Collect data that's related to these anomalies.
Use statistical techniques to discover relationships and trends that explain these anomalies.
Predictive: Predictive analytics help answer questions about what will happen in the future. Predictive
analytics techniques use historical data to identify trends and determine if they're likely to recur.
Predictive analytical tools provide valuable insight into what might happen in the future. Techniques
include a variety of statistical and machine learning techniques such as neural networks, decision trees,
and regression.
Prescriptive: Prescriptive analytics help answer questions about which actions should be taken to
achieve a goal or target. By using insights from prescriptive analytics, organizations can make data-
driven decisions. This technique allows businesses to make informed decisions in the face of
uncertainty. Prescriptive analytics techniques rely on machine learning as one of the strategies to find
patterns in large datasets. By analyzing past decisions and events, organizations can estimate the
likelihood of different outcomes.
Cognitive: Cognitive analytics attempt to draw inferences from existing data and patterns, derive
conclusions based on existing knowledge bases, and then add these findings back into the knowledge
base for future inferences, a self-learning feedback loop. Cognitive analytics help you learn what
might happen if circumstances change and determine how you might handle these situations.
Inferences aren't structured queries based on a rules database; rather, they're unstructured
hypotheses that are gathered from several sources and expressed with varying degrees of confidence.
Effective cognitive analytics depend on machine learning algorithms, and will use several natural
language processing concepts to make sense of previously untapped data sources, such as call center
conversation logs and product reviews.
Roles in Data
• Analyzing and evaluating the current business processes a company has and identifying areas of
improvement
• Researching and reviewing up-to-date business processes and new IT advancements to make systems
more modern
• Presenting ideas and findings in meetings
• Training and coaching staff members
• Creating initiatives depending on the business’s requirements and needs
• Developing projects and monitoring project performance
• Collaborating with users and stakeholders
• Working closely with senior management, partners, clients and technicians
• Using automated tools to extract data from primary and secondary sources
• Removing corrupted data and fixing coding errors and related problems
• Developing and maintaining databases, and data systems – reorganizing data in a readable format
• Performing analysis to assess the quality and meaning of data
• Filtering data by reviewing reports and performance indicators to identify and correct code problems
• Using statistical tools to identify, analyze, and interpret patterns and trends in complex data sets that
could be helpful for diagnosis and prediction
• Assigning numerical value to essential business functions so that business performance can be
assessed and compared over periods of time.
• Analyzing local, national, and global trends that impact both the organization and the industry
• Preparing reports for the management stating trends, patterns, and predictions using relevant data
• Working with programmers, engineers, and management heads to identify process improvement
opportunities, propose system modifications, and devise data governance strategies.
• Preparing final analysis reports for the stakeholders to understand the data-analysis steps, enabling
them to take important decisions based on various facts and trends
Data Scientist Skills:
• Programming Skills
• Statistics
• Machine Learning
• Strong Math Skills (Multivariable Calculus and Linear Algebra)
• Data Wrangling – proficiency in handling imperfections in data is an important aspect of a data
scientist job description.
• Excellent Communication Skills
• Strong Software Engineering Background
• Hands-on experience with data science tools
• Problem-solving aptitude
• Analytical mind and great business sense
• Degree in Computer Science, Engineering or relevant field is preferred
• Proven Experience as Data Analyst or Data Scientist
Features of Power BI
1. Data sources: Power BI can connect to a wide variety of data sources including Excel spreadsheets,
SQL databases, cloud-based sources like Azure, and many more.
2. Data transformation: Once data is connected, Power BI provides tools to transform and clean the data,
such as removing duplicates, merging tables, or creating calculated columns.
3. Visualizations: Power BI allows users to create a wide variety of visualizations, including charts,
maps, and tables, which can be customized to meet specific needs. These visualizations can then be
combined into dashboards for a complete view of the data.
4. Collaboration: Power BI allows users to collaborate on reports and dashboards with others in real-
time, making it easy to share insights and work together.
5. Mobile Access: Power BI provides mobile apps for iOS, Android, and Windows devices, allowing
users to access their reports and dashboards from anywhere.
Overall, Power BI is a business analytics and data visualization tool developed by Microsoft. It allows
users to connect to various data sources, transform and clean the data, and create interactive
visualizations and reports.
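The data-transformation step above (item 2) can be sketched in plain Python on invented sample records, to show what Power Query transformations such as removing duplicates and creating calculated columns conceptually do:

```python
# A conceptual sketch of two Power Query-style transformations:
# deduplication and a calculated column. Records are invented.

orders = [
    {"order_id": 1, "product": "Widget", "qty": 2, "unit_price": 9.5},
    {"order_id": 1, "product": "Widget", "qty": 2, "unit_price": 9.5},  # duplicate
    {"order_id": 2, "product": "Gadget", "qty": 1, "unit_price": 30.0},
]

# Remove duplicates, keeping the first occurrence of each order_id
seen, deduped = set(), []
for row in orders:
    if row["order_id"] not in seen:
        seen.add(row["order_id"])
        deduped.append(row)

# Add a calculated column: line total = qty * unit_price
for row in deduped:
    row["line_total"] = row["qty"] * row["unit_price"]

print(len(deduped))              # 2 rows remain
print(deduped[0]["line_total"])  # 19.0
```

In Power BI these steps are configured visually, but the underlying operations on rows and columns are the same.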
Prepare Data For Analysis
Viewing data in its raw format rarely provides useful insights into the patterns and trends
within the data. By applying some core analysis techniques to our data, we can transform it from
unreadable columns and rows into visualizations that immediately reveal summaries and insights.
Aggregate data
Aggregate functions are used to return a summary value from your dataset, and are a
fundamental component of data analysis. These functions help us answer specific questions about our
business, such as how many customers visited the Tokyo store last Tuesday, or the average spend of
each online customer in December.
• Sum: This common function summarizes the total values within a field. It's used to return values such
as total sales or revenue.
• Average: The average function calculates the sum of a field, divided by the number of records. For
example, to discover the average customer spend, the average function would run a sum calculation
against the spend of all customers, and then divide it by the number of customers in the table.
• Maximum: This function returns the highest value in a field. Its counterpart, the minimum function,
returns the lowest value in a field.
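As a plain illustration (not Power BI or DAX syntax), the aggregate functions above can be sketched in Python over a hypothetical list of customer spend values:

```python
# Hypothetical spend per customer (illustrative data only).
spend = [120.0, 85.5, 230.0, 45.0, 99.5]

total = sum(spend)                # Sum: total spend across all customers
average = total / len(spend)      # Average: the sum divided by the record count
highest = max(spend)              # Maximum: the largest single spend
lowest = min(spend)               # Minimum: the smallest single spend

print(total, average, highest, lowest)  # 580.0 116.0 230.0 45.0
```

Power BI computes the same summaries with its built-in aggregations, evaluated over whatever filters are active in the report.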
Here are some key concepts related to data modeling in Power BI:
1. Tables: Power BI uses tables to store data. Tables can be imported from external data sources or
created from scratch within Power BI. Each table contains rows of data, with each row representing a
unique record.
2. Relationships: Power BI allows users to create relationships between tables based on shared columns.
Relationships allow users to combine data from multiple tables in a single report or visualization.
3. Calculated columns and measures: These are two types of data calculations that can be performed in
Power BI. Calculated columns derive new columns row by row from existing data, while measures are
aggregations that are evaluated in the context of a report and can draw on data from multiple tables.
4. Hierarchies: Power BI allows users to create hierarchies based on columns in their data. Hierarchies
allow users to drill down into data to explore more detailed information.
5. DAX: Data Analysis Expressions (DAX) is a formula language used in Power BI to create calculated
columns, measures, and other types of data calculations.
• Bar chart: Bar charts, and their vertical counterpart, column charts, compare numeric values across
categories. You can have multiple series or conditions and, like a line chart, the bar chart includes an X
and Y axis. Again, it's good practice to include a title on your chart, and a legend if you have multiple
series.
• Scatter chart: You can use scatter charts when you need to compare two numeric values. Rather than
plotting time along the X axis, you use another numeric field. For example, plotting salary against an
Excel knowledge score.
A data analyst fills one of several critical roles in an organization, helping uncover and make sense of
information to keep the company balanced and operating efficiently. Therefore, it's vital that a data analyst
clearly understands their responsibilities and the tasks that are performed on a near-daily basis. Data analysts
are essential in helping organizations gain valuable insights from the vast amounts of data that they have, and
they work closely with others in the organization to help reveal valuable information.
The following figure shows the five key areas that you'll engage in during the data analysis process.
Prepare:
• As a data analyst, you'll likely divide most of your time between the prepare and model tasks.
Deficient or incorrect data can result in invalid reports, a loss of trust, and poor business decisions,
which can in turn lead to lost revenue and broader negative business impact.
• Before a report can be created, data must be prepared. Data preparation is the process of profiling,
cleaning, and transforming your data to get it ready to model and visualize.
• Data preparation is the process of taking raw data and turning it into information that is trusted and
understandable. It involves, among other things, ensuring the integrity of the data, correcting wrong or
inaccurate data, identifying missing data, converting data from one structure to another or from one
type to another, or even a task as simple as making data more readable.
• Data preparation also involves understanding how you're going to get and connect to the data and the
performance implications of the decisions. When connecting to data, you need to make decisions to
ensure that models and reports meet, and perform to, acknowledged requirements and expectations.
• Privacy and security assurances are also important. These assurances can include anonymizing data to
avoid oversharing or preventing people from seeing personally identifiable information when it isn't
needed. Alternatively, helping to ensure privacy and security can involve removing that data
completely if it doesn't fit in with the story that you're trying to shape.
• Data preparation can often be a lengthy process. Data analysts follow a series of steps and methods to
put data into a proper context and state, eliminating poor data quality and allowing the data to be
turned into valuable insights.
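A minimal Python sketch of a few of the preparation steps described above — removing duplicates, converting types, handling missing values, and anonymizing personally identifiable information. The records are hypothetical, and real preparation in Power BI would typically be done in Power Query:

```python
import hashlib

# Hypothetical raw records with a duplicate row and a missing value.
raw = [
    {"email": "a@example.com", "spend": "120.50"},
    {"email": "a@example.com", "spend": "120.50"},   # duplicate
    {"email": "b@example.com", "spend": None},       # missing spend
]

seen, prepared = set(), []
for row in raw:
    key = (row["email"], row["spend"])
    if key in seen:            # deduplication: skip rows already seen
        continue
    seen.add(key)
    prepared.append({
        # anonymize PII: replace the email with a truncated one-way hash
        "customer": hashlib.sha256(row["email"].encode()).hexdigest()[:8],
        # type conversion: text to float; treat a missing spend as 0.0
        "spend": float(row["spend"]) if row["spend"] is not None else 0.0,
    })

print(len(prepared))  # 2 rows remain after deduplication
```

How missing values are treated (zero, average, or dropped) is itself a preparation decision that should match the business question being asked.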
Model:
• When the data is in a proper state, it's ready to be modeled. Data modeling is the process of
determining how your tables are related to each other. This process is done by defining and creating
relationships between the tables. From that point, you can enhance the model by defining metrics and
adding custom calculations to enrich your data.
• From a Power BI perspective, if your report is performing slowly, or your refreshes are taking a long
time, you will likely need to revisit the data preparation and modeling tasks to optimize your report.
• The process of preparing data and modeling data is an iterative process. Data preparation is the first
task in data analysis. Understanding and preparing your data before you model it will make the
modeling step much easier.
• Creating an effective and proper data model is a critical step in helping organizations understand and
gain valuable insights into the data. An effective data model makes reports more accurate, allows the
data to be explored faster and more efficiently, decreases time for the report-writing process, and
simplifies future report maintenance.
• The model is another critical component that has a direct effect on the performance of your report and
overall data analysis. A poorly designed model can have a drastically negative impact on the general
accuracy and performance of your report. Conversely, a well-designed model with well-prepared data
will ensure a properly efficient and trusted report. This notion is more prevalent when you are working
with data at scale.
Visualize:
• The visualization task is where you get to bring your data to life. The ultimate goal of the visualize
task is to solve business problems. A well-designed report should tell a compelling story about that
data, which will enable business decision makers to quickly gain needed insights. By using appropriate
visualizations and interactions, you can provide an effective report that guides the reader through the
content quickly and efficiently, therefore allowing the reader to follow a narrative into the data.
• The reports that are created during the visualization task help businesses and decision makers
understand what that data means so that accurate and vital decisions can be made. Reports drive the
overall actions, decisions, and behaviors of an organization that is trusting and relying on the
information that is discovered in the data.
• The business might communicate that they need all data points on a given report to help them make
decisions. As a data analyst, you should take the time to fully understand the problem that the business
is trying to solve. Determine whether all their data points are necessary because too much data can
make detecting key points difficult. Having a small and concise data story can help find insights
quickly.
• With the built-in AI capabilities in Power BI, data analysts can build powerful reports without writing
any code, enabling users to get insights and answers and find actionable objectives. The AI
capabilities in Power BI, such as the built-in AI visuals, enable the discovery of insights by asking
questions of the data, using the Quick Insights feature, or creating machine learning models directly
within Power BI.
• An important aspect of visualizing data is designing and creating reports for accessibility. As you build
reports, it is important to think about people who will be accessing and reading the reports. Reports
should be designed with accessibility in mind from the outset so that no special modifications are
needed in the future.
• Many components of your report will help with storytelling. From a color scheme that is
complementary and accessible, to fonts and sizing, to picking the right visuals for what is being
displayed, they all come together to tell that story.
Analyze:
• The analyze task is the important step of understanding and interpreting the information that is
displayed on the report. In your role as a data analyst, you should understand the analytical capabilities
of Power BI and use those capabilities to find insights, identify patterns and trends, predict outcomes,
and then communicate those insights in a way that everyone can understand.
• Advanced analytics enables businesses and organizations to ultimately drive better decisions
throughout the business and create actionable insights and meaningful results. With advanced
analytics, organizations can drill into the data to predict future patterns and trends, identify activities
and behaviors, and enable businesses to ask the appropriate questions about their data.
• Previously, analyzing data was a difficult and intricate process that was typically performed by data
engineers or data scientists. Today, Power BI makes data analysis accessible, which simplifies the data
analysis process. Users can quickly gain insights into their data by using visuals and metrics directly
from their desktop and then publish those insights to dashboards so that others can find needed
information.
• This feature is another area where AI integrations within Power BI can take your analysis to the next
level. Integrations with Azure Machine Learning, cognitive services, and built-in AI visuals will help
to enrich your data and analysis.
Manage:
• Power BI consists of many components, including reports, dashboards, workspaces, datasets, and
more. As a data analyst, you are responsible for the management of these Power BI assets, overseeing
the sharing and distribution of items, such as reports and dashboards, and ensuring the security of
Power BI assets.
• Apps can be a valuable distribution method for your content and allow easier management for large
audiences. This feature also allows you to have custom navigation experiences and link to other assets
within your organization to complement your reports.
• The management of your content helps to foster collaboration between teams and individuals. Sharing
and discovery of your content is important for the right people to get the answers that they need. It is
also important to help ensure that items are secure. You want to make sure that the right people have
access and that you are not leaking data beyond the intended stakeholders.
• Proper management can also help reduce data silos within your organization. Data duplication can
make data difficult to manage and can introduce data latency when resources are overused. Power BI helps
reduce data silos with the use of shared datasets, and it allows you to reuse data that you have prepared
and modeled. For key business data, endorsing a dataset as certified can help to ensure trust in that
data.
• The management of Power BI assets helps reduce the duplication of efforts and helps ensure security
of the data.
A workspace is a centralized repository in which you can collaborate with colleagues and teams to
create collections of reports and dashboards.
a) Focused collaboration efforts. You can use workspaces to house reports and dashboards for use by
multiple teams.
b) Assurance that the highest level of security is maintained by controlling who can access datasets,
reports, and dashboards.
This module will discuss several tasks that are focused on helping you to create and manage a
workspace in Power BI. Additionally, you will learn about importing and updating assets in a
workspace, configuring data protection, troubleshooting data, and much more.
Consider a scenario where you have created a few reports for the Sales team at Tailwind Traders.
The issue that you have encountered is determining how to make these reports viewable and shareable.
By creating a workspace in Power BI, you can house your reports in one location, make them
shareable, collaborate with other teams, and update reports.
Create a workspace
1. Go to the Power BI service.
2. Select Workspaces in the navigation pane.
3. Select the Create a workspace button at the bottom of the resulting panel.
4. In the Create a workspace window, enter information in the Workspace name and Description fields
and then upload a Workspace image.
5. In the Advanced drop-down menu, you can create a Contact list of users who will receive notifications
if issues with the workspace occur.
By default, these users are the workspace admins, but you can also add specific users. You can
also add this workspace to a specific OneDrive and then choose whether this workspace will be part
of a dedicated capacity. Dedicated capacities are a Power BI Premium feature that ensures that
your workspace has its own computational resources rather than sharing resources with other
users.
6. After you have filled out pertinent fields on the Create a workspace window, select Save.
Now that you've successfully created a workspace, the Sales team wants to collaborate
with other teams to build additional dashboards and reports. As the workspace owner, you want to ensure that
appropriate access is given to members of the Products team because their team includes stakeholders and
developers. Workspace roles allow you to designate who can do what within a workspace.
• Admin
• Contributor
Can create, update, and publish content and reports within a workspace.
Cannot publish, update, or edit an app in a workspace unless given this ability by admins/members.
• Viewer
If the workspace is backed by a Premium capacity, a non-Pro user can view content
within the workspace under the Viewer role.
To assign these roles to users, go to the workspace that you've created and, in the upper-left corner of the
ribbon, select Access.
In the resulting Access window, you can add email addresses of individual users, mail-
enabled security groups, distribution lists, Microsoft 365 groups, and regular security groups, and then assign
them to their specific roles. You can also change a user's assigned role at the bottom of the page or delete
the user from the workspace by selecting the ellipsis (...) next to their name.