AI PROJECT CYCLE
Unit 2 : INTRODUCTION TO AI PROJECT CYCLE
Example: If you want to make an AI system that can predict the salary of an employee
based on their previous salaries, you would feed the data of those previous salaries into the
machine. This is the data with which the machine is trained. Once it is trained, it can
predict the next salary. The previous salary data here is known as Training Data,
while the data used to check the next-salary prediction is known as Testing Data.
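The salary example can be sketched in a few lines of Python. This is only an illustration, assuming a straight-line (least-squares) trend; the salary figures, the held-out testing value and the helper name `fit_line` are all made up for the sketch and are not part of the unit.

```python
# Hypothetical yearly salaries (illustrative values, not real data).
training_salaries = [30000, 32000, 34000, 36000]  # Training Data: years 1 to 4
testing_salary = 38500                            # Testing Data: actual year-5 salary


def fit_line(ys):
    """Least-squares fit of y = slope * x + intercept for x = 1, 2, ..., n."""
    n = len(ys)
    xs = range(1, n + 1)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    slope = numerator / denominator
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Training": learn the trend from the previous salaries.
slope, intercept = fit_line(training_salaries)

# "Testing": predict the next year and compare with the held-out value.
next_year = len(training_salaries) + 1
predicted = slope * next_year + intercept
error = abs(predicted - testing_salary)

print(f"predicted: {predicted:.0f}, actual: {testing_salary}, error: {error:.0f}")
```

The testing value is never shown to `fit_line`; it is used only afterwards, to see how far the prediction is from the real figure.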
Data sources
• Points to keep in mind while collecting data:
• Find a reliable source of data from which authentic information can be taken.
• Use data that is open-sourced and not someone's property. Extracting private
data can be an offence.
• Among the most reliable and authentic sources of information are the
open-sourced websites hosted by the government.
• Various ways to collect data are:
1. Web Scraping: Collecting data from the web using automated tools. It is used
for monitoring prices, news, etc.
2. Sensors: Sensors are a part of the IoT (Internet of Things). They collect
physical data and detect changes in it.
3. Camera: A camera captures visual information, and the resulting image is used
as a source of data. Cameras are used to capture raw visual data.
4. Observations: When we observe something carefully, we get some information.
For example, scientists observe creatures to study them. Observation is a
time-consuming data source.
5. API: An Application Programming Interface (API) is a messenger that takes a
request, passes it to the system, and returns the response.
Examples: Twitter API, Google Search API
6. Surveys: A survey is a method of gathering specific information from a sample
of people. Example: a census survey for analysing the population.
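As a rough illustration of web scraping for price monitoring, the sketch below pulls prices out of a small HTML page using Python's standard `html.parser` module. The page content, the `price` class name and the `PriceScraper` class are assumptions invented for this example; a real scraper would first download the page and must respect the ownership rules noted above.

```python
from html.parser import HTMLParser

# A made-up page standing in for a real website (assumption for this sketch).
SAMPLE_PAGE = """
<html><body>
  <span class="price">120</span>
  <span class="name">Notebook</span>
  <span class="price">45</span>
</body></html>
"""


class PriceScraper(HTMLParser):
    """Collects the numbers found inside <span class="price"> tags."""

    def __init__(self):
        super().__init__()
        self.in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "span" and ("class", "price") in attrs:
            self.in_price = True

    def handle_endtag(self, tag):
        if tag == "span":
            self.in_price = False

    def handle_data(self, data):
        if self.in_price and data.strip():
            self.prices.append(int(data.strip()))


scraper = PriceScraper()
scraper.feed(SAMPLE_PAGE)
prices = scraper.prices
print(prices)
```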
TYPES OF DATA
Based on its structure, data is classified into 3 categories:
1. Structured Data : It follows a specific pattern or set of rules. Such data has a
simple structure and is stored in specific forms such as tabular form.
Example: a cricket scoreboard, your school timetable, an exam datasheet, etc.
2. Unstructured Data : Data that doesn't have any specific pattern or constraints
and can be stored in any form is known as unstructured data.
Most of the data that exists in the world is unstructured. Example: YouTube
videos, Facebook photos, dashboard data of any reporting tool, etc.
3. Semi-Structured Data : It is a combination of structured and unstructured
data. Some data can have a structure like a database, whereas some data can
have markers and tags to identify the structure of the data.
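The three categories can be seen side by side in a short Python sketch. The sample values (names, runs, the sentence) are invented for illustration; CSV stands in for tabular data and JSON for data whose structure is marked by tags.

```python
import csv
import io
import json

# Structured: every row follows the same columns, like a scoreboard table.
structured_text = "name,runs\nAsha,52\nRavi,17\n"
rows = list(csv.DictReader(io.StringIO(structured_text)))

# Semi-structured: keys act as tags, but records need not share the same fields.
semi_structured_text = (
    '{"name": "Asha", "matches": [{"runs": 52}, {"runs": 8, "note": "rain delay"}]}'
)
record = json.loads(semi_structured_text)

# Unstructured: free text with no fixed pattern; only generic operations apply.
unstructured_text = "Asha played well today although rain stopped the match."
word_count = len(unstructured_text.split())

print(rows[0]["name"], rows[0]["runs"])   # columns can be read by name
print(record["matches"][1].get("note"))   # a field only some records have
print(word_count)
```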
III DATA EXPLORATION
• To analyse the data, you need to visualise it in some user-friendly format so
that you can quickly get a sense of the trends, relationships and patterns
contained within the data.
• To visualise data, we can use various types of visual representations.
Data Exploration refers to the techniques and tools used to
visualize data through complex statistical methods.
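A minimal sketch of exploring a small data set follows, assuming a list of made-up exam marks. It computes two summary statistics with Python's `statistics` module and prints a simple text chart of how the marks are distributed; the 20-mark bucket width and the `bucket` helper are arbitrary choices for the illustration.

```python
import statistics
from collections import Counter

marks = [35, 42, 42, 58, 61, 61, 61, 77, 90]  # hypothetical exam marks

mean_mark = statistics.mean(marks)
median_mark = statistics.median(marks)


def bucket(mark):
    """Return the 20-mark range a value falls into, e.g. 42 -> '40-59'."""
    low = (mark // 20) * 20
    return f"{low}-{low + 19}"


counts = Counter(bucket(m) for m in marks)

print(f"mean: {mean_mark:.1f}, median: {median_mark}")
# One row per range, ordered by the lower bound of the range.
for label in sorted(counts, key=lambda s: int(s.split("-")[0])):
    print(f"{label:>6} | {'#' * counts[label]}")
```

Even this rough chart shows at a glance where most of the marks fall, which is harder to see from the raw list.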
Advantages of Data Visualization
❖Gives a better understanding of the data and provides insights into it
❖Allows user interaction
❖Provides real-time analysis
❖Helps to make decisions
❖Reduces the complexity of data
❖Reveals the relationships and patterns contained within data
❖Helps define a strategy for your data model
❖Provides an effective way of communication among users
How to select a proper graph?
1. Comparison of values: show periodical changes, e.g. Bar Chart
2. Comparison of trends: show changes over a period of time, e.g. Line Chart
3. Distribution of data according to categories: show data according to
category, e.g. Histogram
4. Highlight a portion of a whole: highlight data according to value,
e.g. Pie Chart
5. Show the relationship between data: multiple charts can be
IV MODELING
AI Modelling refers to developing algorithms, also called models, which can be
trained to give intelligent outputs; that is, writing code to make a machine
artificially intelligent.
• The graphical representation makes the data understandable for humans, as we
can discover trends and patterns from it.
• But when a machine accesses and analyses data, it needs the data in the most
basic form of numbers (binary 0s and 1s), and when it comes to discovering
patterns and trends in data, the machine relies on mathematical
representations of the same.
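To make the point about numbers concrete, the sketch below shows how a short piece of text looks to a machine: each character becomes a number, and each number becomes a pattern of 0s and 1s. The sample text "AI" is an arbitrary choice for the illustration.

```python
text = "AI"

# Each character is stored as a number (its Unicode code point).
codes = [ord(ch) for ch in text]

# Each number is in turn stored as bits; here shown as 8-digit binary strings.
bits = [format(code, "08b") for code in codes]

for ch, code, pattern in zip(text, codes, bits):
    print(ch, code, pattern)
```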
Rule Based Approach