AI ML June 4 2022

Ai-ml pdf

Uploaded by

Likhitha M
Copyright
© © All Rights Reserved

Data Science
• Data science continues to evolve as one of the most promising and in-demand
career paths for skilled professionals.

• Data scientists are able to identify relevant questions, collect data from a
multitude of different data sources, organize the information, translate results
into solutions, and communicate their findings in a way that positively affects
business decisions.

• Glassdoor has ranked data scientist among the top three jobs in America since 2016.
Why should we learn Data Science?
• The fuel of the 21st century
• Problem of Demand & Supply
• A Lucrative Career
• Data Science can make the World a Better Place
• Data Science is the Career of Tomorrow
Data Science Process
Data Science tools

• R

• Python
R versus Python

• Data collection
• Data exploration
• Data modeling
• Data visualization
Data Science – A formal definition

• Data science combines multiple fields, including statistics, scientific methods, artificial
intelligence (AI), and data analysis, to extract value from data. Those who practice data science
are called data scientists, and they combine a range of skills to analyze data collected from the
web, smart phones, customers, sensors, and other sources to derive actionable insights.

• Companies are sitting on a treasure trove of data. As modern technology has enabled the
creation and storage of increasing amounts of information, data volumes have exploded. It’s
estimated that 90 percent of the data in the world was created in the last two years. For example,
Facebook users upload 10 million photos every hour.
Difference between data science, artificial intelligence and machine learning

• Here’s a simple breakdown:


• AI means getting a computer to mimic human behavior in
some way.
• Data science is a subset of AI, and it refers more to the
overlapping areas of statistics, scientific methods, and
data analysis—all of which are used to extract meaning
and insights from data.
• Machine learning is another subset of AI, and it consists
of the techniques that enable computers to figure things
out from the data and deliver AI applications.
And for good measure, we’ll throw in another definition.
• Deep learning is a subset of machine learning that
enables computers to solve more complex problems.
How data science is transforming business
Organizations are using data science to turn data into a competitive advantage
by refining products and services. Data science and machine learning use
cases include:
• Predict customer churn by analyzing data collected from call centers, so
marketing can take action to retain at-risk customers
• Improve efficiency by analyzing traffic patterns, weather conditions, and other
factors so logistics companies can improve delivery speeds and reduce costs
• Improve patient diagnoses by analyzing medical test data and reported
symptoms so doctors can diagnose diseases earlier and treat them more
effectively
• Optimize the supply chain by predicting when equipment will break down
• Detect fraud in financial services by recognizing suspicious behaviors and
anomalous actions
• Improve sales by creating recommendations for customers based upon
previous purchases
What is Data Analytics?
• Data analytics is the science of analyzing raw data
to draw conclusions about that information.
• The techniques and processes of data analytics
have been automated into algorithms that work
over raw data for human consumption.
• Data analytics helps a business optimize its
performance.
Data Science versus Data Analytics

• Data science is an umbrella term for a group of
fields that are used to mine large datasets. Data
analytics software is a more focused version of
this and can even be considered part of the larger
process.

• Analytics is devoted to realizing actionable
insights that can be applied immediately based on
existing queries.
Applications of Data Science
• Fraud and Risk Detection
• Healthcare
• Drug discovery & development
• Image & Speech recognition
• Airline route planning
• Gaming
• Banking
• E-commerce
• Transport
• Education
Data science trends in 2022
• Small Data and TinyML
• Data-driven Customer Experience
• Deepfakes, generative AI, and synthetic data
• Convergence
• AutoML
Python essentials

• Python is a general-purpose, interpreted,
interactive, object-oriented, and high-level
programming language. It was created by Guido
van Rossum in the late 1980s and first released in 1991.
Lists, tuples, sets and dictionaries
• Lists
• Tuples
• Sets
• Dictionaries
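
The four container types listed above can be compared side by side. The following is a minimal sketch; the variable names and values are made up for illustration and do not come from the original slides.

```python
# List: ordered, mutable, allows duplicates
fruits = ["apple", "banana", "apple"]
fruits.append("cherry")

# Tuple: ordered and immutable, so item assignment raises TypeError
point = (3, 4)
try:
    point[0] = 10
except TypeError:
    tuple_is_immutable = True
else:
    tuple_is_immutable = False

# Set: unordered collection that keeps only unique items
unique_fruits = set(fruits)

# Dictionary: maps keys to values
prices = {"apple": 40, "banana": 10}
prices["cherry"] = 120

print(fruits)
print(point, tuple_is_immutable)
print(sorted(unique_fruits))
print(prices)
```

A rough rule of thumb: use a list when order matters and the contents change, a tuple for fixed records, a set for membership tests and de-duplication, and a dictionary for lookups by key.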
Declarations
Built-in functions
• append()
• pop()
• update()
• extend()
• count()
• reverse()
• min() & max()
• range()
• sqrt()
• gcd()
• factorial()
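
The names in this list are a mix of list methods, a dictionary method, true built-ins, and functions from the standard math module. The sketch below shows one call of each; the sample values are assumptions chosen only for illustration.

```python
import math

numbers = [3, 1, 2]
numbers.append(5)            # add one item at the end
numbers.extend([4, 1])       # add every item from another iterable
last = numbers.pop()         # remove and return the last item
ones = numbers.count(1)      # how many times 1 occurs
numbers.reverse()            # reverse the list in place

smallest = min(numbers)      # built-in functions
largest = max(numbers)
evens = list(range(0, 10, 2))

settings = {"a": 1}
settings.update({"b": 2})    # dictionary method: merge in new key/value pairs

root = math.sqrt(16)         # math module functions
divisor = math.gcd(12, 18)
product = math.factorial(5)

print(numbers, last, ones, smallest, largest)
print(evens, settings)
print(root, divisor, product)
```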
Built-in functions
• upper()
• floor()
• ceil()
• len()
• remove()
• reverse()
• sin()
• cos()
• tan()
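
As with the previous list, these names come from different places: upper() is a string method, remove() and reverse() are list methods, len() is a built-in, and the rest live in the math module. A short sketch with made-up values:

```python
import math

name = "data science"
shout = name.upper()         # string method: returns a new upper-case string
length = len(name)           # built-in: number of characters

low = math.floor(2.7)        # largest integer <= 2.7
high = math.ceil(2.1)        # smallest integer >= 2.1

items = [1, 2, 3, 2]
items.remove(2)              # removes only the first matching item
items.reverse()              # reverses the list in place

# Trigonometric functions take angles in radians
sine = math.sin(math.pi / 2)
cosine = math.cos(0)
tangent = math.tan(math.pi / 4)

print(shout, length, low, high, items)
print(sine, cosine, tangent)
```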
Finding the Logarithm
• The math module's log(a, b) function returns the logarithm of a
to base b. If the base is not given, the result is the
natural logarithm (base e).
• log2(a) computes the base-2 logarithm of a. It is usually
more accurate than log(a, 2).
• log10(a) computes the base-10 logarithm of a. It is usually
more accurate than log(a, 10).
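
The three logarithm functions can be tried as follows. This is a small sketch using Python's standard math module; the input values are arbitrary examples.

```python
import math

natural = math.log(math.e)       # natural logarithm (base e)
general = math.log(8, 2)         # logarithm of 8 to base 2, via the general form
base_two = math.log2(8)          # dedicated base-2 function
base_ten = math.log10(1000)      # dedicated base-10 function

print(natural, general, base_two, base_ten)

# The general form divides two floating-point logarithms, so it can be off by
# a tiny rounding error; compare with a tolerance instead of ==.
print(math.isclose(math.log(1000, 10), base_ten))
```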
Printing calendar in python
The calendar module gives a wide range of methods to play with yearly
and monthly calendars.

Here, we print a calendar for a given month (January 2008).
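
The original slide's code is not preserved, so the following is a minimal sketch of one way to do it with the standard calendar module. The variable name is an assumption.

```python
import calendar

# calendar.month(year, month) returns the month as a multi-line string
january_2008 = calendar.month(2008, 1)
print(january_2008)
```

calendar.prmonth(2008, 1) prints the same text directly instead of returning it.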


range(), sort() & sorted()
• The range() function returns an immutable sequence of integers
from the given start value up to, but not including, the stop value.

• The sorted() function sorts the elements of a given iterable in a
specific order (ascending or descending) and returns them as a new list.

• The list method sort() sorts a list in place and returns None.
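
A short sketch of how range(), sorted() and list.sort() behave. The word list is made up for illustration.

```python
# range(start, stop) includes start but excludes stop
one_to_five = list(range(1, 6))

words = ["pear", "fig", "banana"]

# sorted() returns a new list and leaves the original untouched
alphabetical = sorted(words)
longest_first = sorted(words, key=len, reverse=True)
original_unchanged = words == ["pear", "fig", "banana"]

# list.sort() reorders the list in place and returns None
result_of_sort = words.sort()

print(one_to_five)
print(alphabetical, longest_first, original_unchanged)
print(words, result_of_sort)
```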
Population
• A population includes all the elements of the
data set. Measurable characteristics of the
population, such as the mean and standard deviation,
are called parameters.
• There are different types of population. They
are:
• Finite Population
• Infinite Population
• Existent Population
• Hypothetical Population
Sample

• A sample is the group drawn from
the population, which you will use to represent
the data. Ideally, the sample is an unbiased subset of
the population that best represents the whole
data.
Sampling
• The process of collecting data from a small subsection of
the population and then using it to generalize over the
entire set is called sampling.

• Samples are used when:

• The population is too large to collect data from.
• The data collected is not reliable.
• The population is hypothetical and is unlimited in size.
Take the example of a study that documents the results
of a new medical procedure. It is unknown how the
procedure will affect people across the globe, so a test
group is used to find out how people react to it.
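
The idea of estimating a population value from a sample can be sketched in a few lines. The population here is simulated (10,000 random values) and the sample size of 100 is an arbitrary assumption; neither comes from the original slides.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the run is repeatable

# Simulated population: 10,000 measurements centered near 50
population = [rng.gauss(50, 10) for _ in range(10_000)]

# Simple random sample of 100 elements, drawn without replacement
sample = rng.sample(population, 100)

population_mean = mean(population)   # a population parameter
sample_mean = mean(sample)           # a sample statistic that estimates it

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

The sample mean will usually land close to the population mean, but not exactly on it. That gap is the sampling error, and it tends to shrink as the sample grows.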
Population versus Sample
What is a dataset?

• A data set consists of roughly two components.
The two components are rows and columns.
Additionally, a key feature of a data set is that it is
organized so that each row contains one
observation.
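
To make the row and column structure concrete, here is a tiny made-up data set parsed with the standard csv module. The names and values are illustrative assumptions only.

```python
import csv
import io

# Three observations (rows) described by three variables (columns)
raw_text = """name,age,city
Asha,29,Mysuru
Ravi,34,Bengaluru
Meera,41,Madikeri
"""

rows = list(csv.DictReader(io.StringIO(raw_text)))
columns = list(rows[0].keys())

print(columns)
for row in rows:
    print(row["name"], row["age"], row["city"])
```

Note that the csv module reads every value as a string, so numeric columns such as age need converting before any arithmetic.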
Example of a dataset
Importing files in Python
Pandas package

• Install and load the pandas package

• pandas is a powerful data analysis package. It
makes data exploration and manipulation easy. It
has several functions to read data from various
sources.
• If you are using Anaconda, pandas should
already be installed.
Import CSV files
• Note that a single backslash is unreliable when specifying a Windows file
path, because Python treats the backslash as an escape character in ordinary
strings. Either change it to a forward slash, double the backslash as shown
below, or use a raw string (r"...").

import pandas as pd
mydata = pd.read_csv("C:\\Users\\smitha\\Documents\\file1.csv")
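
The backslash pitfall can be checked without pandas. The sketch below uses only the standard library. The path in `risky` is a hypothetical example chosen because it happens to contain the escape sequences \n and \t.

```python
from pathlib import PureWindowsPath

# Hypothetical path: "\n" becomes a newline and "\t" becomes a tab
risky = "C:\new\table.csv"

# Three safe spellings of the same path
escaped = "C:\\Users\\smitha\\Documents\\file1.csv"
raw = r"C:\Users\smitha\Documents\file1.csv"
forward = "C:/Users/smitha/Documents/file1.csv"

print("\n" in risky, "\t" in risky)                       # path was silently corrupted
print(escaped == raw)                                     # identical strings
print(PureWindowsPath(forward) == PureWindowsPath(raw))   # same Windows path
```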
• Import files from URL
mydata = pd.read_csv("http://winterolympicsmedals.com/medals.csv")

• Read Text File
mydata = pd.read_table("C:\\Users\\Deepanshu\\Desktop\\example2.txt")

• Read Excel File
mydata = pd.read_excel("https://www.eia.gov/dnav/pet/hist_xls/RBRTEd.xls",
                       sheet_name="Data 1", skiprows=2)

• Read delimited file
mydata2 = pd.read_table("http://www.ssc.wisc.edu/~bhansen/econometrics/invest.dat",
                        sep=r"\s+", header=None)
• Read SAS File
mydata4 = pd.read_sas('cars.sas7bdat')

• Read Stata file
mydata41 = pd.read_stata('cars.dta')

• Import R Data File
Using the pyreadr package, you can load .RData and .Rds files, which generally
contain R data frames.

• Read from SQL table
import pandas as pd
import pyodbc
# Replace the placeholder server, user, password, and database with your own.
conn = pyodbc.connect(
    "Driver={SQL Server};Server=serverName;UID=UserName;PWD=Password;Database=RCO_DW;"
)
df = pd.read_sql_query('select * from dbo.Table WHERE ID > 10', conn)
df.head()
What is Hypothesis?
• A hypothesis is an assumption, an idea that is proposed for the sake
of argument so that it can be tested to see if it might be true.
• Independent & dependent variables

• An independent variable stands on its own and is not changed by
other variables.

• The dependent variable depends on other factors.
Types of Hypotheses
The most common forms of hypotheses are:

• Simple Hypothesis

• Complex Hypothesis

• Null Hypothesis

• Alternative Hypothesis

• Logical Hypothesis

• Empirical Hypothesis

• Statistical Hypothesis
Simple Hypothesis
• A simple hypothesis predicts the relationship between two variables:
the independent variable and the dependent variable.

• Drinking sugary drinks daily leads to being overweight.

• Smoking cigarettes daily leads to lung cancer.

• Getting at least 8 hours of sleep can make people more alert.


Complex hypothesis
• A complex hypothesis describes a relationship between variables.
However, it’s a relationship between two or more independent
variables and two or more dependent variables.

• Adults who 1) drink sugary beverages on a daily basis and 2) have a
family history of health issues are more likely to 1) become
overweight and 2) develop diabetes or other health issues.
Null hypothesis
• A null hypothesis, denoted by H0, proposes that two factors or groups
are unrelated and that there is no difference between certain
characteristics of a population or process.

• There is no significant change in an individual’s work habits whether
they get eight hours or nine hours of sleep.
Empirical hypothesis
• An empirical hypothesis, or working hypothesis, comes to life when a
theory is being put to the test using observation and experiment. It's
no longer just an idea or notion. Rather, it is going through trial and
error and perhaps changing around those independent variables.
Statistical Hypothesis
• A statistical hypothesis is an examination of a
portion of a population or statistical model. In
this type of analysis, you use statistical
information from an area.
• For example, if you wanted to conduct a study
on the life expectancy of people from Coorg, you
would ideally examine every single resident of
Coorg. Because that is not practical, you would
instead test the hypothesis on a sample of the population.
• Example: 50% of Coorg’s population lives beyond the age
of 70.
Alternative Hypothesis
• An alternative hypothesis, denoted by H1 or HA, is
a claim that is contradictory to the null hypothesis.
Researchers pair the alternative hypothesis
with the null hypothesis and test whether the data
give enough evidence to reject the null in its favor.

• Work habits improve during the times when one
gets 8 hours of sleep only, as opposed to 9 hours
of sleep only.
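
One way to see how a null and an alternative hypothesis are weighed against data is a permutation test. The sketch below is only an illustration: the productivity scores are invented, the function name is hypothetical, and the 0.05 threshold is a common convention rather than something stated in the slides.

```python
import random

def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        # Under H0 the group labels are interchangeable, so shuffle them
        rng.shuffle(pooled)
        first, second = pooled[:n_a], pooled[n_a:]
        diff = abs(sum(first) / len(first) - sum(second) / len(second))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations

# Invented productivity scores, for illustration only
eight_hours = [72, 75, 78, 74, 77, 73, 76, 79]
nine_hours = [71, 76, 74, 78, 73, 75, 77, 72]

# H0: sleep duration makes no difference; H1: it does
p_value = permutation_p_value(eight_hours, nine_hours)
alpha = 0.05
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"p-value = {p_value:.3f}: {decision}")
```

A small p-value means a difference this large would rarely arise if the null hypothesis were true, which counts as evidence for the alternative. A large p-value does not prove the null hypothesis; it only means the data do not contradict it.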
Logical Hypothesis
• A logical hypothesis is a proposed explanation using limited evidence.
Generally, you want to turn a logical hypothesis into an empirical
hypothesis, putting your theories or postulations to the test.

• Creatures found at the bottom of the ocean use anaerobic respiration
rather than aerobic respiration.
