
Unit 1 Computational Statistics

Statistics is the science of collecting, organizing, summarizing, analyzing, and interpreting information, which helps in drawing conclusions from data. The document explains different types of data, including categorical and numerical, and their subcategories, as well as univariate, bivariate, and multivariate analysis. It also covers key statistical concepts such as mean, median, mode, variance, standard deviation, and harmonic mean.

Uploaded by

anseltemp

Computational Statistics

Unit 1

Q. What is statistics?
Ans. Statistics is the science of collecting, organizing, summarizing, analysing, and
interpreting information. Good statistics let us draw conclusions about a population from
a sample. In many situations in life, statistics help us separate what we know from what
we don't know.
For example, statistics can turn a vague statement like "This medication may cause nausea" or
"You could die if you don't take this medication" into a specific statement like "Three in one
thousand patients experienced nausea when they took this medication" or "If you don't
take this medication, there is a 95% chance that you will die." Without statistics, the
interpretation of data can quickly become badly flawed, which is why statistics is
needed.

Statistical Data
Q. Categorical Data
Ans. Categorical data is information that is stored and identified by names or labels.
It is a type of qualitative data that is grouped into categories instead of being
measured numerically. Categorical values are usually given as natural-language
descriptions rather than numbers. Numbers can sometimes represent them, but those numbers
don’t mean anything mathematically.
For example: birthdate, favourite sport, hair colour, postcode. This data type is made up of
categorical variables that show things like a person’s gender, hometown, and so on. In the
above example, both the birthdate and the postcode are made up of numbers, yet they are
still regarded as categorical data.
Checking whether an average makes sense is a simple way to tell categorical data from
numerical data. If a meaningful average can be calculated, the data is numerical. If it
cannot, the data is categorical.
Types:
a) Boolean: Boolean data can have only two possible values. For example:
female/male, smoker/non-smoker, True/False.
b) Nominal: Sometimes classifications require more than two categories that have no natural
order. Such data is called nominal data. Example: married/single/divorced.
c) Ordinal: In contrast to nominal data, ordinal data are ordered and have a logical sequence.
Example: very few/few/some/many/very many.
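The difference between nominal and ordinal data can be illustrated with a small Python sketch. It uses the ordinal scale from the example above; the names `ORDER` and `RANK` and the list of responses are made up for illustration.

```python
# Ordinal categories from the notes, listed from lowest to highest.
ORDER = ["very few", "few", "some", "many", "very many"]

# Map each label to its position so labels can be compared by rank.
RANK = {label: position for position, label in enumerate(ORDER)}

# Hypothetical survey responses (illustrative data only).
responses = ["many", "few", "very many", "some", "few"]

# Sorting by rank respects the logical sequence of the categories.
sorted_responses = sorted(responses, key=lambda label: RANK[label])
print(sorted_responses)
# ['few', 'few', 'some', 'many', 'very many']
```

Nominal labels such as married/single/divorced have no such ranking, so sorting them would only be alphabetical and would carry no meaning.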
Q. Numerical Data

Ans. Numerical data, as the name suggests, consists of numbers. It represents quantitative
information and can be measured or counted. This data type is often used to perform
mathematical operations and statistical analysis. It is a cornerstone in making informed
decisions, drawing conclusions, and discovering patterns. A numerical variable is one whose
values are numbers that can be measured or counted.

Example: age, weight, and test results.

Numerical data variables can be further categorized into two main types: discrete and
continuous data.

1. Discrete Data

Discrete data consists of distinct and separate values. These values are typically integers and
do not have fractional or decimal components. Example: number of students in a class,
number of cars in a parking lot, number of customer complaints. You can’t have 0.5 of a
complaint or 1.5 students.

2. Continuous Data

Continuous data, on the other hand, can take any value within a specific range. These values
can be integers or decimals. Example: height of individuals (6 ft 1 in), temperature (23.2 °C),
weight (23.5 kg).

Q. Univariate, Bivariate and Multivariate Analysis

Ans.

| Aspect | Univariate | Bivariate | Multivariate |
| --- | --- | --- | --- |
| Definition | Each observation or data point corresponds to a single variable. | Each observation involves two different variables, and the analysis focuses on understanding the relationship or association between them. | Each observation or sample point consists of multiple variables or features. |
| Causes and relationships | Does not deal with causes and relationships. | Deals with causes and relationships between the two variables. | Deals with relationships among more than two variables. |
| Dependent variables | Contains no dependent variable. | Contains only one dependent variable. | Similar to bivariate, but contains more than 2 variables. |
| Main purpose | To describe. | To explain. | To study the relationships among the variables. |
| Example | Height. | Temperature and ice sales during the summer vacation. | An advertiser wants to compare the popularity of four advertisements on a website. Click rates can be measured for both men and women, and the relationships between the variables can be examined. |
| Common visualizations | Histograms, box plots, and bar charts. | Scatter plots, correlation matrices, and line graphs. | 3D scatter plots, heatmaps, and parallel coordinate plots. |
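One common way to quantify a bivariate relationship, such as temperature versus ice sales, is a correlation coefficient. The sketch below is only an illustration: the numbers are invented for demonstration, and `pearson_r` is a hypothetical helper name that does not come from these notes.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical data: daily temperature (°C) and units of ice sold.
temperature = [20, 24, 28, 32, 36]
ice_sales = [110, 150, 185, 230, 260]

r = pearson_r(temperature, ice_sales)
print(r)  # close to 1, i.e. a strong positive association
```

A value near +1 or -1 suggests a strong linear association, while a value near 0 suggests little linear association. Correlation alone does not establish cause.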

Q. Mean, Median, Mode, Variance, Standard Deviation, Harmonic Mean

Mean: The arithmetic mean of a variable, often called the average, is computed by adding
up all the values and dividing by the total number of values.

Eg: Data set: 1, 2, 3, 4, 5 Mean = (1+2+3+4+5)/5 = 15/5 = 3
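The same calculation can be sketched in Python. The function name `arithmetic_mean` is illustrative; the standard library's `statistics.mean` performs the same computation.

```python
from statistics import mean

def arithmetic_mean(values):
    """Add up all the values and divide by how many there are."""
    return sum(values) / len(values)

data = [1, 2, 3, 4, 5]
print(arithmetic_mean(data))  # 3.0
```

For this data set, `mean(data)` from the `statistics` module gives an equal result.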

Median: The median of a variable is the middle value of the data set when the data are
sorted in order from least to greatest. It splits the data into two equal halves with 50% of
the data below the median and 50% above the median.

Data Set (n = 11, odd): 23, 27, 29, 31, 35, 39, 40, 42, 44, 47, 51

Median: 39 (the 6th of the 11 sorted values)

To calculate the median with an even number of values (n is even), first sort the data from
smallest to largest and take the average of the two middle values.

Data Set (n = 10, even): 23, 27, 29, 31, 35, 39, 40, 42, 44, 47

Median: (35 + 39)/2 = 37
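The odd and even cases can be sketched in Python using the two data sets above. The function name `median_of` is illustrative; the standard library's `statistics.median` follows the same rule.

```python
from statistics import median

def median_of(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: a single middle value exists.
        return ordered[mid]
    # Even n: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_data = [23, 27, 29, 31, 35, 39, 40, 42, 44, 47, 51]
even_data = [23, 27, 29, 31, 35, 39, 40, 42, 44, 47]

print(median_of(odd_data))   # 39
print(median_of(even_data))  # 37.0
```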

Mode: The mode is the most frequently occurring value in the dataset.
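A minimal Python sketch of finding the mode follows. The helper name `modes` and the sample data are illustrative; the function returns a list because a data set can have more than one most frequent value.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]

sample = [2, 3, 3, 5, 7, 3, 5]  # illustrative data
print(modes(sample))            # [3]
print(modes([1, 1, 2, 2, 3]))   # [1, 2]
```
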
Variance: The sample variance is the sum of the squared differences between each value and
the arithmetic mean, divided by the number of elements minus 1.
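The definition with the "number of elements minus 1" divisor can be sketched in Python as follows. The function name `sample_variance` is illustrative; `statistics.variance` in the standard library uses the same divisor.

```python
from statistics import variance

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_diffs = [(x - mean) ** 2 for x in values]
    return sum(squared_diffs) / (n - 1)

data = [1, 2, 3, 4, 5]
# Mean is 3; squared deviations are 4, 1, 0, 1, 4; their sum is 10; 10 / 4 = 2.5
print(sample_variance(data))  # 2.5
```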

Standard Deviation: It measures how spread out the data are around the mean. The standard
deviation is the square root of the variance.
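As a sketch, the standard deviation can be computed by taking the square root of the sample variance (the version that divides by n - 1). The function name `sample_std_dev` is illustrative; `statistics.stdev` gives the same quantity.

```python
import math
from statistics import stdev

def sample_std_dev(values):
    """Square root of the sample variance (divisor n - 1)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((x - mean) ** 2 for x in values) / (n - 1)
    return math.sqrt(var)

data = [1, 2, 3, 4, 5]
sd = sample_std_dev(data)
print(round(sd, 4))  # 1.5811
```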

Harmonic Mean: The Harmonic Mean (HM) is defined as the ratio of the number of elements in
the data set to the sum of the reciprocals of the values.
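A short Python sketch of this definition follows. The function name `harmonic_mean_of` and the two sample values are illustrative; the standard library offers `statistics.harmonic_mean` for the same calculation. The sketch assumes all values are positive and non-zero.

```python
import math
from statistics import harmonic_mean

def harmonic_mean_of(values):
    """Number of elements divided by the sum of the reciprocals of the values."""
    return len(values) / sum(1 / v for v in values)

speeds = [40, 60]  # illustrative data
hm = harmonic_mean_of(speeds)
# 2 / (1/40 + 1/60) = 2 / (5/120) = 48
print(hm)  # approximately 48
```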
