Statistics is a branch of science focused on data collection, organization, analysis, and interpretation, divided into descriptive and inferential statistics. Descriptive statistics summarize sample data using measures like mean and median, while inferential statistics draw conclusions about populations based on sample data. The document also discusses variables, their types, and levels of measurement, emphasizing the importance of understanding data characteristics for effective statistical analysis.
1 Statistics-And-Its-Bg
Statistics [as a science] is a branch of science dealing with data collection, organization, presentation, analysis, and interpretation.

Statistics [as a measure] is any descriptive form of measurement, such as a mean, median, or standard deviation, computed from sample data.

Statistics, as a science, has two divisions:

Descriptive Statistics
• It is concerned with developing and utilizing techniques for the effective presentation of numerical information.
• It focuses on the task of collecting, organizing, and presenting the data.
• It aims to summarize the sample using statistical or descriptive measures such as the average, median, standard deviation, and variance.
For example, if we look at a football team’s scores over a particular season, we can compute the average score, the typical score, the highest and lowest scores, the variance, and so on, and get a statistical description or profile for the team.

Inferential Statistics (or analytical statistics)
• It is concerned with developing and utilizing techniques for properly analyzing numerical information.
• It focuses on inductive and deductive reasoning.
• It aims to draw a conclusion about the population from the sample at hand.
For example, a medical doctor may try to infer the success rate of arachidonic acid (a polyunsaturated fatty acid that is present in the body, is abundant in the brain, muscles, and liver, and is found in certain foods such as meat, egg yolk, and shellfish) in children’s diets in decreasing the risk of asthma.

In short:
• Descriptive statistics describes the sample using statistical measures.
• Inferential statistics draws conclusions about the population using the data from the sample.
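The football example under Descriptive Statistics can be sketched in a few lines of Python. This is only an illustration: the scores are invented, the function name `describe` is hypothetical, and the standard-library `statistics` module is one of several ways to compute these measures.

```python
import statistics

# Hypothetical scores for one team over a ten-game season (invented data).
scores = [2, 0, 3, 1, 1, 4, 2, 1, 0, 1]

def describe(values):
    """Return a descriptive profile of a sample of numeric values."""
    return {
        "mean": statistics.mean(values),          # the average score
        "median": statistics.median(values),      # the middle score
        "mode": statistics.mode(values),          # the most frequent score
        "lowest": min(values),
        "highest": max(values),
        "variance": statistics.variance(values),  # sample variance (n - 1 denominator)
        "std_dev": statistics.stdev(values),      # sample standard deviation
    }

profile = describe(scores)
for name, value in profile.items():
    print(f"{name}: {value}")
```

Every number printed here describes only the ten games in the sample. Saying anything about the team's future seasons would be a matter for inferential statistics.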
Identify whether each statement describes inferential statistics or descriptive statistics:
a) The average age of the students in a statistics class is 21 years. (Descriptive)
b) The chances of winning the California Lottery are one chance in twenty-two million. (Inferential)
c) There is a relationship between smoking cigarettes and getting emphysema. (Inferential)
d) From past figures, it is predicted that 39% of the registered voters in California will vote in the June primary. (Inferential)

STATISTICAL INVESTIGATION focuses on people or things whose characteristics someone is interested in, in order to seek information and learn more about a real-world situation.
The elementary unit is the single person or object possessing the characteristics that interest the statistician.

The frame is the complete list of all elementary units pertinent to a statistical investigation.

The datum {pl. data} is any observation about a specified characteristic of interest. It is the basic unit of a statistician’s raw material.

The Data Set is any collection of observations about one or more characteristics of interest, for one or more elementary units. A data set can be univariate (possessing only one characteristic), bivariate, or multivariate.

In statistics, the term population takes on a meaning slightly different from its everyday use. The population in statistics includes all members of a defined group that is under study. It also refers to the entire collection of all possible observations about specified characteristics of interest.

A portion or a part of the population is called a “sample”. If a sample is randomly drawn from the population, it is expected to be representative of the population, that is, to reflect its characteristics. Samples are randomly drawn when (1) each member of the population is given an equal chance of becoming a part of the sample, and (2) the selection of one member is independent of the selection of any other member.

Suppose there are 5,000 students in a particular college and 50 students are taken to form the sample. The names of the 5,000 students may be written on pieces of paper, put in a box, and shuffled well. From that box, we draw 50 names. The 50 names drawn constitute the sample, or more specifically, the random sample, while the 5,000 students are the population. Since the 50 students are a random sample, results obtained from this sample can be generalized to the 5,000 students in that college; and if the characteristics of the 50 students are compared with those of the 5,000 students, one will notice their similarities.
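The drawing of 50 names from a box of 5,000 can be imitated in Python as a rough sketch. The student labels and the function name `draw_random_sample` are hypothetical, and the `seed` parameter is an added assumption that only makes the draw repeatable. Like pulling slips of paper from a box, `random.sample` draws without replacement, so no student can be picked twice.

```python
import random

def draw_random_sample(population, n, seed=None):
    """Draw n distinct members so that every member has the same chance of selection."""
    rng = random.Random(seed)         # seed is optional; it only makes the draw repeatable
    return rng.sample(population, n)  # sampling without replacement

# Hypothetical population: 5,000 student labels standing in for names on slips of paper.
population = [f"student_{i:04d}" for i in range(1, 5001)]

sample = draw_random_sample(population, 50, seed=1)

print("Population size N =", len(population))
print("Sample size n =", len(sample))
print("First five names drawn:", sample[:5])
```

Running the function with a different seed (or none) gives a different set of 50 students, which is why conclusions drawn from any one sample carry some uncertainty.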
The number of items in a FINITE POPULATION is known as the population size, denoted by the letter N, and the number of items in a sample, the sample size, is denoted by n. Thus, in our example, we have a sample of size n = 50 selected from a population of size N = 5,000.

A variable is a characteristic of the members of a sample or population that assumes a succession of values from one observation to the next.

What is the difference between quantitative and qualitative variables? Quantitative data is numbers-based, countable, or measurable. Qualitative data is interpretation-based, descriptive, and related to language. Quantitative data tells us how many, how much, or how often. Qualitative data can help us understand why, how, or what happened behind certain behaviors.

QUALITATIVE VARIABLE. It is a variable that is not expressed numerically because it differs in kind rather than in degree among elementary units. These variables can be dichotomous or multinomial.
DICHOTOMOUS VARIABLE. Observations about this variable can be made
only in two categories. For example, male or female, employed or unemployed, correct or incorrect, defective or non-defective, absent or present, etc.
MULTINOMIAL VARIABLE. Observations about this variable can be made in more than two categories. For example, job title, color, language, religion, type of business, etc.

QUANTITATIVE VARIABLE. It is a variable that is normally expressed numerically because it differs in degree rather than in kind among the members of the group. These variables can be discrete or continuous.

DISCRETE VARIABLE. Produces numerical responses that arise from count data. This type of variable assumes values only at specific points on a scale of values, with gaps between them, such as the number of children in a family, number of students in a classroom, number of schools in Manila, number of T-shirts produced by a manufacturer, number of persons afflicted with H-fever, etc.
CONTINUOUS VARIABLE. Produces numerical responses that arise from measured data. Observations of this type of variable can assume values at all points on a scale of values, with no breaks between possible values, such as height, weight, volume, temperature, etc.

Variables are sometimes specified according to their intended use. In many studies, it is customary to record more than one variable per case. It is often the aim of a study to determine if and how one or more variables affect another. In general, identifying the use of the variables amounts to deciding which variable(s) will be used to predict another. These variables may be considered dependent or independent variables.

DEPENDENT VARIABLE OR PREDICTED VARIABLE. It is the variable you would be interested in predicting. It is the outcome of the study, which is why it is sometimes called the OUTCOME VARIABLE.

INDEPENDENT VARIABLE OR PREDICTOR VARIABLE. It is a variable that explains the dependent variable. It is sometimes called an experimental variable since, in an experiment, this is the variable that is manipulated or controlled to observe its effect on another (the outcome variable).

The level of measurement of a variable is a categorization used to describe the type of data or information acquired from each elementary unit. These levels or types of data were proposed by Stanley Smith Stevens in his 1946 article “On the Theory of Scales of Measurement”.
• The nominal scale/measurement simply categorizes variables according to qualitative labels (or names). These labels and groupings don’t have any order or hierarchy to them, nor do they convey any numerical value.
• For example, the variable “hair color” could be measured on a nominal scale according to the following categories: blonde hair, brown hair, gray hair, and so on.
The nominal scale is the weakest level of measurement.
Nominal measurements are numbers assigned to the objects or elements of the data set to label differences in kind; they serve to classify observations about qualitative variables into mutually exclusive groups. If two elements have the same nominal number, they belong to the same category. This is the only significance that nominal measurements have, which is why nominal data are also called categorical data or categorical variables.

The ordinal scale also categorizes variables into labeled groups, and these categories have an order or hierarchy to them. For example, you could measure the variable “income” on an ordinal scale as follows: low income, medium income, high income. Another example could be level of education, classified as follows: high school, master’s degree, doctorate. These are still qualitative labels (as with the nominal scale), but you can see that they follow a hierarchical order.

The next level of measurement is the ordinal measurement. These are numbers that produce a distinct ordering, ranking, or arrangement of data. The intervals between the numbers and the ratios of such numbers are meaningless because they do not provide information on how much more or less of the characteristic the various items possess. Thus, like nominal data, ordinal data cannot be added, subtracted, multiplied, or divided. The statistical measures that may be computed at this level are the mode and median, but the mean is not defined.

The interval scale is a numerical scale that labels and orders variables, with a known, evenly spaced interval between each of the values. An oft-cited example of interval data is temperature in degrees Fahrenheit, where the difference between 10 and 20 degrees Fahrenheit is the same as the difference between, say, 50 and 60 degrees Fahrenheit.

The next, more complex level is the interval measurement. Interval data have all the features of ordinal measurements.
The operation that can be performed at the interval level is subtraction; the differences between measurements represent equivalent intervals. Negative values can be used. Addition may be performed but is not always meaningful. Ratios of interval values are meaningless since the scale does not possess a meaningful origin or true zero point; the value zero is arbitrarily chosen. Thus, multiplication and division will give meaningless values.

The ratio scale is the same as the interval scale, with one key difference: the ratio scale has what’s known as a “true zero.” A good example of ratio data is weight in kilograms. If something weighs zero kilograms, it truly weighs nothing. Compare this with temperature in degrees Fahrenheit (interval data), where a value of zero degrees doesn’t mean there is “no temperature”; it is simply another point on the scale.

The highest level of measurement is the ratio level. Ratio data possess the characteristics of ordinal and interval data, and in addition both the differences between values and their ratios are meaningful. All types of arithmetic operations can be performed with such data because these numbers have a natural or true zero point that denotes the complete absence of the characteristic they measure, thus making the ratio of any two such numbers independent of the unit of measurement.
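The claims about which operations are meaningful at each level can be checked with a short Python sketch. All data here are invented, the rank coding of the income brackets is an assumption made for illustration, and the kilogram-to-pound factor is approximate. The first part summarizes ordinal data with the mode and median only; the second shows that a ratio of temperatures changes when the unit changes, while a ratio of weights does not.

```python
import math
from statistics import median_low, mode

# Ordinal data: the mode and the median are defined, the mean is not.
# Hypothetical income brackets; the rank coding below is an assumption.
RANK = {"low income": 1, "medium income": 2, "high income": 3}
LABEL = {rank: label for label, rank in RANK.items()}

incomes = ["low income", "high income", "medium income", "low income",
           "medium income", "low income", "high income"]

ordinal_mode = mode(incomes)
# median_low always returns an actual data value, so it maps back to a label.
ordinal_median = LABEL[median_low([RANK[x] for x in incomes])]

# Interval data: differences are meaningful, ratios are not.
def fahrenheit_to_celsius(f):
    return (f - 32) * 5 / 9

ratio_in_f = 20 / 10  # "twice as hot" in Fahrenheit...
ratio_in_c = fahrenheit_to_celsius(20) / fahrenheit_to_celsius(10)  # ...but not in Celsius

# Equal Fahrenheit intervals remain equal after converting to Celsius.
diff_low = fahrenheit_to_celsius(20) - fahrenheit_to_celsius(10)
diff_high = fahrenheit_to_celsius(60) - fahrenheit_to_celsius(50)

# Ratio data: ratios do not depend on the unit of measurement.
KG_TO_LB = 2.20462  # approximate conversion factor
ratio_in_kg = 10 / 5
ratio_in_lb = (10 * KG_TO_LB) / (5 * KG_TO_LB)

print("Ordinal mode:", ordinal_mode)
print("Ordinal median:", ordinal_median)
print("Temperature ratio in F vs C:", ratio_in_f, ratio_in_c)
print("Equal intervals preserved:", math.isclose(diff_low, diff_high))
print("Weight ratio in kg vs lb:", ratio_in_kg, ratio_in_lb)
```

The temperature ratio differs between the two units because each scale places its zero at an arbitrary point, whereas zero kilograms and zero pounds both mean no weight at all.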