Learning Unit 8 - 10044701

This document provides an introduction to data analysis and interpretation. It discusses both quantitative and qualitative data analysis techniques. For quantitative data, it covers descriptive statistics, frequency distribution tables, graphs, measures of central tendency, measures of variability, and relationships between variables. For qualitative data, it discusses thematic analysis, constant comparative analysis, narrative analysis, and phenomenological analysis. The overall purpose is to introduce fundamental methods for analyzing and interpreting data in the social sciences.


Learning unit 8 | RSC2601

Data analysis and interpretation

TABLE OF CONTENTS

8.1. INTRODUCTION
8.2. LEARNING OUTCOMES
8.3. DEFINING KEY CONCEPTS
8.4. QUANTITATIVE DATA ANALYSIS
8.4.1. Descriptive statistics
8.4.2. Tables and graphs
8.4.3. Measures of central tendency
8.4.4. Measures of variability
8.4.5. Relationships
8.5. QUALITATIVE DATA ANALYSIS
8.5.1. The meaning of qualitative data analysis
8.5.2. The purpose of qualitative data analysis
8.5.3. Analysing and interpreting qualitative data
8.5.4. Methods of qualitative data analysis
8.6. CONCLUSION
8.7. SELF-EVALUATION ASSESSMENT
8.8. ADDITIONAL LEARNING EXPERIENCES
8.9. OPEN EDUCATIONAL RESOURCE
8.10. REFERENCES

LEARNING UNIT 8: DATA ANALYSIS AND INTERPRETATION

Bongani Mtshweni and Lekganyane Maditobane

8.1. INTRODUCTION

This learning unit introduces you to fundamental methods used to analyse and
interpret quantitative and qualitative data in the social sciences. The unit outlines basic
steps and techniques used to summarise voluminous data from a research study into
meaningful and easily comprehensible information. In brief, we introduce you to
essential quantitative data analysis techniques such as
descriptive statistics, frequency distribution tables, graphs, measures of central
tendency, and correlations between study variables. We also introduce you to
methods used to analyse and interpret qualitative data, which include, among others,
thematic analysis, constant comparative analysis, narrative analysis and
phenomenological analysis.

8.2. LEARNING OUTCOMES

After the completion of this learning unit/lesson, you should be able to:
• explain the meaning of data analysis
• distinguish between quantitative and qualitative data analysis techniques
• interpret basic aspects of quantitative and qualitative data
• explain the role of descriptive statistics
• use frequency distribution tables and graphs to analyse your data
• distinguish between measures of central tendency — mean, median and mode
• distinguish between measures of variability — range, variance and standard
deviation
• explain the concept of correlation in research and interpret different relationship
patterns modelled on a scatter plot
• draw up a scatter plot based on a raw data set
• explain the concept of qualitative data analysis
• describe the purpose of qualitative data analysis

• identify, define and explain the various strategies for analysing qualitative data
• describe a stepwise format or plan, with appropriate examples, for analysing data
using each of the various data analysis strategies
• implement the various qualitative data analysis strategies to analyse qualitative
data, with appropriate examples in a stepwise format
• explain how qualitative data can be interpreted

8.3. DEFINING KEY CONCEPTS

• Constant comparative analysis is an iterative and inductive analysis process,
which involves the reduction of qualitative data by means of constant recoding
(Fram, 2013).
• Conversation analysis is a form of qualitative data analysis that aims to
investigate social interactions as a means of producing and securing the social
order. Researchers who work from a conversation analysis approach are
primarily interested in the everyday situations of their research participants
(Mezmir, 2020).
• Correlational research design refers to a type of non-experimental research,
in which the researcher measures two variables and assesses the statistical
relationship (i.e., the correlation) between them (Jhangiani, Chiang, Cuttler &
Leighton, 2019).
• Descriptive statistics provide summarised or organised information on the
characteristics of one or more datasets, which allows researchers to
understand key features of the data (Lee, 2019).
• Frequency distribution refers to a table or graph that indicates how individual
observations or scores are distributed in the measurement scale (Manikandan,
2011).
• Narrative analysis is a form of qualitative analysis of one person’s life,
as told through many interviews and interactions in the field. The focus is on
features such as gestures, sounds and the dynamics around their speech acts,
with the ultimate purpose of understanding their biographical stories (Katz-
Buonincontro, 2022).

• Phenomenological analysis is a form of qualitative data analysis, through
which researchers investigate the phenomenon by capturing the participants’
experiences as closely as possible (Isabirye & Makoe, 2018).
• Qualitative data analysis refers to a process in qualitative research through
which researchers transform raw data by searching, evaluating, recognising,
coding, mapping, exploring and describing patterns, trends, themes and
categories, with the purpose of interpreting them and providing their underlying
meanings (Ngulube, 2015).
• Thematic analysis is a form of qualitative analysis seeking to develop themes
and subthemes out of the raw data (Braun & Clarke, 2006). It seeks to identify,
analyse and report patterns or themes within the data by minimally organising
and describing the data in detail (Braun & Clarke, 2006).

8.4. QUANTITATIVE DATA ANALYSIS

When conducting research, we often collect large volumes of information, called data.
The information collected is often unstructured and not yet coherent, logical or meaningful. In
other words, one cannot really make sense of it. To ensure that the data we collect is
meaningful and understandable, we first need to analyse or break it down in ways that
will make it comprehensible. This process is known as data analysis 1. Data analysis
refers to the process of organising the collected data, in some order or format, in order
to make meaning (Bhome et al., 2013) of the phenomenon under study. In the previous
study unit, you learnt about different ways in which we collect information (data). This
information is often presented in the form of numbers. For example: the number of
people in a survey who indicated that they have been victims of armed robbery; how
a sample of respondents rated a food product on a five-point scale (with higher scores
indicating a more positive rating); or how a group of students scored on a verbal
reasoning test. Also, a great deal of non-numerical data can be represented in a
numerical form. This involves coding or assigning certain numbers to the categories
of a variable. An example would be to code male as 1 and female as 2. Have you ever
completed a questionnaire where the code categories have already been placed on
the questionnaire? This means that, instead of just indicating that you are “male” or
“female”, you mark the 1 or the 2. Such a questionnaire has been pre-coded. The
reason for coding is that we need to transform our raw data into a format that can be
used in computer analyses.

1
Data analysis refers to the process of organising the information collected in a research study to make
meaning of the phenomenon under investigation.
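The coding step described in this section (male coded as 1 and female as 2) can be sketched in Python. This is only a minimal illustration: the list of answers and the names `GENDER_CODES` and `encode` are hypothetical and do not come from the study material.

```python
# Hypothetical codebook following the example in the text: male = 1, female = 2.
GENDER_CODES = {"male": 1, "female": 2}


def encode(responses, codebook):
    """Replace each category label with its numeric code."""
    return [codebook[label.strip().lower()] for label in responses]


# Hypothetical raw answers as they might be captured from a questionnaire.
raw_answers = ["Female", "Male", "female", "Female", "male"]
coded_answers = encode(raw_answers, GENDER_CODES)
print(coded_answers)
```

The codes are only labels for categories. Because gender is a nominal variable, arithmetic on the codes (for example, averaging them) would not be meaningful.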

We should keep in mind that data collection is not an end in itself, but forms part of a
research process aimed at answering a specific research question. In this study unit,
we will show you how you can organise numerical or quantitative data, to help you
meet this aim. When designing your research (i.e., before the data are collected), you
already need to consider what you are going to do with your data. This will ensure that
the data you collect can be analysed in a way that will provide answers to all your
research questions.

Activity 8.1

Before we discuss how to organise and analyse your data, we would like you to think
about examples from your own life (or from what family or friends have told you), where
you have provided information about yourself, whether it was information about your
feelings, attitudes, opinions, experiences, etcetera. Maybe, you completed a
questionnaire on your vocational interests at school or, perhaps, you were asked
which political party you would vote for in an election. Another example is the forms
we find in restaurants, shops, service stations, etcetera, for rating service quality. We
also often find questionnaires in magazines (e.g., on our attitude towards abortion or
our beliefs about Aids). All these are examples of data collection, since some
information was sought from you. Data collection is a process that precedes data
analysis in the steps of the research process.

8.4.1. Descriptive statistics

The original data we collect consist of lists of measurements or numbers. If it is a large
data set, it can be difficult to form an overall impression of the answer to the research
question. The procedures used to organise, summarise and visualise quantitative data
are referred to as descriptive statistics 2. These statistics help the researcher to
identify underlying patterns in the data and (if the research was done scientifically) to
use this as evidence for his or her arguments and claims about the topic the researcher
investigated. Statistics are often used in both popular literature (magazines,
newspapers, etc.) and scientific articles to support an argument.

2
Descriptive statistics refers to mathematical techniques used to see underlying patterns of data.

However, it is dangerous to regard statistics as conclusive proof of some argument or
viewpoint. This is because statistics can be abused to support the researcher’s own
beliefs and values. In social science research, we therefore have certain criteria for
publishing research results. These criteria state that it should be clear in the report
how the data and the statistics, based on these data, have been obtained. Also, the
report should contain sufficient information for other researchers to interpret the
statistics and to come to their own conclusions.

Activity 8.2

The Human Sciences Research Council (HSRC) often conducts surveys on South
African social attitudes. The surveys focus on a range of issues, such as poverty,
inequality, racial redress, and service delivery, among others. In 2010, the HSRC
published a report from a survey on South African Social Attitudes (HSRC, 2010). In
one of the questions in the survey, 5 583 South Africans were asked whether their
local council became more efficient at responding to their needs over the past five
years. Of the 5 583 participants, 29% responded “Yes”, 46% responded “No”, and
25% responded “Don’t know” to the question. The results, therefore, show that “No” was
the most common response. Do you agree with the
conclusion that local councils did not become more efficient at responding to the
needs of South Africans over the past 5 years, and what are you basing your answer
on? Figure 8.1 shows a pie chart based on participants’ responses to this question.

[Pie chart: Yes 29%; No 46%; Don’t know 25%]

Figure 8.1: Local council efficiency in responding to needs in the past 5 years

From the results presented in the pie chart, the largest group of respondents (46%)
indicated that local councils did not become more efficient at responding to their needs
in the past 5 years, 29% indicated that local councils did become more efficient, and
25% responded that they did not know. Because 46% is less than half of the sample,
“No” was the most common answer rather than the answer of a majority.

In the social sciences, it is important to consider the methods which were used to
conduct the study, before accepting research results. The results presented in the
pie chart are based on only 5 583 South Africans who took part in the survey (that is,
a sample rather than the whole South African population). Additionally,
different responses were provided, based on the participants’ experiences with their
local councils. This means that, when interpreting research results, the researcher
needs to consider contextual and related factors (research design, sampling strategy,
participants’ characteristics, etc.) that may have an influence on the results of the study
in order to draw appropriate conclusions.

8.4.2. Tables and graphs

You have seen that the researcher has to provide sufficient information and that you
need to understand the procedures that were used, before you can interpret the
statistical results. We will now work systematically through the procedures used to
compile various descriptive statistics. Being able to interpret descriptive statistics helps
you to evaluate claims more carefully, rather than blindly accepting statistical data.
You also need to be able to apply descriptive statistics if you want to summarise and
report trends in your own data.

8.4.2.1. Frequency distribution and tables

One way in which to summarise data, so that the overall pattern of the data becomes
clear, is to create a frequency distribution. Such a distribution indicates the number of
cases in a data set that obtained a particular score or that fall in a particular category
of a variable. Frequency distribution 3 is therefore the grouping of raw data. Suppose
we obtain scores on a colour awareness test for a sample of first-year engineering
students. Our data set consists of the scores for all the students in the sample. We
group this raw data (the scores) by indicating how many cases (referring to the number
of students or their scores) obtained a score of zero; how many obtained a score of 1;
etcetera. The number of cases is called the frequency of that score, or category, and
the symbol f is used to refer to frequency.

A frequency distribution can be represented by creating a frequency distribution table.
The first column of the table is an ordered list of all the possible scores or a list of the
relevant categories. We then count the number of times each score or category
occurs. To help us count, we use the second column of the table and make a tally
mark every time a score or category is observed. For every fifth mark, a line is drawn
through the previous four marks. This makes counting the number of responses
easier.

You will find that researchers usually do not include the column with the tally marks in
the final presentation of the frequency distribution table. The total frequency is written
in the third column and the sum of these frequencies (if you add them all up) should
be the same as the number of cases in the sample. The categories should be mutually
exclusive (a case cannot be classified in more than one category) and there should be
sufficient categories so that every case can be classified into one of the available
categories.

3
Frequency distribution refers to a table or graph indicating how observations are distributed.

We will use an imaginary study to illustrate tables and graphs. To make it easier for
you to understand the issues involved, we limit the number of cases in the sample.
Suppose a researcher does a study on aggression in adolescents. The researcher obtains
the following information for a convenience sample consisting of 20 secondary school
students: gender (male or female) and scores (ranging from 0 to 40) on an aggression
questionnaire. Even though this is a small sample, the researcher finds it difficult to
form an overall impression of the raw data (see table 8.1). However, once the data are
organised according to gender (see table 8.2), the researcher can immediately see that
more females than males were included in the study. Remember we said that this was
a convenience sample, which means that it is not necessarily representative of the
larger population. Can you see the advantage of descriptive statistics? Even though
you did not take part in the study, you can tell by looking at the table how gender was
distributed in the sample.

Gender is measured on a nominal level of measurement (levels of measurement are
discussed in study unit 7). Frequency distribution tables can also be used with ordinal,
interval, and ratio measurements. In the case of interval and ratio measurements,
there are often so many score categories that it is preferable to use a grouped
frequency table 4. This implies that the scores are grouped into so-called class intervals
that each include a series of scores. Some textbooks provide a step-by-step
explanation on how to choose the class-intervals (Aron & Aron, 1994:7–9; Huysamen,
1981:15–20; Peck, Olsen & Devore, 2015: 28-79; Van Lill & Grieve, 1990:2.9–2.12).
We will not go into such detail, but you should keep in mind that the intervals should
suit the data and there should be enough intervals to include all the data (the classes
should be exhaustive). As in the case of ungrouped frequency distribution tables (table
8.2), the classes should also be mutually exclusive (nobody should fall into more than
one class). Finally, we usually choose class intervals of equal size.

4
Grouped frequency table refers to a frequency distribution table with a limited number of categories.

TABLE 8.1
List of gender and aggression score

Learner      Gender    Aggression score
Mabel        Female    21
Mary         Female    33
Yogan        Male      20
James        Male      38
Alfeus       Male      25
John         Male      23
Eric         Male      28
Connie       Female    21
Peter        Male      30
Lize         Female    36
Emily        Female    34
Catherine    Female    27
Kate         Female    39
Maria        Female    26
Abraham      Male      35
Paulina      Female    22
Pravani      Female    8
Elsie        Female    35
Petrus       Male      24
Joanne       Female    18
n = 20 students

TABLE 8.2
Frequency distribution table for gender

Gender    Tally           Frequency
Male      |||| |||        8
Female    |||| |||| ||    12
n = 20
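The counting behind table 8.2 can be sketched in Python. This is a minimal illustration, not a prescribed procedure: the gender column is copied from table 8.1, the variable names are our own, and the tally is printed as one mark per case rather than grouped in fives.

```python
from collections import Counter

# Gender column of table 8.1, in the order the learners are listed.
genders = [
    "Female", "Female", "Male", "Male", "Male", "Male", "Male", "Female",
    "Male", "Female", "Female", "Female", "Female", "Female", "Male",
    "Female", "Female", "Female", "Male", "Female",
]

# Frequency (f) of each category.
frequencies = Counter(genders)

for category in ("Male", "Female"):
    f = frequencies[category]
    # One tally mark per case, followed by the frequency.
    print(f"{category:<8}{'|' * f:<14}{f}")

# The frequencies should add up to the number of cases in the sample.
n = sum(frequencies.values())
print(f"n = {n}")
```

Checking that the frequencies sum to n is a quick way to confirm that the categories are mutually exclusive and that no case was left unclassified.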

Consider the aggression scores in table 8.1. The highest value is 39 and the lowest
value is 8. If each score from 8 to 39 had to be a separate category, there would be
32 categories. This does not really help us to summarise the data. Table 8.3 is a
grouped frequency distribution of this data and you will see that the data are now
easier to interpret than the original list of aggression scores. It has been simplified and
you can contrast the number of students who obtained a low aggression score with
the number who obtained a high score. We can see that only one student obtained a
very low score (in the lowest interval), while five students obtained a relatively high
aggression score (35–41). Remember that some information is lost in a grouped
frequency distribution. For example, we can see that one person obtained a score
between 7 and 13, but we cannot infer the student’s exact score from the grouped
frequency table. One other thing that you should know about class intervals is that the
midpoint of the interval can be used to represent all the values in a particular interval.
For example, the midpoint of the interval 7–13 in table 8.3 is 10.

TABLE 8.3
Grouped frequency distribution table for aggression scores

Class interval    Tally       Frequency    Cumulative frequency
35–41             ||||        5            20
28–34             ||||        4            15
21–27             |||| |||    8            11
14–20             ||          2            3
7–13              |           1            1
n = 20
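The grouping in table 8.3 can be reproduced with a short Python sketch. The scores come from table 8.1 and the class intervals from table 8.3. The constant names and the way the intervals are generated are our own assumptions, chosen for illustration.

```python
# Aggression scores from table 8.1.
scores = [21, 33, 20, 38, 25, 23, 28, 21, 30, 36,
          34, 27, 39, 26, 35, 22, 8, 35, 24, 18]

LOWEST_LIMIT = 7   # lower limit of the lowest class interval in table 8.3
WIDTH = 7          # each interval covers seven score values (e.g. 7 to 13)
NUM_CLASSES = 5

# Build the class intervals as (lower, upper) pairs: (7, 13), (14, 20), ...
intervals = [
    (LOWEST_LIMIT + i * WIDTH, LOWEST_LIMIT + (i + 1) * WIDTH - 1)
    for i in range(NUM_CLASSES)
]

# Count how many scores fall in each interval.
grouped = {
    (lower, upper): sum(lower <= s <= upper for s in scores)
    for lower, upper in intervals
}

# Print from the highest interval down, as in table 8.3.
for (lower, upper), f in sorted(grouped.items(), reverse=True):
    midpoint = (lower + upper) / 2
    print(f"{lower}-{upper}  f = {f}  midpoint = {midpoint}")
```

Because the intervals do not overlap and together cover every score from 7 to 41, each score is counted exactly once, which is what mutually exclusive and exhaustive classes require.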

Sometimes, we are concerned not with the frequencies within the class intervals, but
with the number of scores (frequencies) “greater than” or “less than” a specified value.
The cumulative frequency (cf) 5 of a class interval is the number of cases in the
specified interval plus all the cases in the lower intervals. In other words, the
cumulative frequency (cf) of a class interval is the number of cases that fall below the
lower limit of the next higher interval. For example, from the last column in table 8.3, we can
conclude that 15 students had a score lower than 35. Can you see that a cumulative
frequency distribution would not be very useful for nominal data such as the data in
table 8.2? For cumulative frequencies to be meaningful, the order of the categories
should make sense. By the way, did you notice that the cumulative frequency of the
highest class interval is equal to the total number of cases? Can you see why?

5
Cumulative frequency refers to a number of scores below (or above) a certain value.
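A cumulative frequency is simply a running total taken from the lowest interval upwards. The sketch below, which uses the frequencies from table 8.3 and variable names of our own choosing, shows one way to compute it in Python.

```python
from itertools import accumulate

# Class intervals and frequencies from table 8.3, ordered from lowest to highest.
intervals = ["7-13", "14-20", "21-27", "28-34", "35-41"]
frequencies = [1, 2, 8, 4, 5]

# Running total: each value is the interval's own frequency plus all lower ones.
cumulative = list(accumulate(frequencies))

for interval, f, cf in zip(intervals, frequencies, cumulative):
    print(f"{interval:<7} f = {f}  cf = {cf}")
```

The running total for the interval 21-27 is 1 + 2 + 8 = 11, and the final value equals the sum of all the frequencies, which is why the cumulative frequency of the highest interval is always n.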

8.4.2.2. Percentages
We have already mentioned percentages in activity 8.2. The percentage of a category,
a score value or a class interval indicates what part of the whole sample of scores that
category, value or class interval represents. Percentage is determined by dividing the
frequency by the total number of cases (n) and then multiplying it by 100 (100%
represents the whole sample). In table 8.3, we presented the frequency and
cumulative frequencies of aggression scores. The distribution of percentages for the
same set of scores is given in table 8.4. Percentages are useful, because not only is
the number of persons in a specific category or class interval taken into account, but
so is the total number of persons in the sample. The class interval 21–27 has the
highest frequency of students (8 students) and this is therefore also the interval with
the highest percentage (40%). But, if our sample included 200 students, 8 students
would represent only 4% of the sample.

TABLE 8.4
Distribution of the percentages and cumulative percentages for aggression scores

Class interval    Percentage    Cumulative percentage
35–41             25%
28–34             20%
21–27             40%
14–20             10%
7–13              5%
n = 20 students
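The percentage column can be checked with a few lines of Python. This sketch assumes the frequencies from table 8.3. The variable names are illustrative, and the cumulative percentages are deliberately left out so that they remain an exercise.

```python
# Class intervals and frequencies from table 8.3.
frequencies = {"35-41": 5, "28-34": 4, "21-27": 8, "14-20": 2, "7-13": 1}
n = sum(frequencies.values())  # total number of cases (20)

# percentage = frequency / n x 100
# (multiplying by 100 first keeps the arithmetic exact for these values)
percentages = {interval: f * 100 / n for interval, f in frequencies.items()}

for interval, pct in percentages.items():
    print(f"{interval:<7}{pct:.0f}%")

# The same frequency of 8 in a hypothetical sample of 200 students
# would be a much smaller share of the sample.
share_of_200 = 8 * 100 / 200
print(f"{share_of_200:.0f}%")
```

The last line illustrates why percentages are useful: the same frequency of 8 represents 40% of a sample of 20 but only 4% of a sample of 200.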

Activity 8.3

Calculate the cumulative percentages for aggression scores and complete table 8.4
(last column). What percentage of students had a score lower than 35?

8.4.2.3. Graphic presentation of frequency distributions

In the previous section, we showed how tables can be used to represent frequency
distributions. The same data can also be presented graphically. An example is the pie
chart in figure 8.1 — that is one way of representing categorical data. An important
advantage of graphs is that they make it easier to obtain an overall impression of the
data: a graph gives you a “picture” of a set of scores. This section deals with bar charts,
histograms, and polygons. These graphs consist of a horizontal line, called the X axis
or abscissa, and a vertical line or Y axis, called the ordinate. These two lines meet at
an angle of 90 degrees. The categories or score values appear on the X axis and the
number of scores (frequencies) appear on the Y axis.
Suppose the data that we collected are measured on a nominal level of measurement;
in other words, our measurements are in the form of categories (e.g., gender
measured as male or female; marital status measured as never married, married or
divorced, etc.). In that case, we can use a bar chart 6 to visualise the frequency distribution of the
data. Points on the X axis represent the categories. For each category, a bar is drawn
and the height of this bar (measured on the Y axis) indicates the frequency or number
of cases that fall within that category. Because the categories represent separate
classes, the bars in a bar chart are drawn in such a way that they do not touch each
other. Figure 8.2 is an example of a bar chart. This figure represents the distribution
of gender in the study of aggression in secondary school students and is based on the
same data as table 8.2.

6
Bar chart refers to a graph representing the frequency distribution of categorical data.

FIGURE 8.2
Bar chart for gender (n = 20 students)

Histograms are used to illustrate the frequency distribution of numerical data (data
measured on an interval or ratio level of measurement). A bar chart reflects discrete
data (e.g., data that can be counted, such as the number of students in a class or the total
number of staff members), whereas a histogram 7 is used for continuous data
(e.g., data that can be measured, such as temperature, weight or mass). The scores
or the midpoints of the class intervals are marked on the X axis and above each of these
a bar is drawn. The height of the bar, as measured on the Y axis, corresponds with
the frequency or the number of cases for that particular score or in that particular class
interval. The bars represent successive scores or class intervals and there are no
spaces between the bars. If we add up the frequencies represented by all the bars,
this will give us the total number of cases in our sample. The data in table 8.3 (class
intervals for aggression scores) have been visually presented in figure 8.3. This
histogram makes the differences and similarities between the various class intervals
apparent. For example, we can again see that only a small number of students
obtained a low score on the aggression questionnaire.

7
Histogram refers to a graph representing the frequency distribution of successive scores or class intervals.

FIGURE 8.3
Histogram for aggression scores (n = 20 students)

Rather than using bars to represent the frequencies, a mark which corresponds to the
score or the midpoint of each class interval can also be used. These marks (or
frequencies) are joined with straight lines, to draw a frequency polygon 8 that is
anchored on the X axis on both sides. In a histogram, we assume that all cases within
a class interval are uniformly distributed over the range of the interval, while in a
polygon we assume that the cases are concentrated at the midpoint of the interval.
Compare the polygon in figure 8.4 to the histogram in figure 8.3 and make sure that
you understand where the points in the polygon come from. A polygon can
accommodate more class intervals than a histogram. Smoothed polygons (the
midpoints are linked by curved lines) are frequently used to display the distribution of
scores for large data sets or populations.
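To see where the points of a frequency polygon come from, the following Python sketch derives them from the class intervals in table 8.3. The anchoring rule used here (a zero-frequency point one interval width below the first midpoint and one above the last) is a common convention that we assume for illustration; the text only states that the polygon is anchored on the X axis on both sides.

```python
# Class intervals (lower, upper) and frequencies from table 8.3, lowest first.
classes = [((7, 13), 1), ((14, 20), 2), ((21, 27), 8), ((28, 34), 4), ((35, 41), 5)]
WIDTH = 7  # distance between successive midpoints

# One point per class interval: (midpoint, frequency).
points = [((lower + upper) / 2, f) for (lower, upper), f in classes]

# Anchor the polygon on the X axis (assumed convention): add a zero-frequency
# point one interval below the first midpoint and one interval above the last.
first_mid, last_mid = points[0][0], points[-1][0]
polygon = [(first_mid - WIDTH, 0)] + points + [(last_mid + WIDTH, 0)]

print(polygon)
```

Joining these points in order with straight lines gives the polygon. The midpoints 10, 17, 24, 31 and 38 are the same X positions at which the bars of the corresponding histogram are centred.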

8.4.2.4. Skewness and kurtosis

The distributions of data differ in terms of central location (the middle point of the
distribution) and variation (the spread of the scores around the middle point). These
properties will be explained in sections 8.4.3 and 8.4.4.

8
Frequency polygons refers to a graph in which the frequencies of class intervals are connected by straight
lines.

Distributions also differ in skewness, that is, the symmetry or asymmetry of the
distribution. A distribution can be symmetrical — that is, it can have the same shape
on both sides of the middle point (i.e., the left and right sides are mirror images of each
other). If a distribution is asymmetrical and the larger frequencies are concentrated
towards the low end, it is said to be positively skewed (i.e., long tail towards the right
side). If the larger frequencies are concentrated toward the high end of the variable,
the distribution is negatively skewed (i.e., long tail towards the left side). Skewness is
illustrated in figure 8.5. Note that smooth curves are used. Whenever we deal with
large populations, we prefer to represent our frequency distributions as smooth curves.
We have already referred to this when we talked about smoothed frequency polygons.

FIGURE 8.4
Frequency polygon for aggression score (n = 20 students)

FIGURE 8.5
Three frequency distributions differing in skewness
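The direction of skewness can also be checked numerically. The sketch below uses the moment coefficient of skewness, which is one common measure and is an assumption on our part, since this unit describes skewness only graphically. The function names and the three score lists are hypothetical.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2 ** 1.5 (population moments).

    Assumes the values are not all identical, otherwise m2 would be zero.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


def describe_skew(values, tolerance=1e-9):
    """Translate the sign of the coefficient into the terms used in the text."""
    g1 = skewness(values)
    if g1 > tolerance:
        return "positively skewed"
    if g1 < -tolerance:
        return "negatively skewed"
    return "symmetrical"


# Hypothetical score lists chosen to show each shape.
low_heavy = [1, 1, 1, 2, 2, 3, 9]         # most scores low, long tail to the right
high_heavy = [1, 7, 8, 8, 9, 9, 9]        # most scores high, long tail to the left
balanced = [1, 2, 2, 3, 3, 3, 4, 4, 5]    # mirror image around the middle

for data in (low_heavy, high_heavy, balanced):
    print(data, describe_skew(data))
```

A positive coefficient corresponds to a long tail towards the right, and a negative coefficient to a long tail towards the left, matching the descriptions of positive and negative skewness.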

The kurtosis of distributions refers to the flatness or peakedness of the distribution. A
symmetrical bell-shaped distribution is known as a normal distribution. In terms of
kurtosis, this distribution is mesokurtic. A more peaked distribution is called leptokurtic,
while a flatter distribution is platykurtic. Kurtosis is illustrated in figure 8.6.

FIGURE 8.6
Three frequency distributions differing in kurtosis
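Kurtosis can be given a rough numerical form as well. The sketch below uses moment-based excess kurtosis, for which a normal (mesokurtic) distribution scores approximately zero. This measure is our assumption for illustration, since the unit introduces kurtosis only through the shapes in figure 8.6, and both data sets are hypothetical.

```python
def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2 ** 2 - 3 (population moments).

    Assumes the values are not all identical, otherwise m2 would be zero.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / m2 ** 2 - 3


# Hypothetical data: a flat spread of scores and a sharply peaked one.
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9]
peaked = [1, 5, 5, 5, 5, 5, 5, 5, 9]

flat_kurtosis = excess_kurtosis(flat)      # negative: flatter shape (platykurtic)
peaked_kurtosis = excess_kurtosis(peaked)  # positive: more peaked shape (leptokurtic)

print(flat_kurtosis, peaked_kurtosis)
```

With small samples such as these, the value is only a rough indication of shape, so it is best read alongside a histogram or polygon of the same data.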

8.4.3. Measures of central tendency

We have seen that tables and graphs can be used to summarise data. It is also
possible to use single values to summarise the data obtained from a sample and to
describe the characteristics of the frequency distribution. Researchers often want to
know which score or value is central to a distribution and which can, therefore, be used
to summarise the entire distribution. A score or value which represents all the scores
in the sample is called a measure of central tendency. We will discuss three measures
of central tendency, namely the mode, the median and the mean.


If there are relatively few scores, it is easy to determine the mode, without using tables
or graphs. In the case of a large sample of scores, it might be easier to arrange the
scores in ascending or descending order or to work with frequency distributions. The
mode⁹ is the score value with the highest frequency. For example, in the list, 23 26
28 37 37 37 45 48 49, the score that occurs with the highest frequency (three times)
is 37 and this is regarded as the mode. None of the other scores in this list occurs
more than once.
If two or more successive scores in a sample all have the highest frequency, the
average (this term will be explained later on in this section) of those scores is taken
as the mode of the distribution. However, if two values that do not follow on each other
both have the highest frequency, the sample has two modes. Such a distribution is
called bimodal (compared to a unimodal distribution with a single mode). If, for
example, the list that we gave you was, 23 26 28 37 37 37 45 48 49 49 49, there would
have been two score values that occurred three times and the distribution would be
bimodal.
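
The two lists above can be checked with a short Python sketch. It assumes Python 3.8 or later and uses `statistics.multimode` from the standard library, which returns every value that shares the highest frequency. The variable names are our own. Note that the function only counts frequencies; it does not apply the averaging rule for successive scores described above.

```python
from statistics import multimode

# Score lists taken from the text above.
unimodal_scores = [23, 26, 28, 37, 37, 37, 45, 48, 49]
bimodal_scores = [23, 26, 28, 37, 37, 37, 45, 48, 49, 49, 49]

# multimode returns a list of all values with the highest frequency.
unimodal_modes = multimode(unimodal_scores)
bimodal_modes = multimode(bimodal_scores)

print(unimodal_modes)  # one mode: 37
print(bimodal_modes)   # two modes: 37 and 49
```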

If a distribution has two or more modes, these modes do not give a good indication of
the central tendency of the sample as a whole. In the case of a grouped frequency
distribution, the mode is equal to the midpoint of the class interval with the highest
frequency. A graphical representation of the distribution makes it easy to identify the
mode, since the class interval with the highest frequency will stand out above the
others. Take a look at the bar chart in figure 8.2. The mode in this example is the
category “female” and we therefore concluded that this was the largest category.

The mode is the only measure of central tendency that can meaningfully be used for
nominal data. If we are dealing with categories (e.g., different types of illnesses), it
does not make sense to order the types of illnesses and neither do the illnesses have
numerical values. Only the frequency of occurrence of each category is taken into
account when calculating the mode.

⁹ Mode refers to a score in a sample of scores that occurs with the greatest frequency.


To work out the median of a sample of scores, we first have to arrange the scores in
ascending or descending order. The median¹⁰ is the value which falls right in the
middle of the list; in other words, half the scores in the sample fall below the median
and the other half above it. It is therefore the midmost score, that is, the score below
which 50% of all the scores fall. If the number of scores is an odd number, the median
is simply the score in the middle of the list. When the number of scores is an even
number, the middle of the list falls between two values and the median is the average
of these two scores. If several scores with the same numerical value occur near the
median (called tied scores), you will still use the position of the scores, after they have
been ordered, to determine the median. For example, in the list, 23 26 28 37 37 37 45
48 49, the score corresponding to the middle rank is 37 and this is regarded as the
median. In the case of a large sample, where the scores have been represented in a
frequency distribution, the median is calculated by means of a formula for grouped
data. This formula is also recommended in some cases where tied scores occur in a
list of scores, but these calculations do not form part of this module.
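
As a minimal sketch of the procedure just described, the Python code below orders the scores and picks the middle one. The first list is the one from the text (deliberately shuffled to show why ordering matters); the second, even-sized list is a hypothetical example of our own. `statistics.median` is the standard-library function for ungrouped data; the grouped-data formula mentioned above is not shown.

```python
from statistics import median

# The list from the text, deliberately out of order.
scores = [49, 37, 23, 45, 37, 28, 48, 26, 37]

# Step 1: arrange the scores in ascending order.
ordered = sorted(scores)

# Step 2: with an odd number of scores, the median is the middle score.
middle_index = len(ordered) // 2
median_by_position = ordered[middle_index]

# Hypothetical even-sized list: the median is the average of the
# two middle scores, here (11 + 17) / 2.
even_scores = [9, 11, 17, 20]
median_even = median(even_scores)

print(median_by_position)  # 37, the same value median(scores) returns
print(median_even)         # 14.0
```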

Both the mode and the median can be used with ordinal data, but the median is
preferred, because it takes into account the frequencies and the rank order of scores.
Suppose that the suburbs in the town or city where you live are ranked according to
density (the number of people living there). Low density is allocated the rank of 1,
average density 2, and high density 3. If ten suburbs are ranked 1, eight are ranked 2
and nine are ranked 3, then the mode (rank 1) indicates the category with the highest
frequency. However, we cannot necessarily conclude that most of the suburbs were
low in density. If the set of ranks for all the suburbs are arranged in ascending order
(first all the ones, then all the twos, etc.), the middle value in this set (the median)
would be 2. At least half of the sample or 50% of the suburbs are therefore average
or high density (the scores in our list that fall above the median).
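
The suburb example can be reproduced with a few lines of Python. This is only an illustrative sketch: the list `ranks` is built from the counts given above (ten suburbs ranked 1, eight ranked 2 and nine ranked 3), and the variable names are our own.

```python
from statistics import median, mode

# Density ranks for the 27 suburbs: 1 = low, 2 = average, 3 = high.
ranks = [1] * 10 + [2] * 8 + [3] * 9

rank_mode = mode(ranks)      # the rank with the highest frequency
rank_median = median(ranks)  # the middle (14th) value of the ordered ranks

# Number of suburbs ranked at or above the median rank.
average_or_high = sum(1 for rank in ranks if rank >= rank_median)

print(rank_mode)                    # 1
print(rank_median)                  # 2
print(average_or_high, len(ranks))  # 17 of the 27 suburbs
```

Although low density is the most frequent single category, 17 of the 27 suburbs are of average or high density, which is what the median brings out.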

The mean¹¹ of a sample of scores is the arithmetic midpoint of the scores and
represents all the scores in the sample. To calculate the mean, we add up all the

¹⁰ Median is a value or score such that half the observations fall above it and half below it.

¹¹ Mean refers to a sum of a sample of scores divided by the number of scores in the sample.


scores and divide the sum by the total number of scores in the sample. We use the symbol x
to refer to the raw scores in the distribution of the variable x. As we already know, the
symbol n stands for the number of scores in the distribution.

The n measurements in a sample of scores are thus represented by the symbols, x1,
x2, x3, ..., xn. The formula for the mean is,

$$\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$$

and this can also be written as:

$$\bar{x} = \frac{\sum x}{n}$$

In this formula x̄ (pronounced x-bar) is the mean, ∑ means summate (or add up), x is
each raw score, and n is the sample size (the number of people in the sample).
Everything above the line (i.e., the sum of all scores) should be divided by everything
below the line (i.e., the number of scores in the sample). You are not expected to
memorise the formula, but being able to calculate the various statistics gives you a
better understanding of these statistics.

It is also possible to calculate the mean by using a frequency distribution. This might
be necessary if we are working with a large sample of scores. Each value of the
variable x is multiplied by the number of times it occurred (the frequency) and these
products are added together and divided by the total number of measurements. In the
case of a grouped frequency distribution, the midpoint of each interval may be used to
represent all values falling within the interval.
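
The frequency-based calculation can be sketched in Python as follows. The raw scores are the list used earlier in this section; the class intervals (20–29, 30–39, 40–49) are our own hypothetical grouping of those scores, chosen only for illustration.

```python
from collections import Counter
from statistics import mean

scores = [23, 26, 28, 37, 37, 37, 45, 48, 49]

# Ungrouped frequency distribution: each value mapped to its frequency.
frequencies = Counter(scores)
n = sum(frequencies.values())

# Multiply each value by its frequency, add the products, divide by n.
mean_from_frequencies = sum(
    value * count for value, count in frequencies.items()
) / n

# Hypothetical grouped distribution: (lower limit, upper limit) -> frequency.
grouped = {(20, 29): 3, (30, 39): 3, (40, 49): 3}

# The midpoint of each interval stands in for every score in that interval.
mean_from_midpoints = sum(
    ((low + high) / 2) * count for (low, high), count in grouped.items()
) / n

print(round(mean_from_frequencies, 2))  # same as mean(scores)
print(mean_from_midpoints)              # an approximation of the mean
```

The mean based on interval midpoints only approximates the mean of the raw scores, because the individual values within each interval are no longer used.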

All three measures of central tendency can be used in the case of interval and ratio
data, but the mean is usually chosen. When calculating the median, the particular
values of the variable are not taken into account, but only the occurrence of the values
above or below the middle value. Two studies on stress in executives were conducted
in different organisations and the following scores were obtained (the maximum
possible score on the stress questionnaire is 60). Study 1: 9 11 17 20 23 25 28; Study


2: 11 14 18 20 48 52 54. These samples of scores have the same median (20), but the
median does not reveal that some of the executives in the second organisation have high
levels of stress that could influence their ability to do their job. The mean for these two
samples of scores would be 19 and 31, respectively, indicating that, in the case of the
second organisation, it might be necessary to pay more attention to the stress levels
of executives.
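
The comparison of the two studies can be verified with a short Python sketch using the standard-library `statistics` module. The variable names are our own.

```python
from statistics import mean, median

# Stress scores from the two studies described above.
study_1 = [9, 11, 17, 20, 23, 25, 28]
study_2 = [11, 14, 18, 20, 48, 52, 54]

median_1, median_2 = median(study_1), median(study_2)
mean_1, mean_2 = mean(study_1), mean(study_2)

print(median_1, median_2)  # both medians are 20
print(mean_1, mean_2)      # the means are 19 and 31
```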

Because all the values of the variable are used to calculate the mean, this is a more
appropriate measure of central tendency for interval and ratio data. The mean can be
used in mathematical calculations, whereas the mode and the median cannot. The
mean is also a more accurate and stable estimate of the population mean than the
other measures of central tendency. However, if there are one or two scores that differ
a great deal from the rest of the scores, this will influence the mean and the median is
then preferred. Remember, we called such a distribution skewed (refer to figure 8.5).
The mean, mode, and median of a symmetrical, unimodal frequency distribution will coincide.

Activity 8.4

When doing research, you will often have to decide what is an appropriate method to
present a particular data set. Which measure of central tendency do you think will be
best at showing household income in South Africa, and what is the rationale for your
answer?

In answering this question, you need to think about what the measures of central
tendency symbolise in practice. We have said that the mean is an appropriate
measure of central tendency for interval and ratio data; you might think that the mean
will best reflect household income. However, in most countries (including South Africa)
household income is positively skewed, meaning that far more people earn less money
than the mean, rather than the other way round (i.e., earn more than the mean).
Because there are a few extremely rich people, their income makes the mean higher
than most people’s income. The median is therefore a better indication of average
income in a country.
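
The effect of a few very high incomes can be illustrated with a small Python sketch. The income figures below are entirely hypothetical and were chosen only to produce a positively skewed distribution; they are not South African data.

```python
from statistics import mean, median

# Hypothetical monthly household incomes (illustrative values only).
incomes = [3000, 4000, 4500, 5000, 6000, 7000, 150000]

mean_income = mean(incomes)
median_income = median(incomes)

# Count the households that earn less than the mean.
below_mean = sum(1 for income in incomes if income < mean_income)

print(round(mean_income))  # pulled upwards by the single very high income
print(median_income)       # 5000, closer to what a typical household earns
print(below_mean)          # 6 of the 7 households earn less than the mean
```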


8.4.4. Measures of variability

You now know that a sample of scores can be summarised and described by using
central values, such as the mode, median and mean. These values represent all the
scores in a sample. However, these central values do not indicate the extent to which
the scores in the sample differ from each other and how far they deviate from the
central value. The degree to which scores in a sample differ, that is, how spread out
they are, is called the variability of the scores¹².

The simplest measure of variability is the range. In any sample of scores, the range is
taken as the difference between the highest and lowest scores. The range is a
measure of variability of scores in a sample, because it indicates the range of the
distribution of scores from the lowest to the highest. In the example we used on stress
in executives, the range for Study 1 is 19 (28 [the highest score] minus 9 [the lowest
score]), and for Study 2 it is 43 (54 minus 11). The scores in the second study clearly
exhibit greater variability (they are more scattered) than those in the first study. This
set of scores, consequently, has a much greater range and, again, this is a warning
that some of the executives in the second company are experiencing much more
stress than others (do you remember that this organisation also had the higher mean
stress score?).
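
A minimal Python sketch of the range calculation for the two studies is given below. The helper function `score_range` is our own name, not a library function.

```python
# Stress scores from the two studies in section 8.4.3.
study_1 = [9, 11, 17, 20, 23, 25, 28]
study_2 = [11, 14, 18, 20, 48, 52, 54]


def score_range(scores):
    """Return the difference between the highest and the lowest score."""
    return max(scores) - min(scores)


range_1 = score_range(study_1)  # 28 - 9
range_2 = score_range(study_2)  # 54 - 11

print(range_1)  # 19
print(range_2)  # 43
```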

A disadvantage of the range of the distribution as a measure of variability is that it is
calculated by using only two of the scores in the sample of scores; the other scores
are ignored. Another approach we can use is to determine the degree to which each
score differs or deviates from the mean of the sample (which is based on all the scores
in the sample) and use this as an index of the variability of the scores in the sample.
The measures of variability most often used, namely variance and standard deviation,
are based on this approach, that is, where the average difference between each score
and the mean is used to express the variability of a sample of scores.

The mean is the index of central tendency that best represents all the scores in the
sample. If we determine the extent to which each score in the sample differs from the

¹² Variability of the scores refers to the extent to which scores differ from each other or how spread out a group of
scores is in a frequency distribution.


mean, then we have an indication of the extent to which all the scores differ from each
other (the variability). We could therefore determine variability by subtracting the mean
of the sample from each raw score in the sample. We call this difference a deviation
score (represented by x – x̄ for each value of x). This score indicates the extent to
which each raw score deviates from the mean. To determine variability, we could add
up the deviation scores, but some of the deviations about the mean are positive and
some are negative — this means that the sum of deviations is therefore zero. The sum
of the deviation scores below the mean cancels out the sum of the deviation scores
above the mean. One method for getting rid of the negative values obtained from the
deviations below the mean is to square the deviations from the mean before we add
them up. This means multiplying each deviation by itself. The variance¹³ of a sample
of scores is calculated by dividing the sum of the squared deviation scores by the
number of scores to obtain an average of the squared deviation scores. The formula
for this statistic is,

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$

In this formula, s² is the variance, ∑ means to sum, x is each raw score, x̄ is the mean,
and n is the sample size. Note that we divide the sum of the squared deviations by
n–1, instead of n, in order to obtain the “mean”. If we are working with a sample of scores,
the sample variance is an estimator of the population variance; a more accurate
estimate is obtained when n–1 is used as divider. The explanation for this forms part
of inferential statistics, but this is not covered in this module.
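
As a worked illustration (our own calculation, not part of the prescribed material), the formula can be applied to the Study 1 stress scores from section 8.4.3 (9, 11, 17, 20, 23, 25, 28), which have a mean of 19 and n = 7:

$$\sum (x - \bar{x})^2 = (-10)^2 + (-8)^2 + (-2)^2 + 1^2 + 4^2 + 6^2 + 9^2 = 302$$

$$s^2 = \frac{302}{7 - 1} \approx 50{,}33$$

Note that the deviation scores themselves (–10, –8, –2, 1, 4, 6 and 9) add up to zero, which is why they are squared before they are summed.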

The variance is a statistic in squared units. However, we would like to interpret the
meaning of the variability of a set of scores in terms of the original units of
measurement. We therefore calculate the square root of the variance and this is known
as the standard deviation¹⁴ of a sample of scores:

¹³ Variance is a measure of variability based on the deviation of each score in a distribution from the mean of
that distribution.

¹⁴ Standard deviation is an index of variability that is expressed in the same units as the original measures.


$$s = \sqrt{s^2}$$

Both the variance and the standard deviation of a sample of scores indicate the
average extent to which scores in a distribution differ from one another. Because the
standard deviation is expressed in the same units as the original measure, researchers
prefer to use this statistic.
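
The steps for calculating the variance and the standard deviation can be sketched in Python as follows. The code uses the stress scores of the two studies from section 8.4.3; the function `describe` and the variable names are our own. The results can be compared with `statistics.variance` and `statistics.stdev`, which also divide by n–1.

```python
from math import sqrt
from statistics import mean, stdev, variance

study_1 = [9, 11, 17, 20, 23, 25, 28]
study_2 = [11, 14, 18, 20, 48, 52, 54]


def describe(scores):
    """Return (sum of deviations, sample variance, standard deviation)."""
    x_bar = mean(scores)
    # Deviation score: each raw score minus the mean.
    deviations = [x - x_bar for x in scores]
    # Square the deviations so that negative values do not cancel out.
    sum_of_squares = sum(d ** 2 for d in deviations)
    # Divide by n - 1 to obtain the sample variance.
    s_squared = sum_of_squares / (len(scores) - 1)
    # The standard deviation is the square root of the variance.
    return sum(deviations), s_squared, sqrt(s_squared)


deviation_sum_1, variance_1, sd_1 = describe(study_1)
deviation_sum_2, variance_2, sd_2 = describe(study_2)

print(deviation_sum_1, deviation_sum_2)      # both sums are zero
print(round(variance_1, 2), round(sd_1, 2))  # about 50.33 and 7.09
print(round(variance_2, 2), round(sd_2, 2))  # 373.0 and about 19.31
```

The standard deviation of Study 2 is clearly larger than that of Study 1, which agrees with the comparison of the ranges of the two studies.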

Activity 8.5

Suppose you are doing a study on social support for prisoners and their families. One
of the variables you are interested in is the number of years people spend in prison
(the term of imprisonment). On average, prisoners spend four years in prison and the
variation in the number of years spent in prison is indicated by the standard deviation
of three years. Explain why standard deviation is a better index of variability than
variance in this study.

8.4.5. Relationships

Until now, we have been discussing a single variable. In previous study units, however,
research studies have referred to two or more variables and the relationship between
these variables. We will briefly consider the direction and strength of the relationship
between two variables. If there is a relationship between two variables, it means that
a person’s position on one variable is related to his or her position on the other
variable.

A direct or positive relationship means that relatively high scores on one variable are
associated with relatively high scores on the other and relatively low scores on the first
correspond with relatively low scores on the second. An inverse or negative
relationship means that high scores on one variable correspond with low scores on
the other variable. If the variables are not related, changes on the one variable do not
correspond with changes on the other.

We refer to the statistical relationship between two variables as a correlation and the
statistic used to describe this is called a correlation coefficient. It can range in value


from –1,00 to +1,00. These values represent a perfect negative (–1) or a perfect
positive correlation (+1). A value close to 0 indicates a weak relationship, while 0
means there is no relationship. We can see that the numerical size of a correlation
coefficient indicates the strength of the relationship, while the sign (positive/negative)
indicates the direction of the relationship. A positive correlation means that an increase
in one variable is associated with an increase in the other. A negative correlation
between two variables means that as the value of one variable increases, the value of
the other one decreases. Please note that the correlation between two variables does
not necessarily mean that one variable causes the other.

Essentially, a correlation¹⁵ in research is a concept that derives from a correlational
research design, which is a research method used to examine the relationship
between two or more variables in a study. Testing for a correlation can include two or
more variables, depending on the objectives of the study. The primary goal of testing
for a correlation is to establish the association and extent of the relationship between
the variables being studied. When testing for a correlation, we are only interested in
describing the nature of the relationship between variables and make no attempt to
explain the observed relationship nor the underlying mechanisms responsible for the
observed relationship.

In correlational research studies, the researcher measures two or more variables for
everyone in the sample. The set of scores from the variables that the researcher is
interested in is correlated to establish if there is a relationship between the variables.
For example, a researcher may be interested in examining the relationship between
variable X (happiness) and Y (academic performance) in a sample of six students.
This means that the six students would have to be measured on variables X
(happiness) and Y (academic performance). The scores obtained from each student
participant are used to determine whether a relationship exists between happiness
and academic performance. Table 8.5 shows hypothetical scores obtained by a
researcher who was interested in determining whether there is a relationship between
happiness and academic performance among university students.

¹⁵ Correlation is the examination of the relationship between two or more variables.


Table 8.5
Happiness and academic performance scores

Student participant         A    B    C    D    E    F
Happiness (X)               3    6    8   13   16   20
Academic performance (Y)    4    6   10   14   18   24

The observation made from the results, presented in table 8.5, is that low X scores are
associated with low Y scores and high X scores are associated with high Y scores. In
practice, this means that as happiness increases, the academic performance also
increases and, as happiness decreases, academic performance also decreases. This
indicates a positive relationship between happiness and academic performance.
Conversely, a negative relationship would have been observed if high scores in
happiness were associated with low scores on academic performance or high scores
on academic performance associated with low scores in happiness.
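
To show how the strength and direction of this relationship could be expressed as a number, the sketch below calculates a correlation coefficient for the scores in table 8.5. It assumes Pearson's product-moment coefficient, which is one commonly used correlation coefficient; this learning unit does not prescribe a particular formula, so the choice and the function name `pearson_r` are our own.

```python
from math import sqrt

# Scores of student participants A to F from table 8.5.
happiness = [3, 6, 8, 13, 16, 20]
performance = [4, 6, 10, 14, 18, 24]


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of the products of the paired deviation scores.
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Sums of the squared deviation scores for each variable.
    sum_xx = sum((x - x_bar) ** 2 for x in xs)
    sum_yy = sum((y - y_bar) ** 2 for y in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)


r = pearson_r(happiness, performance)
print(round(r, 2))  # positive and close to +1
```

The positive sign reflects the direction of the relationship, and a value this close to +1 reflects its strength in these hypothetical scores.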

In research, we may use graphical representations to present our results. One such
graphical representation is a scatter plot, which can be used to model the relationship
between variables. A scatter plot¹⁶ allows for individual scores to be represented
graphically and to demonstrate how variables relate to one another. In other words,
the scatter plot accommodates the X and Y scores from each individual participant
and maps them out, to identify a relationship between two variables. As an example,
we use the scores from table 8.5 to draw a scatter plot depicting the relationship
between happiness and academic performance in figure 8.7.

¹⁶ Scatter plot: a graphical representation modelling the relationship between two variables.


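[Scatter plot: Happiness (X) on the horizontal axis (0 to 25) and Academic performance (Y) on the vertical axis (0 to 30), with points A to F rising from the bottom left to the top right]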

Figure 8.7: Scatter plot demonstrating the relationship between happiness and
academic performance

The scatter plot demonstrates the X and Y scores from student participants, A, B, C,
D, E, and F. The scatter plot also indicates a positive relationship between happiness
and academic performance, as shown in the dotted line, with a slope from the bottom
left to the top right of the graph. This means that an increase in happiness is associated
with an increase in academic performance. If, however, there was an increase in
academic performance scores and a decrease in happiness scores, a negative
relationship would be observed. Therefore, the dotted line would have a slope crossing
from the top left to the bottom right of the graph. In the social sciences, researchers
seldom find a strong relationship between variables that nearly resembles a straight
line, such as the one depicted in figure 8.7. A relationship between variables,
expressed through a straight line, is called a linear relationship¹⁷. A straight line
slope, crossing from the top left to the bottom right, indicates a negative linear
relationship, whereas a straight line slope, crossing from the bottom left to the top right,
indicates a positive linear relationship. The graphical representations of linear
relationships are illustrated in figure 8.9.

¹⁷ Linear relationship is a relationship (positive or negative) between two variables modelled by a straight line.


[Two scatter plots, labelled (a) and (b)]
Figure 8.9: Scatter plot (a) depicts a negative linear relationship with a
correlation coefficient of (-1) and scatter plot (b) depicts a positive linear
relationship with a correlation coefficient of (+1)
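
The two extreme cases can be illustrated with a short Python sketch. The data below are hypothetical points that lie exactly on a falling and on a rising straight line; they are not the values plotted in the figure. As before, Pearson's coefficient is assumed and `pearson_r` is our own helper function.

```python
from math import isclose, sqrt


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_xx = sum((x - x_bar) ** 2 for x in xs)
    sum_yy = sum((y - y_bar) ** 2 for y in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical X scores and two sets of Y scores on exact straight lines.
x_scores = [0, 5, 10, 15, 20, 25, 30]
y_falling = [30 - x for x in x_scores]     # slopes from top left to bottom right
y_rising = [0.5 * x + 5 for x in x_scores]  # slopes from bottom left to top right

r_negative = pearson_r(x_scores, y_falling)
r_positive = pearson_r(x_scores, y_rising)

# isclose allows for tiny rounding differences in floating-point arithmetic.
print(isclose(r_negative, -1.0))  # True: perfect negative linear relationship
print(isclose(r_positive, 1.0))   # True: perfect positive linear relationship
```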

Activity 8.6

Mr Stan, a high school principal, gave a report to the members of the School Governing
Body (SGB) that there was a sharp increase in the number of bullying cases reported
to his office in the previous year. As a result, learners’ marks for the previous year also
dropped significantly. As a researcher, how would you characterise the relationship
between the bullying cases reported and the drop in learners’ marks?

8.5. QUALITATIVE DATA ANALYSIS

In qualitative research, context always matters. The meaning of qualitative data
analysis should therefore be understood in the broader context of qualitative research;
hence, we deemed it essential to begin this section by recapping on the meaning of
qualitative research. One of the classical definitions of qualitative research is the one
provided by Strauss and Corbin (1998:10-11), who consider qualitative research to be
a research method seeking to produce findings reached by means other than
quantification or statistical procedures. As described by Katz-Buonincontro (2022),
qualitative research is a humanistic, holistic and interactive approach. Qualitative
researchers study participants in their natural settings, interpret events based on the
meanings that they ascribe to such events and adopt an open-minded approach to
data interpretation (Katz-Buonincontro, 2022; Yegidis, Weinbach & Myers, 2018).
They directly visit the natural environment in which participants experience


the phenomena of interest and study them as completely as they can, to develop a
comprehensive understanding thereof (Babbie, 2016). The following are some of the
key features of qualitative research, as identified by Anastas (in Yegidis et al., 2018):

• The methods and procedures used in qualitative research are flexible and
responsive to the research findings as they emerge.
• Qualitative researchers collect relatively unstructured data to describe the
phenomena under investigation, based on the words or conduct of the
participants.
• The interest of qualitative researchers is not narrowly limited to the phenomena
of interest. They also pay attention to the natural contexts in which such
phenomena occur, as well as their decisions, as the study progresses.
• The study is broad enough to include an analysis of the subjective experiences
of both the researchers and the participants.
• Qualitative researchers often recommend the location of a qualitative study
within a particular epistemological tradition, namely post-positivism,
pragmatism, phenomenology, interpretivism or constructivism, and critical,
normative science.

Additional features of qualitative research, as provided by Katz-Buonincontro (2022),
are as follows:
• Instead of observing people in a controlled environment or experimental
laboratories, qualitative researchers are interested in studying people in their
natural settings.
• Based on the above feature, qualitative research is therefore a humanistic,
holistic and interactive research approach.
• Researchers, who adopt qualitative research, use inductive logic, which
enables them to focus on emergent and iterative collection of data.
• Inductive reasoning, adopted by qualitative researchers, involves allowing the
research questions, themes and codes to bubble up to the surface, as they
occur, as opposed to hypotheses that narrow down the expected causal
relationships among the study variables.


• Qualitative researchers learn the skill of balancing subjectivity with objectivity
during data collection and interpretation, since objectivity and subjectivity are
essential mutual reinforcers seeking to strengthen the quality of the study.

Now that we know what qualitative research is all about, we are shifting our focus to
the main business of this section of our study unit, which is qualitative data analysis.
Before we get deeper into the focus of this section, it is essential to set a tone by
borrowing from Flick’s (2014) emphasis on the significance of data analysis:

Data analysis is the central step in qualitative research. Whatever the
data are, it is their analysis that, in a decisive way, forms the outcomes
of the research.

Flick’s emphasis is that, without data analysis, researchers will never realise the
study’s desired outcomes. This emphasis is particularly crucial in view of Lloyd’s
(2014) argument that data analysis does not only happen after the data collection, as
many would like to believe. It is rather a process which is conducted in two phases:
firstly, during the early stages of preliminary literature review, researchers analyse and
evaluate literature with the purpose of understanding the field, their specific research
topics and identifying any existing gaps in literature. Secondly, researchers conduct
data analysis of the data that has been collected for their research project (Lloyd,
2021).

This then takes us to the next essential point regarding data analysis. It is essential to
remind you that qualitative research studies are conducted for various purposes,
including to answer a particular research question through the data, which is either
collected by reviewing existing literature or documents, observing people as they
engage in their normal activities, or interviewing people either individually or in a group
discussion, or even by analysing pictures and sketches. The material that has been
collected from such literature, documents, interviews, group discussions or pictures is
called the data and, once collected, it must be analysed to ascertain its meaning in the
context of the research purpose and/or questions. Given the various methods of
qualitative data collection, qualitative data can also take various forms. According to


Igatu (2009), qualitative data can be in the form of structured text (writings, stories,
survey comments, news articles, books etc.); unstructured text (transcription;
interviews; focus groups; conversations); or audio recordings, music and video
recordings (graphics, art, pictures, visuals).

8.5.1. The meaning of qualitative data analysis

Before we take you through the actual process of qualitative data analysis, it is
important to first help you to understand the meaning of qualitative data analysis.
There is no universally accepted definition of the term, qualitative data analysis;
different definitions of the term have been developed by researchers and authors.
Fram (2013), for instance, defines qualitative data analysis as a variety of practices
and procedures through which researchers move from the raw qualitative data that
have been collected, to some kind of explanation for easy understanding or
interpretation of the meanings and situations of people who are part of the
investigations. In another definition, Mezmir (2020) refers to qualitative data analysis
as a process through which researchers classify and interpret linguistic (or visual)
material, to make statements about implicit and explicit dimensions and structures,
through which meanings are created in the material and its representations. Moule
(2021) considers data analysis to be an act of processing, summarising
and interpreting raw data into meaningful information. By processing, Ibrahim (2015)
refers to the recasting and dealing with data in such a way that it is ready for analysis.
For Ibrahim (2015), data analysis entails closely related operations, performed with
the purpose of summarising the collected data and organising it in such a way that it
yields answers to the research questions. A more comprehensive definition of
qualitative data analysis is the one provided by Flick (2014), who defines it as follows:

Qualitative data analysis is the classification and interpretation of linguistic (or
visual) material, to make statements about implicit and explicit dimensions and
structures of meaning-making in the material, and what is represented in it.
Meaning-making can refer to subjective or social meanings. Qualitative data
analysis also is applied to discover and describe issues in the field or structures,
and processes in routines and practices. Often, qualitative data analysis combines
approaches of a rough analysis of the material (overviews, condensation,


summaries), with approaches of a detailed analysis (elaboration of categories,
hermeneutic interpretations or identified structures). The final aim is often to arrive
at generalisable statements by comparing various materials or various texts or
several cases.

If you pay closer attention to these definitions, you will notice that, despite having been
coined by different authors and researchers, they still share a common purpose, which
is to systematically reduce large volumes of data into smaller, manageable units for the
purpose of answering the research questions. Qualitative data analysis involves
collecting and patching together all pieces of data, to develop a broader understanding
of its meaning.

8.5.2. The purpose of qualitative data analysis

The overall purpose of qualitative data analysis is to develop structure and meaning
from the collected data (Lloyd, 2021). You will remember that the data, which has been
collected through interviews or group discussions, for instance, will be voluminous and
will not necessarily be structured. It will generally be difficult, if not impossible, to derive
any meaning from such kind of data. Therefore, data analysis will assist in structuring
it, so that it can ultimately have meaning. After all, collecting such material, without
interrogating and interpreting it in line with the study purpose or questions, would
render the entire process futile.

Three other aims of qualitative data analysis, as identified by Flick (2014), are (1) to
describe the phenomenon under investigation in greater detail; (2) to identify the
conditions on which the differences and commonalities between the cases (the
individuals or groups under investigation) can be derived; and (3) to develop theory
from the phenomenon under investigation by analysing empirical material. For Mezmir
(2020), qualitative data analysis serves three main aims, namely

(a) to describe the phenomenon in greater detail. This can, for instance, take the form
of explaining the subjective lived experiences of the participants.
(b) to explore the conditions on which existing differences are based, by looking
for explanations of the observed differences.
(c) to develop a theory of the phenomenon under investigation, from analysis of the
empirical material.

Suppose the researcher asked the question: “What is the purpose of using headsets
while studying?” To answer this research question, the researcher could invite
students to participate in interviews and ask them a set of questions, the answers to
which would ultimately answer the research question. Alternatively, the researcher
might visit libraries where students are studying and observe how they manage to
study with headsets on. The researcher may also watch videos or read through
literature on the subject. Whatever method the researcher chooses, they will
ultimately collect voluminous data, which should be analysed. Such data will consist
of different views expressed by the students or different notes from observations,
depending on the method used to collect the data, some of which may not make any
sense when considered in isolation. The researcher will have to identify patterns in
such voluminous data and connect them to develop meaning in the context of the
posed questions. Such an exercise is what data analysis seeks to achieve.

Data analysis is much like building a puzzle out of many pieces of different patterns
and colours. The volume of the data that will be analysed can be compared to assorted
pieces of a puzzle, which are pieced together to create the bigger picture that,
ultimately, makes sense. For the bigger picture to appear accurate, relevant pieces and
correct colours must be correctly positioned in their respective spaces. In such a way,
the reader can ultimately get an understanding of what the pieces are and how they
connect to create the bigger picture.

Let’s take a moment and investigate another example. Suppose a researcher
conducts a qualitative study, seeking to understand the participants’ experiences
regarding a giant animal which recently escaped from the Kruger National
Park in the middle of the night and is now roaming around in the communities. A
qualitative researcher might, in this instance, ask the participants the question: “How
would you describe your experiences during that night?”

Suppose participants share their experiences by describing various aspects, such as
the sound of such an animal, the footprints, the smell, the direction from which it was
coming, some parts of its body and so on. Some of your participants may only
describe their feelings, such as being scared and shocked. Clearly, not all their
descriptions will be useful in giving meaning to their experiences regarding this
animal. As part of analysis, the researcher should, ultimately, connect all those pieces
of description, so that they make sense to the reader. Some of the pieces may not
really be useful, depending on what you want to achieve (your research aims and
questions). Figure 8.10 below illustrates data analysis as a metaphoric puzzle, whereby
each of the pieces is connected to create an image that makes sense, that is,
a comprehensive understanding.

Figure 8.10: An Illustration of data analysis as a puzzle construction exercise

From this figure, you can see that each piece of the puzzle has a specific role to play
in each respective space, both in giving colour and shape to the bigger structure. Even
in data analysis, each piece of material has a role to play in giving meaning to the data,
in the context of the posed research question(s).

Activity 8.7

Take a moment and look at the following picture and try to answer the questions
that follow.

(Source: Rae, n.d.)

Looking at the four individuals, what do you think they are doing? What do you think
about the stickers placed on the wall? What could be written thereon? In your view,
what is the whole exercise about?

In your attempt to answer the above questions, you would have noticed that this
exercise required subjective interpretation of the picture in the context of the subject
of this study unit, which is data analysis. The first question required you to share
your thoughts regarding what the people in the picture are doing. Given the context
of the study, one might say they are engaged in manual data analysis. You can
assume this, because you see each one trying to sort the stickers or place them in
a certain order. It could be that they are clustering common data materials together.
The next question required you to share your thoughts on the contents of the stickers.
What do you think is on the stickers? It could be some data codes that were developed
from the raw data, which, as we explained earlier in our discussion, is more
comprehensive and voluminous. The stickers could have some labels or codes that
are used to identify certain data sets, with the eventual aim of creating meaning from
them. The entire exercise seems to illustrate a manual process for qualitative data
analysis.

Now that you can locate qualitative data analysis within the broader context of
qualitative research, define the concept of data analysis and explain its
purpose, it is essential to build further on this knowledge by practically analysing and
interpreting qualitative data. The purpose of this section is to equip you further
in this regard, through lessons on how to analyse qualitative data.

8.5.3. Analysing and interpreting qualitative data

In setting the tone for data analysis, it is essential to begin with the insights penned by
Lloyd (2021), as quoted below:

Good analysis requires strong inductive analytical skills and good deal of creativity
(making connections across the data) in order to identify patterns and weave these
together in a meaningful and insightful way. Reporting qualitative analysis is also
tricky, because of the concise and structured way reporting is conducted;
subsequently, researchers need to make decisions about which aspects to include
and what to leave aside.

The above extract captures the essence of qualitative analysis, interpretation and
reporting. From what Lloyd is saying, one gets a sense that researchers are confronted
with comprehensive data, from which some patterns should be created, through
linkages, to develop meaningful understanding. During this process, researchers are
confronted with a huge challenge, which requires a very strong analytical mind, with
the capacity to sort relevant data from that which is irrelevant, as they strive to create
meaningful knowledge. Although qualitative researchers often prefer to analyse
such voluminous data manually, technological advancements also play a crucial role in
qualitative data analysis, with software programmes such as NVivo and ATLAS.ti, among
others, in common use (Lloyd, 2021). For the purpose of this module, we will
focus on manual analysis. This, however, does not mean that technologically powered
analysis is discredited or less important. The main reason is that we believe that,
for you to have a well-grounded basic knowledge, it is essential to begin with the
manual approach to data analysis. You will learn further about technologically driven
analysis as you advance your studies.

Qualitative research is not a mechanical process; rather, it is a robust and flexible
process, which goes deeper into the meaning that underpins the data. An observation
made by Braun and Clarke (2006) is that analysing qualitative data is a back-and-forth
process, with the researcher moving back and forth between the data set, the coded
extracts of the data and the products of such analysis. The techniques used in
qualitative data analysis, as described by Jones, Torres and Arminio (2022), enable
researchers to notice and identify descriptive, common or unusual ideas that are
communicated by the participants in phrases or words, and to attach them to the
broader meaning of the phenomenon. The researcher breaks down the data into parts
and sub-parts, before rearranging it into a new whole (Jones et al., 2022). Keestra,
Uilhoorn and Zandveld (2022) are of the view that the first step in data analysis is to
check the raw data, to determine its suitability, since some of these raw data may
simply be false or useless. Central to data analysis, as noted by Braun and Clarke
(2006), is writing. The researcher should immediately begin by jotting down some
ideas and potential coding schemes, and continue doing so throughout the analysis
process.

8.5.4. Methods of qualitative data analysis

Just as there are diverse definitions of the concept of data analysis, there are also
several strategies for analysing data (Morse, 2020; Roulston, 2022), which include the
following: constant comparative analysis; phenomenological analysis; conversation
analysis; video analysis; content analysis; electronic analysis; narrative analysis and
discourse analysis (Maxwell & Chmiel, 2014; Mezmir, 2020; Roulston, 2022). Given
the scope of this section, we will not address each of these approaches beyond merely
describing them. You will learn more about them further in your studies in social
science research. Below, we provide only a brief explanation of each of the analysis
strategies. Our focus will be on thematic analysis, which is explained in more detail
further below.

8.5.4.1. Electronic analysis


With the proliferation of technological advancement, computer-assisted
qualitative data analysis, or electronic analysis, has become essential in our modern era
of digitisation. Computer-assisted qualitative data analysis has a major influence
on how analysis is done and will continue to do so in the future (Maxwell &
Chmiel, 2014). Some software packages for analysing qualitative data
are: ATLAS.ti 6.0 (www.atlasti.com); HyperRESEARCH 2.8 (www.researchware.com);
MAXQDA (www.maxqda.com); The Ethnograph 5.08; QSR N6
(www.qsrinternational.com); QSR NVivo (www.qsrinternational.com); Weft QDA
(www.pressure.to/qda); and Open Code 3.4 (www8.umu.se).

8.5.4.2. Narrative analysis

Narrative analysis is a type of analysis focusing on one person’s life, as told through
many interviews and interactions in the field. The focus is on features such as
gestures, sounds and the dynamics around their speech acts, with the ultimate
purpose of understanding their biographical stories (Katz-Buonincontro, 2022).
Another definition of narrative analysis is the one provided by Ntinda (2018), who holds
that narrative analysis refers to several procedures used to interpret the narratives
generated through research. Two forms of narrative analysis are formal structural
analysis and functional analysis. Formal structural analysis explores how the story
is structured, how it develops, and how it begins and ends. Functional analysis, on the
other hand, focuses on the function of the narrative:
what the narrative is doing or what the participant is conveying through the story
(Ntinda, 2018).

8.5.4.3. Constant comparative analysis


A classical definition of constant comparative analysis is the one provided by Glaser and
Strauss (cited in Fram, 2013). They consider constant comparative analysis as an
iterative and inductive analysis process, which involves the reduction of qualitative
data by means of constant recoding. Researchers begin with open coding, develop
categories from the first round of data reduction, and then further reduce and recode
the data, in order to allow the emergence of potential core categories (Fram,
2013).

8.5.4.4. Phenomenological analysis


A phenomenological analysis of qualitative data involves capturing, as closely as
possible, the way the participants experienced the phenomenon under investigation
(Isabirye & Makoe, 2018). In analysing the data through a phenomenological approach,
researchers get involved in a stepwise, rigorous procedure, involving the segmentation
of raw data into meaningful units, which are then restructured in terms of the meaning
clusters. Such clusters are then translated into scientific language, which is consistent
with their central meaning and the constituent themes that are common to all accounts
of the participants, as synthesised into a coherent description of an experience under
investigation (Isabirye & Makoe, 2018).

8.5.4.5. Conversation and discourse analysis


Conversation analysis is a qualitative data analysis strategy which is dedicated to
the investigation of social interactions as a means of producing and securing social
order. Its primary interest is the formal analysis of everyday situations (Mezmir, 2020).
Discourse analysis is grounded in constructionist theoretical approaches. Its focus
is on how researchers can study the creation of social reality in discourses (Mezmir,
2020).

8.5.4.6. Video analysis


You will recall that qualitative data can also be in the form of a video. This means that
video analysis is also one of the methods through which qualitative data can be
analysed. There are two forms of video analysis: standardised video analysis and
interpretive video analysis (Knoblauch, Tuma & Schnettler, 2014). Standardised
analysis involves coding video segments in terms of predetermined coding
schemes, based on theoretical assumptions (Knoblauch et al., 2014). Interpretive
video analysis rests on the researcher's assumption
that the actions that are recorded are guided by meanings that can be properly
understood by the participants themselves. In other words, researchers who are
inspired by the interpretive approach to video analysis tend to base their analysis on
the participants’ interpretations of their meanings (Knoblauch et al., 2014).

8.5.4.7. Content analysis


Content analysis is a qualitative analysis procedure, which involves the categorisation
of verbal or behavioural data, with the aim of classifying, summarising and tabulating it.
In analysing content, researchers often adopt a two-level approach: the descriptive
approach and the interpretative approach. A descriptive approach to content analysis
involves a description of the data, while an interpretative approach focuses on the
meaning of such data (Nigatu, 2009).

8.5.4.8. Thematic analysis


Thematic analysis is a commonly used form of qualitative data analysis; hence, it is
the central focus of our presentation. It involves a systematic segmentation,
categorisation, summarising and reconstruction of data, to ensure the emergence of
important concepts within the data (Lloyd, 2021). Thematic analysis seeks to provide
themes and subthemes as an outcome (Braun & Clarke, 2006). It is a qualitative data
analysis method for identifying, analysing and reporting patterns or themes within
the data; it minimally organises the data set and describes it in detail (Braun &
Clarke, 2006). Maxwell and Chmiel (2014) define thematic analysis
as a reduction and analysis strategy through which the data are segmented,
categorised, summarised and reconstructed, in such a way that they capture important
concepts of the data set. In other words, the process of thematic analysis involves
data reduction, data categorisation and data reorganisation, with the aim of developing
concepts contained therein (Maxwell & Chmiel, 2014; Roulston, 2022). Data reduction
involves getting rid of irrelevant or repetitive data material, in
order to remain with definable conceptual categories. Once the data is reduced,
one begins to categorise it by sorting and classifying the codes into thematic groupings
or clusters of meaning. Thematic analysis retains the connection of data to its original
context. These clusters of meaning will then be reorganised into thematic
representations, which are then supported by evidence in the form of excerpts from
the interviews (Roulston, 2022).

Several researchers have published work on the
qualitative thematic data analysis process (Creswell, 1994; Braun & Clarke, 2006;
Braun & Clarke, 2019). Researchers do not only hold different views regarding the
strategies and meaning of qualitative data analysis. They also hold different views
regarding the process through which data analysis should evolve. Below we provide
an outline of the process of analysis, as proposed by various authors.

Braun and Clarke’s (2006) approach to data analysis involves a six-stage process as
outlined below:

• Step 1: Familiarisation with the data: Here, researchers immerse
themselves in and become intimately familiar with their data by
repeatedly reading it (and listening to audio-recorded data at least once,
if relevant), while noting any initial analytic observations.

• Step 2: Coding: This is the generation of pithy labels for important data
features relevant to the research question guiding the analysis. Through
coding, researchers also capture both a semantic and conceptual
reading of the data. They code every data item, and collate all codes and
relevant data extracts.

• Step 3: Searching for themes: Searching for themes is a bit like coding
codes to identify similarity in the data. This ‘searching’ is an active
process; themes are constructed by the researcher by collating all the
coded data relevant to each theme.

• Step 4: Reviewing themes: This involves checking that the themes ‘work’
in relation to both the coded extracts and the full data set. The researcher
should reflect on whether the themes tell a convincing and compelling
story about the data, and begin to define each individual theme, and the
relationship between the themes. Some themes may be collapsed
together or split into two or more themes, or even be discarded
altogether, in which case the process of theme development begins again.

• Step 5: Defining and naming themes: This requires the researcher to
conduct and write a detailed analysis of each theme (the researcher
should ask, ‘What story does this theme tell?’ and ‘How does this theme
fit into the overall story about the data?’), identifying the ‘essence’ of
each theme and constructing a concise, punchy and informative name
for each theme.

• Step 6: Writing up: This means telling the reader a coherent and
persuasive story about the data, by weaving together the analytic
narrative and (vivid) data in relation to existing literature.

For Erlingsson and Brysiewicz (2017), data analysis evolves through a four-stage
process as outlined below:

• Step 1: Familiarising oneself with the data and the hermeneutic spiral
In familiarising themselves with the data and the hermeneutic spiral,
researchers read the transcripts repeatedly, while being mindful of the
research aim. At this stage, initial impressions are written down, while
trying to gain a sense of the data by breaking down the whole text into
smaller parts. Hermeneutics (different perspectives from parts of the data)
are compared, to determine whether the overall impression of the whole data
set verifies its parts. Once this initial step is completed, the researcher
proceeds to the next step.

• Step 2: Dividing the text into meaning units and condensing them
Once the researcher is familiar with the data and the hermeneutic spiral, they
divide the text into meaning units and begin to condense them.

• Step 3: Formulating the codes
In formulating codes, the researcher concisely describes the condensed
meanings of a unit and reflects afresh on the data. Here, the researcher
pays closer attention to the data, with limited interpretation of the content
thereof. The process also involves writing down ideas, as they come to
mind, and may be repeated several times, until the researcher is satisfied
with the codes.

• Step 4: Developing categories and themes from the codes
Once codes are created, they need to be sorted into categories by
comparing them, appraising them and clustering similar ones together.

Another strategy of analysis is the one developed by Nigatu (2009), which
involves five steps:

• Step 1: Organising data. This step involves the transcribing, translation
and cleaning of data. It is also a stage during which the researcher labels
the data by structuring it and familiarising themselves with it.

• Step 2: Identifying the framework. During this stage, the researcher reads
data repeatedly and identifies a framework, guided by either the
explanatory or exploratory design. The identified framework, which is a
coding plan, will then structure, label and define the data.

• Step 3: Sort data into framework. During this stage of analysis, the
researcher will code the data by modifying the framework.

• Step 4: Use framework in descriptive analysis. Here, descriptive analysis
is conducted using a range of responses and by identifying recurrent
themes. For exploratory research studies, step 4 is the last step of
analysis, while for explanatory studies, the analysis process proceeds to
step 5.

In the preceding presentation, we outlined the data analysis steps from various
authors. You would have noticed that some of the stages are common in the process
proposed by all authors. These commonalities confirm what was noticed by Braun and
Clarke (2006), that the stages of qualitative data analysis are not necessarily unique,
even to thematic analysis itself. In other words, some of the stages may also be found
to be similar, for instance, to discourse analysis. Familiarisation with the data, for
instance, is one example of these commonalities, which is found in both Braun and
Clarke’s (2006) and Erlingsson and Brysiewicz’s (2017) approaches. Due to the
purpose and scope of this learning unit, we will only go deeper into one of the above
methods, to practically demonstrate how the data analysis is conducted. For such
purposes, we will focus on Braun and Clarke’s approach.

A step-by-step process of qualitative analysis as proposed by Braun and Clarke (2006)

STEP 1: Familiarising yourself with your data

Before you begin to familiarise yourself with the data, you need to ensure that such
data is properly prepared and readable. The data that you will be analysing may
have been given to you by research assistants or a team of your field workers. It
may be in the form of verbal interviews (audio recordings) or speeches, or in the
form of text in documents. Data which is not in text form will have to be transcribed
into text (Braun & Clarke, 2006). To transcribe means to transform spoken language
into text (Mayring, 2022). Reading box 9.1, below, is an example of a transcription.

Reading box 9.1: An example of a transcription

(Please note: The name of the participant is replaced with a code (P-1) in order to ensure her
anonymity, as required by the research ethical principles. The numbers used next to the
alphabetical code refer to the line on the page where the remarks were found.)

B1 Researcher: Tell me about your experiences and where does that come from?
B2 Why are you caring for people?
B3 P-1: I think when you’re young and you’re trying to find yourself, many different things, and I found,
B4 I lived in Laudium, I worked there. I lived in Lotus for a little while, not very long. And I found in the
B5 areas that I lived in, the first area I was in was Laudium; I chose it because it’s very quiet. And we
B6 were a very open family, very open to discussions and stuff, and then I found that there was stuff
B7 they were doing there that really you don’t do. So, in that sense I used to be this figure of talking
B8 and always telling friends we don’t do it that way; let’s try. And from there I moved to Lotus and
B9 there I found a lot of shebeens and drug abuse. I found many children where the mother or the
B10 father was not there, and I found myself lost.
B13 Researcher: Okay, tell me about your...?
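
The replacement of a participant's name with a code, as done in reading box 9.1, can also be done programmatically once a transcript is in text form. The following is only a minimal sketch in Python, not a prescribed procedure: the function name `pseudonymise` and the participant name "Thandi" are hypothetical, invented purely for illustration.

```python
import re


def pseudonymise(transcript, participant_names):
    """Replace each participant name with a code such as P-1, P-2, ...

    Returns the cleaned transcript and the name-to-code key, which the
    researcher would store separately from the transcript.
    """
    key = {name: f"P-{i}" for i, name in enumerate(participant_names, start=1)}
    for name, code in key.items():
        # \b restricts the match to whole words only.
        transcript = re.sub(rf"\b{re.escape(name)}\b", code, transcript)
    return transcript, key


# "Thandi" is a made-up name used only for this illustration.
raw = ("Researcher: Why are you caring for people?\n"
       "Thandi: I lived in Laudium, I worked there.")
clean, key = pseudonymise(raw, ["Thandi"])
print(clean)
```

A sketch like this only replaces the names it is given; the researcher would still need to read the transcript for other identifying details.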

If you have collected the data yourself, you would, of course, have some
prior knowledge of, or a little familiarity with, it. This, however, does not mean you can
immediately begin analysing without further familiarising yourself with it. Braun and
Clarke (2006) propose that you immerse yourself in the data to the extent that you are
familiar with its breadth and depth. You should do so by actively and repeatedly
reading through the data, while searching for meanings, patterns and so on.

STEP 2: Generating initial codes

Now that you have familiarised yourself with the data by repeatedly reading through it,
you will then begin to produce some codes (a code is the result or product of coding).
As defined by Mayring (2022), coding refers to the inductive or deductive process of
identifying categories in the text. How coding proceeds is determined by whether the
researcher follows a data-driven approach (inductive approach) or a theory-driven
approach (deductive approach) (Braun & Clarke, 2006).
A data-driven approach to qualitative data analysis involves developing themes purely
from the data set, while the theory-driven approach involves developing themes based on
a prior set of ideas or questions. In
coding, researchers work systematically through each data set, writing notes on the
text and highlighting potential patterns for easy identification of data segments. You will
do this throughout the data set and then collate similar codes. Now that you have gained
insight into the second step, complete activity 8.8 below.

Activity 8.8: Coding exercise

Pay closer attention to the picture. Looking at the four individuals, what do you
think they are doing? What do you think about the stickers on the wall? What could
be written thereon? In your view, what is the whole exercise about?

Figure 8.1: Generating initial codes

Reading box 9.2, below, serves as a demonstration of how the coding process is done.
This is an extract from an interview transcript on the experiences that home-based
caregivers, who are caring for people living with HIV, encounter as they perform their
duties.

Reading box 9.2: An example of the generation of initial codes

N36 Researcher: And what is it that you hate about being a caregiver.
N37 Ms N: What I hate is when we walk and as you are still knocking at a particular household you get
N38 words such as “no we don’t have a patient here” even before you greet and introduce yourself.
N39 Researcher: How do they know that you are there for the patient?
N40 Ms N: These people talk, once we leave the households friends and neighbours would sneak in
N40 and say, “but why did you allow these people to come into your house, don’t you know that these
N41 people work with AIDS.” One of the patients’ mother had to tell us that her neighbours asked her
N42 why did she allow us in because we are working with AIDS. And what we would tell them is that we
N43 do not work only with AIDS. We work with all patients. She must also call us if she has a patient
N44 because sometimes you would find those people who has stroke left alone in the house, so who
N45 will look after that patient, no one. Once we get there we must bath him/her, help him/her do some
N46 exercise and feed him/her.
N47 Researcher: And how does that feel when you are not welcome in the houses?
N48 Ms N: It is painful. As you leave that household you will feel discouraged although you would
N49 console yourself that you are here to work and people are not the same. So we would not loose
N50 courage, we would go to the next house because we are here to work and we are here to help the
N51 community. We don’t care about those who don’t want us, one day they will need us. Many of them
N52 used to chase us away from their houses but eventually they would come referred to our offices
N53 seeking help by those who were our patients before. When you get there you realise that you were
N54 once informed about that patient but could not assist because you were chased away. As you leave
N55 a household where you are chased away, you would feel pain because you would be thinking about
N56 that patient who is hidden, without food, not bathed and often left alone. It is very painful because
N57 we are there for such kinds of patients.

Let us reflect briefly on reading box 9.2. From the highlighted data sets, some patterns
can be noticed. The orange highlights, for instance, reflect the words expressed by
the participant in relation to the bad treatment that they received as caregivers
when they visited the households. A common pattern here is that they were not welcome.
Looking at the green pattern, you will notice a coping strategy, which takes
the form of being resilient and continuing to do the work that they are called to do,
despite all the negativity they are confronted with. Moving on to the yellow highlights,
one notices that the pattern that emerges is a description of their main duties as caregivers.
Regarding the red highlights, one gets a sense of the pain associated with the
knowledge of the hardships that the patients are going through. This is an example of
a typical coding process that researchers conduct as part of the analysis process. It
will be expected of you to do the same with all of your data sets, searching for
patterns, line by line, until all of your data sets have been examined. The codes
that may be extracted from the above exercise could be:

(a) Reception by members of the patient’s family
(b) Ignore negative experiences and proceed with work
(c) Main duties of caregivers
(d) Painful experience based on patients’ conditions

Remember, the process is not cast in stone. Different researchers can come up with
different codes, based on the same data. Now that you have familiarised yourself with
the data and developed the codes, the next step is to search for themes.
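
The collating of codes and their data extracts, as described in Step 2, can be sketched in a few lines of Python for readers who keep their coded extracts electronically. This is only an illustrative sketch under stated assumptions: the pairing of each extract with a code is our own reading of reading box 9.2 (another researcher could code the same lines differently), and the name `collate_by_code` is hypothetical.

```python
from collections import defaultdict

# Each entry: (line label where the extract starts, assigned code, extract).
# The assignments are one possible reading of reading box 9.2.
coded_extracts = [
    ("N38", "Reception by members of the patient's family",
     "no we don't have a patient here"),
    ("N40", "Reception by members of the patient's family",
     "but why did you allow these people to come into your house"),
    ("N45", "Main duties of caregivers",
     "we must bath him/her, help him/her do some exercise and feed him/her"),
    ("N50", "Ignore negative experiences and proceed with work",
     "we are here to work and we are here to help the community"),
    ("N55", "Painful experience based on patients' conditions",
     "you would feel pain because you would be thinking about that patient"),
]


def collate_by_code(extracts):
    """Group all extracts under the code they were assigned."""
    collated = defaultdict(list)
    for line_label, code, text in extracts:
        collated[code].append((line_label, text))
    return dict(collated)


collated = collate_by_code(coded_extracts)
for code, items in collated.items():
    print(code)
    for line_label, text in items:
        print(f"  [{line_label}] {text}")
```

Keeping the line label with every extract preserves the link between a code and its original place in the transcript, which is useful when the extracts are later quoted as evidence.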

STEP 3: Searching for themes

During this stage, the analysis expands from narrowly focused coding to the
broader level of themes. It involves sorting through the different codes, to identify
potential themes, and collating all applicable data within each of the relevant themes.
Braun and Clarke (2006) recommend the use of visual presentations, such as tables,
mind-maps or piling and organising pieces of paper into theme piles (just like you saw
in figure 8.2 above). During this process of sorting the codes into themes, it is possible
that some of the codes may be retained as themes, while others may become
subthemes.
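
The sorting of codes into candidate themes can also be kept in a simple table-like structure. The sketch below, in Python, is only one hypothetical grouping of the four codes from Step 2; the theme names and the variable names (`themes`, `all_codes`) are assumptions made for illustration, and a different researcher might group the codes differently.

```python
# The four codes generated in Step 2.
all_codes = [
    "Reception by members of the patient's family",
    "Ignore negative experiences and proceed with work",
    "Main duties of caregivers",
    "Painful experience based on patients' conditions",
]

# One possible (hypothetical) grouping: theme -> codes placed under it.
# A code placed alone under a theme of the same name has, in effect,
# been retained as a theme in its own right.
themes = {
    "Negative experiences of caregivers": [
        "Reception by members of the patient's family",
        "Painful experience based on patients' conditions",
    ],
    "Coping strategies": [
        "Ignore negative experiences and proceed with work",
    ],
    "Main duties of caregivers": [
        "Main duties of caregivers",
    ],
}

# Check that no code has been left out of the sorting exercise.
placed = {code for codes in themes.values() for code in codes}
unplaced = [code for code in all_codes if code not in placed]

for theme, codes in themes.items():
    print(f"{theme}: {len(codes)} code(s)")
print("Codes still to be placed:", unplaced)
```

The check for unplaced codes mirrors what a researcher does by hand when making sure that every slip of paper has ended up in a theme pile.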

Reading box 9.3: An example of the process of searching for themes

As we continue with the extract from reading box 9.2, one might, based on the list of four codes that
were created, begin to search for a theme. The researcher may decide to cluster the two codes
‘Reception by members of the patient’s family’ and ‘Painful experience based on patients’ conditions’
together under one theme, which is ‘the negative experiences of caregivers
when interacting with members of the patients’ families’. Under this theme, the two subthemes
could be ‘negative reception by family members of the patients’ and ‘the impact of patients’
conditions on caregivers’ emotional state’.

The same exercise may be followed with the remaining two codes: ‘coping strategies’ and ‘main
duties of caregivers’, which may be elevated to the state of being themes by themselves. Upon
searching through your data set, you might come up with the following themes and subthemes:

Theme 1: The negative experiences of caregivers when interacting with members of the
patients’ families

1. Subtheme 1.1: Negative reception by family members of the patients


when we walk and as you are still knocking at a particular household you get words such as, “no we
don’t have a patient here”, even before you greet and introduce yourself….

Once we leave the households friends and neighbours would sneak in and say, “but why did you
allow these people to come into your house, don’t you know that these people work with AIDS.” One
of the patients’ mother had to tell us that her neighbours asked her why she allowed us in because
we are working with AIDS.

2. Subtheme 1.2: The impact of patients’ conditions on caregivers’ emotional state


you would feel pain because you would be thinking about that patient who is hidden, without food,
not bathed and often left alone. It is very painful because we are there for such kinds of patients.

Theme 2: The main duties performed by caregivers

Once we get there, we must bath him/her, help him/her do some exercise and feed him/her.

Remember, this is just an example. In a real situation, your themes and subthemes
will be supported by voluminous extracts from the interviews. Also remember that the
themes developed at this stage are not conclusive. They are what Braun and
Clarke (2006:20) call “candidate themes” until they have been properly reviewed,
which is the purpose of the next stage.
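The candidate themes from reading box 9.3 can be written down as a small hierarchy. This is a sketch only: the `Theme` class and the `outline` helper are hypothetical names introduced for illustration, and the structure mirrors the two themes listed in the reading box.

```python
from dataclasses import dataclass, field


@dataclass
class Theme:
    """A candidate theme with optional subthemes and the codes sorted into it."""
    name: str
    codes: list = field(default_factory=list)
    subthemes: list = field(default_factory=list)

    def all_codes(self):
        """Return every code under this theme, including those in subthemes."""
        collected = list(self.codes)
        for subtheme in self.subthemes:
            collected.extend(subtheme.all_codes())
        return collected


def outline(theme, depth=0):
    """Return an indented text outline of a theme and its subthemes."""
    lines = ["  " * depth + theme.name]
    for subtheme in theme.subthemes:
        lines.extend(outline(subtheme, depth + 1))
    return lines


candidate_themes = [
    Theme(
        "The negative experiences of caregivers when interacting with "
        "members of the patients' families",
        subthemes=[
            Theme("Negative reception by family members of the patients",
                  codes=["Reception by members of the patient's family"]),
            Theme("The impact of patients' conditions on caregivers' "
                  "emotional state",
                  codes=["Painful experience based on patients' conditions"]),
        ],
    ),
    Theme("The main duties performed by caregivers",
          codes=["Main duties of caregivers"]),
]

for theme in candidate_themes:
    print("\n".join(outline(theme)))
    print("  codes:", theme.all_codes())
```

Because a code can sit either directly under a theme or under one of its subthemes, the same structure covers codes that are retained as themes and codes that become subthemes.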

STEP 4: Reviewing the themes


As researchers engage in the review process, they might find that some of the
candidate themes are not really themes, and that some should be collapsed into each
other or broken down into separate themes (Braun & Clarke, 2006). The process of
reviewing themes happens on two levels. Firstly, it involves reading all the collated
extracts for each theme, to ensure that they are coherent. Where the extracts cohere,
the researcher moves on to the next theme; where they do not, the researcher will
have to decide whether to rework the problematic theme or relocate the extracts
to another theme (Braun & Clarke, 2006). Secondly, the review process involves
verifying whether or not the individual themes are an accurate reflection of the
meanings embedded within the data set, with the accuracy thereof dependent on the
analytical approach adopted by the researcher (Braun & Clarke, 2006).
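Judging coherence remains the researcher's task, but some of the bookkeeping behind the two levels of review can be sketched in code. The example below assumes each extract has a short identifier such as "E1". The function `review_themes` and the threshold `min_extracts` are hypothetical and are not part of Braun and Clarke's guidance; they only flag where a closer reading may be needed.

```python
def review_themes(theme_extracts, all_extract_ids, min_extracts=2):
    """Flag themes with little support and extracts not placed under any theme.

    theme_extracts: dict mapping a theme name to a list of extract identifiers.
    all_extract_ids: every coded extract identifier in the data set.
    min_extracts: hypothetical threshold below which a theme is flagged.
    """
    # Level one: themes with very few extracts deserve a closer reading.
    thin_themes = [name for name, ids in theme_extracts.items()
                   if len(ids) < min_extracts]

    # Level two: extracts that no theme accounts for suggest that the
    # themes may not yet reflect the whole data set.
    assigned = set()
    for ids in theme_extracts.values():
        assigned.update(ids)
    unassigned = [eid for eid in all_extract_ids if eid not in assigned]

    return thin_themes, unassigned


# Illustrative data: the identifiers are placeholders, not real extracts.
theme_extracts = {
    "Family members' reaction to the caregivers": ["E1", "E2", "E3"],
    "The core duties of a caregiver": ["E4"],
}
all_extract_ids = ["E1", "E2", "E3", "E4", "E5"]

thin, unassigned = review_themes(theme_extracts, all_extract_ids)
print("Themes to re-read:", thin)
print("Extracts without a theme:", unassigned)
```

A flagged theme is not necessarily a weak one; the output is only a prompt to go back to the extracts and decide whether to rework, merge or split the theme.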

Reading box 9.4: An example of how researchers review themes

As indicated earlier, the process of reviewing themes takes two forms. In the first instance,
researchers spend time reading through the collated extracts under each theme, to ensure
that they are coherent. The main focus is on the extracts (storylines).

1. Theme 1: Family members’ reaction to the caregivers

1.1. Subtheme 1.1: Negative reception by family members

when we walk and as you are still knocking at a particular household you get words such as, “no we
don’t have a patient here”, even before you greet and introduce yourself….

Once we leave the households friends and neighbours would sneak in and say, “but why did you
allow these people to come into your house, don’t you know that these people work with AIDS.” One
of the patients’ mother had to tell us that her neighbours asked her why she allowed us in because
we are working with AIDS.

1.2. Subtheme 1.2: Emotional impact of patients’ conditions on caregivers

you would feel pain because you would be thinking about that patient who is hidden, without food,
not bathed and often left alone. It is very painful because we are there for such kinds of patients.

2. Theme 2: The core duties of a caregiver


And what we would tell them is that we do not work only with AIDS. We work with all patients. She
must also call us if she has a patient

“you would console yourself that you are here to work, and people are not the same. So, we would
not loose courage, we would go to the next house because we are here to work and we are here to
help the community. We don’t care about those who don’t want us, one day they will need us”.

Secondly, you will need to shift your focus to the themes themselves, to see
whether they reflect the meanings embedded within the data set and, if not, you will have to
correct them accordingly.

STEP 5: Defining and naming themes


Once the researcher is satisfied with the themes developed in step 3 and reviewed
in step 4, they proceed to the fifth step, which is to define and name the themes.
In defining and naming the themes, researchers identify the essence of what each
theme is about (Braun & Clarke, 2006). At this stage, the researcher revisits the data
extracts for each theme and organises them coherently and consistently, accompanied
by the participants’ narrative accounts, and, where necessary, renames the themes
and subthemes, making sure that each name is concise and punchy (Braun & Clarke,
2006).

Reading box 9.5: An example of how a researcher could rename the themes

In renaming the themes, researchers are guided by the essence of the meanings embedded in
the data extracts as well as the overall coherence of other themes. Following the example
provided in step 3, one might rename the themes as follows:

1. Theme 1: Treatment received by caregivers

1.1. Subtheme 1.1: Negative treatment from families


when we walk and as you are still knocking at a particular household you get words such as, “no we
don’t have a patient here”, even before you greet and introduce yourself….

Once we leave the households friends and neighbours would sneak in and say, “but why did you
allow these people to come into your house, don’t you know that these people work with AIDS.” One
of the patients’ mother had to tell us that her neighbours asked her why she allowed us in because
we are working with AIDS.

1.2. Subtheme 1.2: Caregivers’ concerns about the patients’ conditions


you would feel pain because you would be thinking about that patient who is hidden, without food,
not bathed and often left alone. It is very painful because we are there for such kinds of patients.

2. Theme 2: The nature of caregiving


And what we would tell them is that we do not work only with AIDS. We work with all patients. She
must also call us if she has a patient

“you would console yourself that you are here to work, and people are not the same. So, we would
not loose courage, we would go to the next house because we are here to work and we are here to
help the community. We don’t care about those who don’t want us, one day they will need us”.

Remember, there are no hard-and-fast rules. What should guide researchers is the extracts
as a whole, read in the context of the other themes. You might come up with different themes,
as long as they accurately reflect the essence of your storylines or extracts.

Once the themes are accurately named and defined, the researcher will then move to
the last step, which is the reporting stage.

STEP 6: Producing the report


Based on the final themes, as supported by the extracts, the researcher will now begin
to tell the story of the data coherently, logically, concisely and without repetition.
The themes should be supported by enough data extracts or storylines, which serve
as evidence of the prevalence of a particular theme (Braun & Clarke, 2006). Although
Braun and Clarke do not recommend a format for a data analysis report, they suggest
that such a report should be a scholarly one. In other words, your data should be
presented logically and coherently, in the context of the literature, including the
adopted theories. It is in the report that your interpretation of the data happens. In
terms of structure, a research report on data presentation will generally open with an
introduction, followed by a description of the biographical profiles of the research
participants. After the biographical profiles, the themes and subthemes are presented.
An example of the way a report can be presented is provided in reading box 9.6 below:

Reading box 9.6: Reporting

In reporting, the following example may serve as a guideline:

1. Introduction

Through this study, the researcher sought to understand the experiences and challenges
faced by caregivers when rendering services to people living with HIV in the province of
Gauteng, South Africa. As part of the study, participants answered five questions, posed
in semi-structured interviews, about their experiences as caregivers. The findings of the
study are presented in this section of the report, in the form of biographical profiles of
the participants as well as the themes and subthemes that emerged from the process of
data analysis.

Biographical profiles¹⁹ of the participants

A total of fifteen caregivers were identified and recruited to take part in this study. Of the
fifteen, six were males and nine were females. Their ages varied between 24 and 55 years
old. Three of them were 55 years old, three were 30, 31 and 44, while two were 24 and 50,
respectively. Four of them were 34, 36, 37 and 39, respectively, while three were 40, 45
and 33, respectively. Looking at their ages, one gets a sense of an intergenerational mix.

Each of the features (age, gender, race, etc.) should be discussed in the context of
existing literature. You must clearly explain what other studies found regarding such
features and how similar to or different from your own findings they are, and then
draw your own conclusion.

Note: The biographical profiles of the participants can include various features,
depending on the purpose of the study. One might include things like the sources of
income, family composition, educational credentials, work experience and others.
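The ages reported in the example profile above can be summarised with the measures of central tendency and variability covered earlier in this learning unit. The sketch below uses Python's standard `statistics` module; the list `ages` is transcribed from the example paragraph, and the variable names are illustrative.

```python
import statistics

# Ages of the fifteen participants, as listed in the example profile above.
ages = [55, 55, 55, 30, 31, 44, 24, 50, 34, 36, 37, 39, 40, 45, 33]

n = len(ages)                          # number of participants
mean_age = statistics.mean(ages)       # arithmetic mean
median_age = statistics.median(ages)   # middle value of the sorted ages
mode_age = statistics.mode(ages)       # most frequent age
age_range = max(ages) - min(ages)      # highest age minus lowest age

print(f"n = {n}")
print(f"mean = {mean_age:.2f}")        # 608 / 15, printed as 40.53
print(f"median = {median_age}")        # 39
print(f"mode = {mode_age}")            # 55
print(f"range = {age_range}")          # 55 - 24 = 31
```

A summary of this kind can support the statement about the spread of ages in the profile, alongside the discussion of the literature.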

¹⁹ The biographical profile of the participants (i.e., their ages, socioeconomic conditions, educational
qualifications) is one of the crucial ways in which the context is enhanced.


2. Themes and subthemes that emerged from data analysis

Regarding the findings of the thematic analysis, two main themes and two subthemes
emerged. These themes and subthemes are introduced and explained below.

2.1. Theme 1: Treatment received by caregivers

2.1.1. Subtheme 1.1: Negative treatment from families


when we walk and as you are still knocking at a particular household you get words
such as, “no we don’t have a patient here”, even before you greet and introduce yourself….

Once we leave the households friends and neighbours would sneak in and say, “but why
did you allow these people to come into your house, don’t you know that these people work
with AIDS.” One of the patients’ mother had to tell us that her neighbours asked her why did
she allow us in because we are working with AIDS.

2.1.2. Subtheme 1.2: Caregivers’ concerns about the patients’ conditions


you would feel pain because you would be thinking about that patient who is hidden, without
food, not bathed and often left alone. It is very painful because we are there for such kinds
of patients.

2.2. Theme 2: The nature of caregiving


And what we would tell them is that we do not work only with AIDS. We work with all patients.
She must also call us if she has a patient

As you did with the biographical profiles of the participants, you need to explain the
findings under each theme, using existing literature as well as your adopted
theoretical framework. You must explain the findings in the context of such literature
and the theory/theories.
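The numbering used for the themes and subthemes in the reading box above can be generated from the final theme structure. This is a rough sketch: the function `render_findings`, the dictionary layout and the abbreviated extracts are assumptions made for illustration, and the output is only a skeleton to which the discussion of literature and theory still has to be added.

```python
def render_findings(themes, section=2):
    """Return numbered heading lines, each followed by its supporting extracts."""
    lines = []
    for t_idx, theme in enumerate(themes, start=1):
        lines.append(f"{section}.{t_idx}. Theme {t_idx}: {theme['name']}")
        for extract in theme.get("extracts", []):
            lines.append(f'    "{extract}"')
        for s_idx, sub in enumerate(theme.get("subthemes", []), start=1):
            lines.append(
                f"{section}.{t_idx}.{s_idx}. Subtheme {t_idx}.{s_idx}: {sub['name']}"
            )
            for extract in sub.get("extracts", []):
                lines.append(f'    "{extract}"')
    return lines


# Final themes from the example report, with abbreviated extracts.
final_themes = [
    {
        "name": "Treatment received by caregivers",
        "subthemes": [
            {"name": "Negative treatment from families",
             "extracts": ["no we don't have a patient here"]},
            {"name": "Caregivers' concerns about the patients' conditions",
             "extracts": ["It is very painful because we are there for such "
                          "kinds of patients."]},
        ],
    },
    {
        "name": "The nature of caregiving",
        "extracts": ["We work with all patients."],
    },
]

report_lines = render_findings(final_themes)
print("\n".join(report_lines))
```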

8.6. CONCLUSION

In this learning unit, we introduced you to several techniques for analysing and
interpreting quantitative and qualitative data. From the quantitative research approach,
the key methods discussed included descriptive statistics and their purpose in data
analysis. We also discussed frequency distribution tables and graphs as methods of
summarising and organising quantitative data. The measures of central tendency were
discussed in detail, to help you understand which measure may be useful for your
data; by now you should be able to calculate the measures of central tendency using
the formulas provided. We also discussed the measures of variability and their
purpose in data analysis. The unit further covered a brief discussion of correlations
and how graphical representations from correlational analysis can be interpreted.
From a qualitative research perspective, we introduced the meaning of qualitative
data analysis and its purpose. We also introduced, defined and explained the various
strategies or methods that can be used to assist you in analysing various forms of
qualitative data, and provided relevant practical scenarios to further enhance your
knowledge. To assist you in measuring your level of understanding, we have provided
a self-evaluation exercise. Please work through the questions with the necessary
dedication and contact us if you have any questions or need clarity.


8.7. SELF-EVALUATION ASSESSMENT

This section aims to test your level of understanding of the content presented in this
learning unit.

• Are you able to define data analysis in your own words?

• Are you able to differentiate between quantitative and qualitative data analysis?

• Are you able to define descriptive statistics, in your own words?

• Are you able to provide a definition of frequency distribution?

• Are you able to indicate which data can be suitably used for graphs and tables?

• Are you able to differentiate between a histogram and frequency polygon?

• Are you able to differentiate between the measures of central tendency and
describe their purpose in data analysis?

• Are you able to define the measures of variability and describe their purpose in
data analysis?

• Are you able to define the concept of correlation and describe the context in
which you can use correlations to analyse your data?

• Are you able to define a scatter plot?

• Are you able to define the term qualitative data analysis?

• Are you able to discuss qualitative data analysis?

• Are you able to explain the purpose of qualitative data analysis?

• Are you able to explain qualitative data analysis and interpretation?

• Are you able to identify and define various methods of qualitative data analysis?

• Are you able to compare various methods of qualitative data analysis?

• Are you able to discuss thematic analysis?

• Are you able to explain how thematic analysis is implemented, following a step-
by-step process?


8.8. ADDITIONAL LEARNING EXPERIENCES

This section aims to enhance your learning experience on some of the learning
outcomes addressed in this learning unit. Please use the links below to watch
YouTube videos after reading the learning unit and answering the self-evaluation
assessment questions.

YouTube links

https://youtu.be/YnjR9WTKHEc - Measures of central tendency

https://youtu.be/zsnLm87AJVU - Measures of central tendency

https://youtu.be/3CPmjC6qBeg - Measures of variability

https://youtu.be/fO7NF3sss34 - Measure of variability (range)

How to Analyze Qualitative Data - YouTube (Qualitative data analysis)

Overview - ATLAS ti 22 Windows - Bing video. (Electronic data analysis - Atlas ti 6.0)

Narrative Analysis In Qualitative Research: Simple Explainer (With Examples) -
YouTube (Narrative Analysis)

Constant Comparative Analysis in Qualitative Research - YouTube. (Constant
comparative analysis)

Interpretative Phenomenological Analysis - Mostackas, Coalizi, and Smith - YouTube
(Phenomenological analysis)

Conversation Analysis - YouTube (Conversational analysis)

Qualitative Content Analysis 101: The What, Why & How (With Examples) - YouTube
(Content analysis)

Discourse Analysis - YouTube (Discourse analysis)

Thematic Analysis of Qualitative User Research Data - YouTube (Thematic analysis).

8.9. OPEN EDUCATIONAL RESOURCE

Jhangiani, R.S., Chiang, I. A., Cuttler, C., & Leighton, D.C. 2019. Research methods
in psychology. Kwantlen Polytechnic University.
https://kpu.pressbooks.pub/psychmethods4e/


Flick, U. 2014. The SAGE Handbook of qualitative data analysis. London: Sage.
https://methods.sagepub.com/book/the-sage-handbook-of-qualitaive-data-analysis.


8.10. REFERENCES

Bhome, S., Chandwani, V., Iyer, S., Prabhudesai, A., Jha, N., Desai, S., & Koshti, S.D.
2013. Research methodology. Himalaya publishing house.

Erlingsson, C. & Brysiewicz, P. 2017. A hands-on guide to doing content analysis.
African Journal of Emergency Medicine, 7(1):93-99.

Fram, S.W. 2013. The constant comparative analysis method outside of grounded
theory. The Qualitative Report, 18(1):1-25.

HSRC. 2010. South African social attitudes. Human Sciences Research Council.

Ibrahim, M. 2015. The art of analysis. Journal of Allied Health Sciences Pakistan,
1(1):98-104.

Igatu, T. 2009. Qualitative data analysis.
http://www.uop.edu.pk/ocontents/Lecture%201%20B%20Qualitative%20Research.pdf.
(Accessed on 5 April 2023).

Isabirye, A.K., & Makoe, M. 2018. Phenomenological analysis of the lived experiences
of academics who participated in the professional development programme at an open
distance learning (ODL) university in South Africa. Indo-Pacific Journal of
Phenomenology, 18(1):1-11.

Lee, J. 2019. Statistics, Descriptive. In: A. Kobayashi (Ed.), International Encyclopedia
of Human Geography. Kent, OH: Elsevier. (pp. 13–20)
https://doi.org/10.1016/B978-0-08-102295-5.10428-7

Jhangiani, R.S., Chiang, I. A., Cuttler, C., & Leighton, D.C. 2019. Research methods
in psychology. Canada: Kwantlen Polytechnic University.
https://kpu.pressbooks.pub/psychmethods4e/

Keestra, M., Uilhoorn, A., & Zandveld, J. 2022. An introduction to interdisciplinary
research. 2nd edition. Amsterdam: Amsterdam University.

Knoblauch, H., Tuma, R., & Schnettler, B. 2014. Video analysis and videography. In:
U. Flick (ed). The Sage handbook of qualitative data analysis. London: Sage. (pp. 21-34)

Lloyd, A. 2021. The qualitative landscape of information literacy research:
perspectives, methods and techniques. London: Facet.

Manikandan, S. 2011. Frequency distribution. Journal of Pharmacology &
Pharmacotherapeutics, 2(1), 54. https://doi.org/10.4103/0976-500X.77120

Mayring, P. 2022. Qualitative content analysis: a step-by-step guide. London: Sage.


Maxwell, J.A. & Chmiel, M. 2014. Notes toward a theory of qualitative data analysis.
In U. Flick (ed), The Sage handbook of qualitative data analysis. London: Sage. (pp.
21-34)

Mezmir, E.A. 2020. Qualitative data analysis: an overview of data reduction, data
display and interpretation. Research on Humanities and Social Sciences, 10(21):15-27.

Morse, J. 2020. The changing face of qualitative inquiry. International Journal of
Qualitative Methods, 19(1):1-7. doi: 10.1177/1609406920909938

Moule, P. 2021. Making sense of research in nursing health and social care. London:
Sage.

Ngulube, P. 2015. Qualitative data analysis and interpretation: systematic search for
meaning. In E.R. Mathipa & M.T. Gumbo (eds), Addressing research challenges:
Making headway for developing researchers. Mosala-MASEDI Publishers &
Booksellers cc: Noordwyk, pp. 131-156.

Ntinda, K. 2018. Narrative therapy.
file:///C:/Users/lekgamr/Downloads/NtindaK2018NarrativeResearchInLiamputtongPedsHandbookofResearchMethodsinHealthSocialSciences.SpringerSingapore.pdf.
(Accessed on 05 April 2023).

Peck, R., Olsen, C., & Devore, J. L. 2015. Introduction to statistics and data analysis.
Cengage Learning.

Rae, I. n.d. Ethnographic data analysis.
https://pages.cs.wisc.edu/~irene/l17-ethnography_analysis.pdf. (Accessed on 05 April 2023).

Roulston, K. 2022. Interviewing: a guide to theory and practice. London: Sage.

Yegidis, B.L., Weinbach, B.W. & Myers, L.L. 2018. Research methods for social
workers. 8th Edition. New York: Pearson.
