Hamisi

This research proposal aims to develop a multiple linear regression model to analyze the determinants of youth unemployment in Kenya, utilizing data from the Kenya National Bureau of Statistics. The study will identify key factors such as educational level, gender, and economic growth that influence unemployment rates among youths, and it intends to provide insights for policymakers to create targeted interventions. The research is significant for enhancing employability and informing labor market policies, ultimately contributing to economic growth and job creation in Kenya.

Uploaded by

ellyotieno856
FACULTY OF APPLIED AND HEALTH SCIENCES

DETERMINANTS OF UNEMPLOYMENT IN KENYA: MULTIPLE LINEAR
REGRESSION APPROACH

PRESENTED BY:

1. ALLAN MATECHE BSSC/383J/2019
2. TOBIAS OTIENO BSSC/388J/2019
3. OMBUI HAMISI BSSC/406J/2019
4. KIBET NG’NO EVANS BSSC/366J/2019
5. GIDEON KIPROTICH BSSC/365J/2019

A RESEARCH PROPOSAL SUBMITTED TO THE DEPARTMENT OF
MATHEMATICS AND PHYSICS IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
AWARD OF BACHELOR OF SCIENCE IN STATISTICS AND COMPUTER SCIENCE

©AUGUST, 2023

DECLARATION

STUDENTS’ DECLARATION

This research proposal is our original work and has not been presented to any other academic
institution for the award of a degree.

1. TOBIAS OTIENO BSSC/388J/2019

DATE ……………………………………SIGN……………………………………

2. ALLAN MATECHE BSSC/383J/2019

DATE ……………………………………SIGN……………………………………

3. OMBUI HAMISI JOHN BSSC/406J/2019

DATE ……………………………………SIGN……………………………………

4. KIBET NGENO EVANS BSSC/366J/2019

DATE ……………………………………SIGN……………………………………

5. GIDEON KIPROTICH BSSC/365J/2019

DATE ……………………………………SIGN……………………………………

SUPERVISOR’S DECLARATION

This research proposal has been submitted with my approval as the supervisor.

Dr. CYRILUS WANDERA

DATE ……………………………………SIGN……………………………………

DEDICATION

This proposal is dedicated to our friends, families and lecturers, who have tirelessly
contributed their time, expertise and support to developing it. Their unwavering
dedication and commitment have been instrumental in shaping this proposal and bringing it to
fruition. We extend our heartfelt appreciation to our team members, collaborators, mentors
and all stakeholders involved for their invaluable contributions and their belief in the
vision of this proposal. Their dedication inspires us to strive for excellence and create
meaningful impact.

ACKNOWLEDGEMENT

Words cannot express our gratitude to Almighty God for guiding us to the successful completion
of this proposal. Thanks should also go to all those who have contributed to the preparation
and development of this proposal, including our able supervisor, Dr. Cyrilus Wandera, whose
invaluable support and expertise brought this document to fruition. We would also like to
acknowledge the valuable contributions of our group members, whose tireless efforts,
creativity and collaborative spirit have significantly enriched this proposal.

Lastly, many thanks to our family and friends for their constant support, compassion and
inspiration during this process. We appreciate your confidence in our talents and your
consistent support, which have motivated everyone who has contributed to this project,
whether directly or indirectly. Your combined efforts have turned a concept into a
convincing proposition. We sincerely appreciate your commitment, knowledge and continuous
support.

ABSTRACT

Like many other developing countries, Kenya faces a significant youth unemployment problem,
and policymakers and stakeholders must understand its determinants to develop successful
strategies and interventions. The main objectives of this study are to develop a multiple
linear regression (MLR) model for the determinants of youth unemployment in Kenya, determine
the asymptotic properties of the multiple linear regression model, and use the developed
model to predict the unemployment rate among youths. The study will utilize a secondary
dataset from a reliable source, the Kenya National Bureau of Statistics (KNBS). The dataset
will include select regressors such as educational level, gender, population growth rate and
economic growth. The study will use a multiple linear regression model to determine the major
variables affecting unemployment among youths in Kenya. Statistical analysis software such as
the Statistical Package for the Social Sciences (SPSS) will be used to analyze the data after
cleaning. Both descriptive and inferential statistics will be employed. The MLR model will be
used to test the determinants of unemployment at a 5% significance level. The study is
envisaged to have a direct positive implication for the community, enabling individuals to
enhance their employability by identifying in-demand skills and qualifications. For academia,
this research will enrich knowledge in economics and the social sciences, inspiring further
investigations and fostering interest in related fields. The Kenyan government can utilize
the insights to address unemployment through targeted interventions, effective labor market
policies, training programs and evidence-based initiatives, thus promoting economic growth
and job creation.

TABLE OF CONTENTS

DECLARATION
DEDICATION
ACKNOWLEDGEMENT
ABSTRACT
LIST OF ABBREVIATIONS
CHAPTER ONE
1.0 INTRODUCTION
1.1 BACKGROUND INFORMATION
1.2 STATEMENT OF THE PROBLEM
1.3 OBJECTIVES
1.4 HYPOTHESIS
1.5 LIMITATIONS
1.6 SIGNIFICANCE
CHAPTER TWO
2.0 LITERATURE REVIEW
2.1 INTRODUCTION
2.2 Multiple linear regression model
2.3 Asymptotic Properties of the Multiple linear regression model
2.4 Prediction of the rate of unemployment
2.5 SUMMARY OF RESEARCH GAP
CHAPTER THREE
3.0 RESEARCH METHODOLOGY
3.1 Introduction
3.2 Study Area
3.3 Research Design
3.4 Data Collection
3.5 Sampling Size
3.6 Multiple Linear Regression Model
3.6.1 Assumptions of the MLR Model
3.6.2 Parameter Estimation
3.7 Asymptotic Properties of the MLR Model
3.7.1 Efficiency
3.7.2 Consistency
3.7.3 Normality
3.8 Data Analysis Technique
3.9 Ethical consideration
APPENDICES
Appendix 1: Time Schedule
Appendix 2: Proposed Budget
REFERENCES

LIST OF ABBREVIATIONS

ARDL..............................Autoregressive Distributed Lag
ARIMA.............................Autoregressive Integrated Moving Average
CLT...............................Central Limit Theorem
CRLB..............................Cramér-Rao Lower Bound
FDI...............................Foreign Direct Investment
GDP...............................Gross Domestic Product
ILO...............................International Labor Organization
KNBS..............................Kenya National Bureau of Statistics
Ln................................Natural Logarithm
MLE...............................Maximum Likelihood Estimator
MLR...............................Multiple Linear Regression
OLS...............................Ordinary Least Squares
PPP...............................Purchasing Power Parity
RE................................Relative Efficiency
SPSS..............................Statistical Package for the Social Sciences
VEC...............................Vector Error Correction

CHAPTER ONE

1.0 INTRODUCTION

Kahn (2022), in the opening pages of his book, describes unemployment as a state in which
people are willing and able to work at the prevailing wage rate but are unable to get jobs.
Unemployment remains high and continues to dominate news headlines and both international
and national development talks. Much of the discussion revolves around literate youth with
career prospects. This research seeks to model the determinants of youth unemployment and
predict the future of unemployment in Kenya. This chapter covers background information on
youth unemployment and its empirically established determinants and implications, as
described by Monari et al. (2020). It also covers the problem statement that informed the
decision to carry out the research, the literature review, theoretical framework, research
objectives, research questions, research methodology, and the justification of this research
to various players.

1.1 BACKGROUND INFORMATION

According to Njiru (2020), youth unemployment is a problem experienced practically
everywhere in the world. The Constitution of Kenya defines youth as all individuals in the
republic between the ages of 18 and 35 years. According to the 2019 Population and Census
figures cited by Annor et al. (2022), 75% of Kenya's 47.6 million people are under the age of
35 years. The youth population increased from 11,809,518 (28.7%) in the 2009 census to
13,777,600 (29.0%) in the 2019 census.

The unemployment rate, expressed as a percentage, is the total number of unemployed people
divided by the entire labor force in a state or region, multiplied by 100. The youth in Kenya
are approximately three times more likely to be jobless than older people. This is reflected
in the fact that the youthful population is increasing in developing countries while
employment growth is slow. A 2010 report by the International Labor Organization (ILO)
pointed out that in developing economies, where 90% of young people live, youth are more
vulnerable to unemployment and poverty. The ILO estimates that the 74 million young women and
men who are unemployed throughout the world represent roughly two-fifths of all unemployed
persons globally. A conference held by the ILO in 2012 stated that the world is facing a
worsening youth employment crisis, with young people three times more likely to be
unemployed than adults and almost 73 million youth worldwide looking for work. According to
Katindi, Sivi, and Njonjo (2019), the majority of Kenya's young people are unemployed,
underemployed, or underpaid. Kenya's Vision 2030 envisages transforming the country into a
newly industrializing, middle-income one that provides a high quality of life to all its
citizens by the year 2030. In 2003, the government formulated a five-year development
strategy geared towards achieving democracy and empowerment in addressing unemployment. Even
so, Kenya has not been able to solve the puzzle of unemployment that her economy has been
facing.

The labor and employment plan was formulated to focus on employment promotion, optimal
utilization of human resources and social protection, as stated by Ouko et al. (2022). All
these are efforts by the Government of Kenya to fight unemployment. Despite these efforts,
however, unemployment, especially among the youth, continues to persist. According to
Gachari and Korir (2020), unemployment among the youth has serious repercussions on
self-esteem, poverty eradication efforts, social stability and equity. The Government of
Kenya has no definite policy regarding youth unemployment. According to Kiiru and Barasa
(2020), the ministry responsible for culture, youth and sports deals directly with issues
challenging the youth, while the ministries responsible for public service, labor and human
resource development deal with issues concerning vocational training. This research
therefore focuses on predicting the rate of unemployment among youths in the years to come.

Developing an MLR model to analyze the determinants of youth unemployment can provide
valuable insights into the underlying factors contributing to this problem. Chicco et al.
(2021) define MLR as a statistical technique used to examine the relationship between a
dependent variable and multiple independent variables. Once the dataset is assembled,
statistical software will be employed to perform the MLR analysis. The analysis will enable
the identification of significant determinants and their respective impact on the prediction
of youth unemployment rates. Horel and Giesecke (2020) describe the significance of studying
asymptotic properties, that is, the behavior of a model's estimators as the sample size grows
large. For this model of unemployment among youths in Kenya, doing so will enable us to make
statistical inferences about the relationship between the relevant variables and assess the
precision of the estimates. The model will therefore be used to predict the rate of
unemployment among youths in Kenya by using historical data on relevant regressors such as
education level, population growth, and gender (the independent variables) to estimate their
relationship with the unemployment rate (the dependent variable), enabling future
predictions based on new data.
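
The fit-then-predict workflow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the proposal itself plans to use statistical software such as SPSS, and the function names (`solve_linear_system`, `fit_ols`, `predict`), the three regressors, the observations and the coefficients below are all hypothetical, not KNBS data or results. The sketch estimates the coefficients by solving the least-squares normal equations and then predicts the rate for a new observation.

```python
# Minimal OLS sketch for y = b0 + b1*x1 + ... + bk*xk.
# All data and coefficients are hypothetical and for illustration only.

def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] on copies, leaving the inputs intact.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: regressors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [b0, b1, ..., bk] solving the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, row):
    """Fitted value for one observation given [b0, b1, ..., bk]."""
    return coefs[0] + sum(b * x for b, x in zip(coefs[1:], row))


# Hypothetical observations: (years of education, population growth %, female share %).
rows = [(6, 3.5, 45), (8, 2.0, 60), (10, 3.0, 40),
        (12, 1.5, 55), (14, 2.5, 50), (9, 2.8, 58)]
ASSUMED_COEFS = [20.0, -1.5, 2.0, 0.1]            # hypothetical "true" model
y = [predict(ASSUMED_COEFS, r) for r in rows]     # noise-free, so the fit is checkable

coefs = fit_ols(rows, y)
forecast = predict(coefs, (10, 2.5, 50))
print("estimated coefficients:", [round(c, 4) for c in coefs])
print("predicted unemployment rate for the new observation:", round(forecast, 4))
```

Because the illustrative response is generated without noise, the estimated coefficients should match the assumed ones up to rounding error; with real data the estimates would carry sampling error, which is what the asymptotic analysis is meant to quantify.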

1.2 STATEMENT OF THE PROBLEM

Many studies have modeled the unemployment rate in this country. These studies focus mostly
on developing the model and on the analysis and trends of the regressors. However, they have
not given insights into how the regressors positively or negatively influence unemployment
among youths, nor reported the statistical significance of each variable. This proposal will
therefore ascertain the association between unemployment among the youth and the regressors.
The research will test whether education level and gender, as indicators of whether youths
are equipped with the required skills and expertise, explain why they take a long period
before being absorbed into the employment sector. Lastly, this study will also test whether
the regressors are losing their statistical significance for employability.

1.3 OBJECTIVES

This study will be guided by the following objectives:

1. To develop a multiple linear regression model for the determinants of youth unemployment
   in Kenya.
2. To determine the asymptotic properties of the multiple linear regression model.
3. To use the model to predict the rate of unemployment among youths.

1.4 HYPOTHESIS

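𝐻0: There is no relationship between unemployment in Kenya and the regressors, that is, all
slope coefficients of the MLR model are zero.

𝐻1: There is a relationship between unemployment in Kenya and the regressors, that is, at
least one slope coefficient of the MLR model is not zero.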

1.5 LIMITATIONS

The model will help policymakers, researchers, and stakeholders understand which factors
play a substantial role in shaping youth unemployment by quantifying the relationships
between the independent variables and youth unemployment.

1.6 SIGNIFICANCE

Understanding the determinants of unemployment can directly impact community members
by providing insights into the factors affecting their employment prospects. It can help job
seekers identify areas of improvement and develop skills or qualifications that are in demand
in the labor market. Additionally, the community can gain a better understanding of the
macroeconomic factors that influence unemployment rates, which can inform their decisions
such as career choices and entrepreneurial ventures.

For the academic community, this type of research enriches knowledge and contributes to the
field of economics and social sciences. Academic researchers can use the findings as a
foundation for further investigations, leading to more comprehensive theories and models
related to unemployment. Furthermore, research of this nature can inspire students to pursue
studies and careers in economics, policy analysis and related disciplines.

The government of Kenya can benefit significantly from the insights provided by the MLR
model of unemployment determinants. Such research can help policymakers identify the root
causes of unemployment and design targeted interventions to address the issue. It can inform
the creation of effective labor market policies, training programs, and educational initiatives
to reduce unemployment rates. By understanding the relationship between various variables
and unemployment, the government can implement evidence-based policies that foster
economic growth and job creation.

CHAPTER TWO

2.0 LITERATURE REVIEW

2.1 INTRODUCTION

This chapter will delve into the existing body of knowledge and research relevant to our
project proposal. The literature review is a crucial section that will allow the study to
understand the current state of knowledge about youth unemployment in Kenya and the
determinants influencing this issue. By reviewing existing studies, reports and academic
works, the study aims to build upon the existing knowledge and identify gaps that our study
can address.

2.2 MULTIPLE LINEAR REGRESSION MODEL

A study by Sam (2016) analyzes the economic determinants of youth unemployment in Kenya
using macroeconomic data from 1979 to 2012 by investigating the empirical relationship among
youth unemployment, Gross Domestic Product (GDP), population, Foreign Direct Investment
(FDI) and external debt. The study used the time series Autoregressive Distributed Lag
(ARDL) model to test the long-run effects of the economic determinants of youth
unemployment. At a 5% significance level, the empirical results indicated that a unit
increase in population is associated with a 1.1% change in youth unemployment; a unit
increase in FDI reduces youth unemployment by 0.00024%; and a unit increase in the previous
youth unemployment rate reduces the current unemployment rate by 0.12%. In contrast, a 1%
increase in gross domestic product increased the youth unemployment rate by 0.00559%. The
study revealed that population growth, FDI, GDP and external debt have a long-run
relationship with the youth unemployment rate. However, the study did not reveal insights
into the individual effect of each variable (GDP, population, FDI and external debt) on
youth unemployment. Therefore, this study will reveal insights into the effects of each
variable on youth unemployment.

A study by Popirlan et al. (2021) proposed a hybrid model for the impact of unemployment on
social life. Using datasets from six European countries, the study analyzed the effect of
unemployment on two of the main aspects of social life: social exclusion and life
satisfaction. The study predicted unemployment rates using an Autoregressive Integrated
Moving Average (ARIMA) model, and the results were then used in a linear regression model
alongside social exclusion and life satisfaction data, thus obtaining the hybrid model. With
the help of the point prediction method, the research used the hybrid model to predict new
values for the two aspects of social life for the upcoming three years and analyzed the
results to better understand their interconnection. The research suggested that unemployment
has particularly adverse effects on the subjective perception of life satisfaction and,
furthermore, increases the social exclusion percentage. The study focused more on coming up
with a prediction model. This study will therefore delve into presenting the effects and
relationships of each aspect concerning unemployment in Kenya.

Research by Dumicic et al. (2015) on the recent impacts of selected development indicators
on the unemployment rate investigated the relationship between the unemployment rate and two
development indicators: GDP per capita in Purchasing Power Parity (PPP) terms and the
Internet penetration rate, defined as the number of Internet users per 100 people. Data from
34 countries were analyzed using multivariate analysis. MLR modeling, first with the original
data and afterward with their logarithms, was developed to discover whether the unemployment
rate was influenced by GDP per capita in PPP (current international dollars) and by the
Internet penetration rate. Two simple linear regression models, estimated by Ordinary Least
Squares (OLS) on logarithmically transformed data for all variables, appeared to be useful
and statistically significant. The research reported that the simple linear regression model
showed a negative correlation with the regressor lnX (GDP) and the main variable under study
lnY (internet penetration). The study by Dumicic et al. (2015) focused on finding the best
model rather than on the findings and insights into the impact of each regressor. This
study, in contrast, will delve into analyzing each variable and its significant effects on
youth unemployment.
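
To make the log-transformed simple regression described above concrete, the following is a minimal sketch rather than a reproduction of the cited study. It assumes hypothetical values generated from an invented power-law relationship between GDP per capita and the unemployment rate; the helper `fit_simple_ols` and every number are illustrative. In a regression of ln Y on ln X, the slope can be read as an elasticity: the approximate percentage change in Y for a 1% change in X.

```python
import math


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data, generated from unemployment = 400 * gdp^(-0.4).
gdp_per_capita = [2000.0, 5000.0, 10000.0, 20000.0, 40000.0]
unemployment = [400.0 * g ** -0.4 for g in gdp_per_capita]

# Regress ln(unemployment) on ln(GDP per capita).
ln_x = [math.log(g) for g in gdp_per_capita]
ln_y = [math.log(u) for u in unemployment]
intercept, elasticity = fit_simple_ols(ln_x, ln_y)
print(f"ln Y = {intercept:.3f} + ({elasticity:.3f}) ln X")
```

A negative slope in this specification would indicate that higher GDP per capita is associated with a lower unemployment rate, which is the kind of per-regressor reading this proposal intends to report.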

A study by Monari et al. (2020) on modeling the economic determinants of youth unemployment
in Kenya used a multiple linear regression model, tested at a 5% significance level, to
analyze the economic determinants of youth unemployment from 2000 to 2017 by investigating
the empirical relationship among youth unemployment, GDP, population and FDI. The study used
descriptive and inferential statistics to analyze the data. It revealed that a unit change
in GDP, holding the other factors constant, would lead to an increase in youth unemployment
by a factor of 0.027, and a unit change in FDI, holding the other factors constant, would
lead to an increase in youth unemployment by a factor of 0.034. On the other hand, a unit
change in population growth would lead to a change in youth unemployment by a factor of
-1.543. The study recommended that both county and national governments consider policies
that encourage FDI, increase GDP through value addition and use external debt resources for
investment. However, the study did not predict whether the determinants will have any impact
on the youth unemployment rate in the future. The current study will therefore delve into
analyzing each variable and its significant effect on youth unemployment.
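
The "holding the other factors constant" reading of the coefficients reported above can be illustrated with a short sketch. It assumes a linear model in which the three reported values (0.027, 0.034 and -1.543) are the slope coefficients; the units of each regressor are not stated in the text, so the function `predicted_change` and the one-unit changes used here are purely illustrative.

```python
# Slope coefficients as reported in the text for Monari et al. (2020).
# Units of the regressors are not stated there; this is illustration only.
REPORTED_COEFFICIENTS = {
    "gdp": 0.027,
    "fdi": 0.034,
    "population_growth": -1.543,
}


def predicted_change(deltas):
    """Change in fitted youth unemployment for the given regressor changes.

    Regressors not listed in `deltas` are held constant (a change of zero).
    """
    return sum(REPORTED_COEFFICIENTS[name] * delta for name, delta in deltas.items())


# One-unit change in GDP alone, everything else held constant.
gdp_only = predicted_change({"gdp": 1.0})

# Simultaneous one-unit changes in all three regressors.
combined = predicted_change({"gdp": 1.0, "fdi": 1.0, "population_growth": 1.0})

print(f"GDP only: {gdp_only:+.3f}")
print(f"All three regressors: {combined:+.3f}")
```

In a linear model the effects add up, so the combined change is simply the sum of the individual coefficient contributions.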

2.3 ASYMPTOTIC PROPERTIES OF THE MULTIPLE LINEAR REGRESSION MODEL

Muriithi and Kimani (2022) used a multiple linear regression model to analyze the
determinants of youth unemployment in Kenya, using data from the 2019 Kenya National Bureau
of Statistics (KNBS) Youth Labor Force Survey. The study found that the significant
determinants of youth unemployment in Kenya were educational attainment, gender, region and
marital status. The unemployment rate of youth with no education was more than twice that of
youth with post-secondary education, the rate for females was higher than for males, the
rate was higher in rural areas than in urban areas, and the rate was higher for youth who
were not married. The study found that the model can explain a significant proportion of the
variation in youth unemployment in Kenya and concluded that the MLR model is a reliable tool
for estimating the determinants of youth unemployment. It also concluded that the government
of Kenya needs to invest in education, skills development and access to credit to reduce
youth unemployment. However, the study does not consider the endogeneity of some of the
determinants. For example, it found that education is a significant determinant of youth
unemployment, yet youth unemployment may itself lead to lower levels of education, so the
relationship between education and youth unemployment may not be as straightforward as it
appears. The study also focused on theoretical findings rather than on presenting the
evidence. Therefore, this study will give both theoretical and numerical evidence by
reporting the affected figures.

In a study by Mwangi and Kamau (2021), the asymptotic properties of multiple linear
regression models on determinants of youth unemployment in Kenya were investigated. The
study used data from the 2019 Kenya National Bureau of Statistics (KNBS) Youth Labor Force
Survey. The dependent variable in the model was the unemployment rate of youth aged 15-34
years. The independent variables were educational attainment, gender, region, marital status
and household size. The study found that the least squares estimator is consistent and
asymptotically normal. This means that, as the sample size grows, the estimator converges to
the true parameter vector and its distribution approaches a normal distribution with a mean
equal to the true parameter vector and variance equal to the inverse of the Fisher
information matrix. The study also found that the estimator is efficient, which means that
it has the minimum variance among all unbiased estimators of the true parameter vector. The
study further found that the model can explain a significant proportion of the variation in
youth unemployment in Kenya, which means that the model captures the most important factors
associated with youth unemployment in Kenya. However, the study did not consider the impact
of gender on youth unemployment. Gender discrimination is a major barrier to youth
employment in Kenya. This study will consider gender in youth employment.
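
Consistency, as discussed above, can be illustrated by simulation. The sketch below is an assumption-laden toy example, not the cited study's method: it uses a single hypothetical regressor, an invented true slope of -0.5, normally distributed errors and a fixed random seed. It repeatedly draws samples of two different sizes, estimates the OLS slope each time, and compares how widely the estimates scatter around the true value.

```python
import random
import statistics


def ols_slope(x, y):
    """Least-squares slope of y on x for a single regressor."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def simulate_slopes(n, replications, true_slope=-0.5, noise_sd=2.0, seed=0):
    """Estimate the slope on `replications` simulated samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    slopes = []
    for _ in range(replications):
        x = [rng.uniform(0.0, 10.0) for _ in range(n)]
        y = [12.0 + true_slope * xi + rng.gauss(0.0, noise_sd) for xi in x]
        slopes.append(ols_slope(x, y))
    return slopes


small = simulate_slopes(n=20, replications=200)
large = simulate_slopes(n=1000, replications=200)

spread_small = statistics.stdev(small)
spread_large = statistics.stdev(large)
print(f"n=20:   mean slope {statistics.mean(small):.3f}, spread {spread_small:.3f}")
print(f"n=1000: mean slope {statistics.mean(large):.3f}, spread {spread_large:.3f}")
```

Under these assumptions the estimates from the larger samples should cluster much more tightly around the true slope, which is the practical meaning of consistency; a histogram of the estimates would also look increasingly bell-shaped, in line with asymptotic normality.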

A study by Onyango and Anyango (2020) used a multiple linear regression model to analyze the
determinants of youth unemployment in Kenya. The study used data from the 2017 Kenya
National Bureau of Statistics (KNBS) Labor Force Survey. The dependent variable in the model
was the unemployment rate of youth aged 15-34 years. The independent variables were
educational attainment, gender, region, marital status and household size. The study found
that the significant determinants of youth unemployment in Kenya were educational
attainment, gender, region and marital status. The unemployment rate of youth with no
education was more than twice that of youth with post-secondary education, the rate for
females was higher than for males, the rate was higher in rural areas than in urban areas,
and the rate was higher for youth who were not married. The study did not control for other
factors that may affect youth unemployment, such as economic growth. This study proposes to
include economic growth as a regressor to determine its effect on youth unemployment.

2.4 PREDICTION OF THE RATE OF UNEMPLOYMENT

A study by Kariuki and Ndirangu (2015) used a panel data analysis to investigate the
determinants of youth unemployment in Kenya. The study found that the level of education,
quality of education, availability of jobs, skills mismatch and economic environment are all
important determinants of youth unemployment in Kenya. The study also suggested that
policies that aim to reduce youth unemployment should focus on improving the level and
quality of education, increasing the availability of jobs and reducing the skills mismatch.
Although the study identified these determinants of the unemployment rate, it did
not predict how population growth affects the unemployment rate among youths. Thus, this
study proposes to predict how population growth affects the rate of unemployment among
youths in Kenya using the MLR model.

Njuguna and Kimuyu (2017) used several statistical models, including the autoregressive
distributed lag (ARDL) model and the vector error correction (VEC) model, to predict youth
unemployment in Kenya. The study found that statistical models can be a valuable tool for
predicting youth unemployment, and its findings could help policymakers to develop more
effective policies to reduce it. This study proposes using a multiple linear regression
model to predict the youth unemployment rate in Kenya.

A study by Njoroge and Owino (2012) used a logistic regression model to investigate the
determinants of youth unemployment in Kenya. The study found that the level of education,
quality of education, region, gender and marital status are all important determinants of youth
unemployment in Kenya. The study also suggested that policies that aim to reduce youth
unemployment should focus on improving the level and quality of education, increasing the
availability of jobs in rural areas and reducing gender discrimination in the labor market.
However, the study did not predict how population growth and economic growth influence
the rate of unemployment. Thus, this study will use the MLR model to predict the rate of
unemployment from its determinants.

2.5 SUMMARY OF RESEARCH GAP

From the above literature review, this study has established that the impacts of
unemployment among the youths in Kenya and its implications for development have not
been adequately addressed by other scholars. It is also evident that the effectiveness
of the measures applied by the government of Kenya to combat youth unemployment has not
been adequately examined. This study therefore focuses on these gaps. It seeks to reveal
the effect of each variable on youth unemployment, to examine key challenges in addressing
youth unemployment in Kenya, and to inform policy and academic work aimed at curbing
unemployment.

CHAPTER THREE

3.0 RESEARCH METHODOLOGY

3.1 INTRODUCTION

This chapter presents the research design, study area, source of data and its description,
sample size, study variables, data collection techniques and statistical models that will be
applied to data analysis.

3.2 STUDY AREA

The study focuses on the unemployment rate among youths in Kenya and will use data on
potential determinants such as education level, gender, economic growth and population
growth.

3.3 RESEARCH DESIGN

A cross-sectional research design will be applied in this study because the data describe a
sample of individuals at a single point in time. This design makes it possible to describe
the characteristics and prevalence of the variables and to examine associations and
correlations between them. The study will also employ descriptive and inferential statistics
to summarize the data and draw conclusions about the population from the sample.

3.4 DATA COLLECTION

Secondary data on youth unemployment will be obtained from the database of the Kenya
National Bureau of Statistics (KNBS), specifically the 2019 demographic and economic
survey. The data will be used to explore employment rates among youths across various
levels of education.

3.5 SAMPLE SIZE

The sample size will be determined using the Yamane formula:

n = \frac{N}{1 + N e^{2}}

where

n = the sample size

N = the population of the study

e = the margin of error (0.05)

Substituting N = 13,777,600 and e = 0.05 gives

n = \frac{13777600}{1 + 13777600 (0.05)^{2}} = \frac{13777600}{34445} \approx 400
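
The calculation above can be checked with a short script. This is a minimal sketch, not part of the proposal's SPSS workflow; the function name `yamane_sample_size` is hypothetical, and rounding up to the next whole respondent is an assumption.

```python
import math


def yamane_sample_size(population: int, margin_of_error: float) -> int:
    """Return the Yamane sample size n = N / (1 + N * e^2), rounded up."""
    if population <= 0:
        raise ValueError("population must be positive")
    if not 0 < margin_of_error < 1:
        raise ValueError("margin_of_error must be between 0 and 1")
    return math.ceil(population / (1 + population * margin_of_error ** 2))


# Values taken from the proposal: N = 13,777,600 and e = 0.05.
sample_size = yamane_sample_size(13_777_600, 0.05)
print(sample_size)
```

Because N is very large, the result is driven almost entirely by the margin of error: 1 / e^2 = 400 for e = 0.05.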

3.6 MULTIPLE LINEAR REGRESSION MODEL

Multiple linear regression (MLR) is a statistical modeling technique used to analyze the
relationship between a dependent variable Y and two or more independent variables
𝑥1, 𝑥2, …, 𝑥𝑛. It extends simple linear regression, which involves only one independent
variable, to a scenario where multiple predictors are considered simultaneously
(Dimitriadou and Nikolakopoulos, 2022). Estimating the impact of each independent variable
on the dependent variable aids prediction, inference and decision-making. This study will
also assess the model assumptions and perform model diagnostics to ensure an accurate and
reliable analysis, as in Meerasri and Sothornvit (2022).

According to Ciulla and D’Amico (2019), the MLR model equation is represented as:

𝑌 = 𝛽0 + 𝛽1 (𝑥1) + 𝛽2 (𝑥2) + 𝛽3 (𝑥3) + 𝛽4 (𝑥4) + ⋯ + 𝛽𝑛 (𝑥𝑛) + ε

Where

Y= Unemployment in Kenya

𝛽0 is the intercept (constant) to be estimated by the model. It is the expected value of Y
when all independent variables are set to zero.

𝛽1, 𝛽2, 𝛽3 and 𝛽4 are coefficients that indicate the change in Y associated with a one-unit
change in each respective independent variable, holding other variables constant.

𝑥1 = growth rate

𝑥2 = education level

𝑥3 = Gender

𝑥4 = skills

ε = the error term, representing the variation in Y that is not accounted for by the
independent variables.
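
To illustrate how the coefficients in this equation are obtained from data, the sketch below fits the four-predictor model by ordinary least squares. It is only a sketch under stated assumptions: the data are synthetic, the variable codings and the "true" coefficients in `beta_true` are invented for illustration, and the study itself proposes to use SPSS on KNBS data.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n_obs = 400  # matches the sample size proposed in Section 3.5

# Synthetic predictors standing in for x1..x4 (assumed scales, not KNBS data).
growth_rate = rng.normal(5.0, 1.5, n_obs)   # x1: growth rate
education = rng.integers(0, 4, n_obs)       # x2: education level coded 0-3
gender = rng.integers(0, 2, n_obs)          # x3: gender coded 0/1
skills = rng.normal(50.0, 10.0, n_obs)      # x4: skills score

# Hypothetical "true" coefficients used only to generate the toy response.
beta_true = np.array([30.0, -1.2, -2.0, 1.5, -0.1])

# Design matrix with a leading column of ones for the intercept beta_0.
X = np.column_stack([np.ones(n_obs), growth_rate, education, gender, skills])
y = X @ beta_true + rng.normal(0.0, 1.0, n_obs)

# Ordinary least squares estimate of (beta_0, ..., beta_4).
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta_hat, 2))
```

Each entry of `beta_hat` is read as in the definitions above: the estimated change in Y for a one-unit change in that predictor, holding the others constant.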

3.6.1 ASSUMPTIONS OF THE MLR MODEL

The linear regression model relies on several key assumptions that must be checked while
developing the model, as described in Alita et al. (2021):

1. Linearity: There is a linear relationship between the independent and dependent
variables.
2. Independence: The observations are independent of each other.
3. Homoscedasticity: The variability of the error term is constant across all levels of the
independent variables.
4. Normality: The error term follows a normal distribution with a mean of zero.

3.6.2 PARAMETER ESTIMATION

The coefficients 𝛽1, 𝛽2, 𝛽3 and 𝛽4 that best fit the observed data in the MLR model will be
estimated using maximum likelihood estimation (MLE). According to Myung (2003), MLE finds
the parameter values of a statistical model that maximize the likelihood of observing the
given data; that is, it identifies the most likely values of the unknown parameters given
the available data.
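
As a worked illustration of MLE for this model, the derivation below assumes independent normal errors with constant variance, in line with the assumptions in Section 3.6.1. The matrix notation is introduced here for convenience and is not from the proposal: y is the vector of observed responses, X is the design matrix with a leading column of ones, and n denotes the number of observations.

```latex
% Assumption: \varepsilon_i \sim N(0, \sigma^2) independently, i = 1, \dots, n.
\ell(\beta, \sigma^2)
  = -\frac{n}{2}\ln(2\pi\sigma^2)
    - \frac{1}{2\sigma^2}\,(y - X\beta)^{\top}(y - X\beta)

\frac{\partial \ell}{\partial \beta}
  = \frac{1}{\sigma^2}\, X^{\top}(y - X\beta) = 0
  \;\Longrightarrow\;
  \hat{\beta} = (X^{\top}X)^{-1} X^{\top} y

\frac{\partial \ell}{\partial \sigma^2} = 0
  \;\Longrightarrow\;
  \hat{\sigma}^2 = \frac{1}{n}\,(y - X\hat{\beta})^{\top}(y - X\hat{\beta})
```

Under these assumptions the maximum likelihood estimate of the coefficients coincides with the ordinary least squares estimate, while the variance estimate divides by n rather than by the residual degrees of freedom.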

Different metrics can be used to assess the effectiveness and performance of the linear
regression model, including R-squared (proportion of variance explained), adjusted
R-squared (penalized for model complexity), mean squared error (average squared difference
between predicted and actual values) and significance tests for individual coefficients, as
explained in Jenkins and Quintana-Ascencio (2020).
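
The first three metrics can be computed directly from observed and fitted values. The following is a minimal sketch: the helper name `regression_metrics` is hypothetical and the numbers are toy values chosen for illustration, not survey data.

```python
import numpy as np


def regression_metrics(y_true, y_pred, n_predictors):
    """Return R-squared, adjusted R-squared and MSE for fitted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n_obs = y_true.size
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    # The adjustment penalizes R-squared for the number of predictors used.
    adj_r2 = 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)
    mse = ss_res / n_obs
    return r2, adj_r2, mse


# Toy values chosen for illustration only.
y_obs = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
y_fit = [10.5, 11.5, 14.5, 15.5, 18.5, 19.5]
r2, adj_r2, mse = regression_metrics(y_obs, y_fit, n_predictors=2)
print(round(r2, 4), round(adj_r2, 4), round(mse, 4))
```

Adjusted R-squared never exceeds R-squared for the same fit, which is why it is the safer figure to compare across models with different numbers of predictors.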

3.7 ASYMPTOTIC PROPERTIES OF THE MLR MODEL

According to Tarima and Flournoy (2019), asymptotic properties refer to the behavior of a
statistical estimator or a mathematical function as the sample size, or some other relevant
parameter, approaches infinity. This study explores the consistency, efficiency and
asymptotic normality of the estimated coefficients. It also discusses the asymptotic
behavior of variances, statistical tests and confidence intervals in the context of linear
regression, which will support hypothesis testing and the construction of reliable
confidence intervals in the data analysis. This study will focus on the following
properties.

3.7.1 EFFICIENCY

Efficiency refers to how well an estimator makes use of the available data to determine the
true value of a parameter (Shen et al., 2023). When the sample size is large, an efficient
estimator gives estimates that are, on average, more accurate than those of other
estimators. Efficiency is often evaluated using the asymptotic variance, which quantifies
the variability of an estimator's estimates as the sample size grows. A smaller asymptotic
variance indicates greater precision and a more efficient estimator. Efficiency is not
calculated as a direct ratio; it is often discussed in relation to the Cramér-Rao Lower
Bound (CRLB), which provides a lower limit on the variance of unbiased estimators. In
practice, two estimators can be compared informally through their relative efficiency. This
study proposes to use relative efficiency (RE) to determine efficiency.

The RE of estimator A relative to estimator B is given by

RE(A, B) = \frac{\mathrm{Var}(B)}{\mathrm{Var}(A)}

where Var(B) is the variance of the standard estimator B and Var(A) is the variance of the
estimator A to be assessed.

A relative efficiency close to 1 indicates that Estimator A is nearly as efficient as
Estimator B, and there is minimal difference in their variances. A relative efficiency
significantly less than 1 implies that Estimator A is less efficient than Estimator B.
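
A small simulation can show how relative efficiency behaves in practice. The sketch below is illustrative only and does not use the study's estimators: it assumes normally distributed data and compares the sample median (Estimator A, to be assessed) with the sample mean (Estimator B, the standard).

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n_obs, n_reps = 400, 5000

# Assumed setting: normally distributed data with true mean 10 and sd 2.
samples = rng.normal(loc=10.0, scale=2.0, size=(n_reps, n_obs))

mean_estimates = samples.mean(axis=1)          # estimator B (standard)
median_estimates = np.median(samples, axis=1)  # estimator A (to be assessed)

# RE(A, B) = Var(B) / Var(A), following the definition in the text.
relative_efficiency = mean_estimates.var(ddof=1) / median_estimates.var(ddof=1)
print(round(relative_efficiency, 3))
```

For normal data the large-sample value of this ratio is 2/π (about 0.64), so the simulated RE should fall clearly below 1, indicating that the median is less efficient than the mean in this setting.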

3.7.2 CONSISTENCY

Consistency means that as more data are collected, the estimates provided by the estimator
get closer and closer to the true value of the parameter being estimated. The estimator
becomes more reliable and accurate as the sample size increases, as described by Chamidah
et al. (2022).

An estimator, denoted as Ȳ, is consistent for a parameter θ if, for any positive value ɛ, the
probability that the absolute difference between the estimator and the true value of the
parameter is larger than ɛ goes to zero as the sample size, denoted by n, tends to infinity.

Mélard (2022) expressed consistency mathematically as:

\lim_{n \to \infty} P(|\bar{Y} - \theta| > \varepsilon) = 0

This means that the probability that the difference between the estimator and the true
parameter value is larger than a small value ε approaches zero as the sample size (n)
becomes very large.

This study will use SPSS version 27 to perform calculations and simulations, generating
samples of increasing size to analyze the behavior of the estimator.
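
The kind of simulation described here can be sketched as follows. The study proposes SPSS, so this Python version is only an illustration with assumed settings: a simple regression with a hypothetical true slope of 2, standard normal errors and ε = 0.1. It estimates the probability in the consistency definition for several sample sizes.

```python
import numpy as np

rng = np.random.default_rng(seed=2)
true_slope, epsilon, n_reps = 2.0, 0.1, 1000


def exceed_probability(n_obs: int) -> float:
    """Estimate P(|slope_hat - true_slope| > epsilon) for a given sample size."""
    x = rng.normal(size=(n_reps, n_obs))
    y = 1.0 + true_slope * x + rng.normal(size=(n_reps, n_obs))
    x_centered = x - x.mean(axis=1, keepdims=True)
    y_centered = y - y.mean(axis=1, keepdims=True)
    # Closed-form OLS slope for simple regression, computed per replication.
    slope_hat = (x_centered * y_centered).sum(axis=1) / (x_centered ** 2).sum(axis=1)
    return float(np.mean(np.abs(slope_hat - true_slope) > epsilon))


probabilities = {n: exceed_probability(n) for n in (25, 100, 400, 1600)}
for n, p in probabilities.items():
    print(n, round(p, 3))
```

The estimated probability should shrink toward zero as the sample size grows, which is the behavior the definition of consistency describes.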

3.7.3 NORMALITY

Mohammedi et al. (2021) describe asymptotic normality in terms of the behavior of a
statistic or an estimator as the sample size tends to infinity. In this model, assumptions
are made regarding the distribution of the data or the sampling distribution of the
estimators. The normal distribution is commonly assumed because of its mathematical
properties and widespread applicability. When the sample size is large, the central limit
theorem (CLT) implies that the sampling distribution of a sum or average of independent
random variables with finite variance approaches a normal distribution, regardless of the
underlying distribution of the individual variables.

del Barrio et al. (2022) state the central limit theorem as follows: under certain
conditions, the sum or average of a large number of independent and identically distributed
random variables has an approximately normal distribution. In this study, the assumption of
normality allows for the application of various statistical inference techniques. In
particular, maximum likelihood estimators are known to be asymptotically normal, which is
valuable because it allows confidence intervals, hypothesis tests and other inferential
procedures to be based on the normal distribution. The normal distribution formula
according to Hart (1957) is given as

f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^{2}}

Where

𝑥 is the random variable

µ is the mean of the distribution

σ² is the variance of the distribution

This study will use histograms to assess normality.
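
A histogram-based check of this kind can be sketched as below. This is an illustration under assumptions rather than the study's procedure: the residuals are simulated stand-ins, the helper `normal_pdf` is a hypothetical name that implements the density formula above, and the choice of 30 bins is arbitrary.

```python
import math

import numpy as np


def normal_pdf(x, mu, sigma):
    """Normal density f(x) with mean mu and standard deviation sigma."""
    z = (np.asarray(x, dtype=float) - mu) / sigma
    return np.exp(-0.5 * z ** 2) / (sigma * math.sqrt(2.0 * math.pi))


rng = np.random.default_rng(seed=3)
# Stand-in for regression residuals (simulated, not from the KNBS data).
residuals = rng.normal(loc=0.0, scale=1.5, size=5000)

# Density-scaled histogram: bar heights are comparable to f(x).
heights, edges = np.histogram(residuals, bins=30, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
fitted = normal_pdf(centers, residuals.mean(), residuals.std(ddof=1))

# Largest gap between the histogram and the fitted normal curve.
max_gap = float(np.max(np.abs(heights - fitted)))
print(round(max_gap, 3))
```

A small gap between the bar heights and the fitted curve is consistent with normality; a histogram is a visual aid, so it is usually read alongside a formal test rather than in place of one.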

3.8 DATA ANALYSIS TECHNIQUE

Secondary data will be coded, entered, and analyzed using SPSS software.

3.9 ETHICAL CONSIDERATION

The research will prioritize the protection of participants’ privacy and confidentiality with all
data collected being anonymized to prevent individual identification. Prior approval from an
Institutional Review Board (IRB) will be sought to ensure compliance with ethical standards
and to safeguard the rights and well-being of the youths involved in the study.

APPENDICES

APPENDIX 1: TIME SCHEDULE

ACTIVITY JUN JUL AUG SEP OCT NOV DEC

CONCEPT PAPER

PROPOSAL DEVELOPMENT

PROPOSAL DEFENCE

INSTRUMENTATION

PILOT RESEARCH

DATA COLLECTION

DATA ANALYSIS

REPORT WRITING

FINAL DEFENCE

PUBLICATION

APPENDIX 2: PROPOSED BUDGET

ITEM UNIT COST QUANTITY TOTAL COST

INTERNET 1500 10GB 15000

PRINTING 1000 5 5000

TRANSPORT 300 5 1500

BINDING 1000 5 5000

PUBLICATION 50000 1 50000

FACILITATION FEE 10000 2 20000

SOFTWARE PURCHASE 20000 1 20000

LAPTOP 40000 1 40000

MISCELLANEOUS 10000 1 10000

TOTAL 166500

REFERENCES

Sam, S. O. (2016). Modeling economic determinants of youth unemployment in Kenya.
Journal of Emerging Trends in Economics and Management Sciences, 7(1), 31-38.

Popirlan, C. I., Tudor, I. V., Dinu, C. C., Stoian, G., Popirlan, C., & Dănciulescu, D.
(2021). A hybrid model for unemployment impact on social life. Mathematics, 9(18), 2278.

Alita, D., Putra, A. D., & Darwis, D. (2021). Analysis of classic assumption test and multiple
linear regression coefficient test for employee structural office recommendation. IJCCS
(Indonesian Journal of Computing and Cybernetics Systems), 15(3), 295-306.

Annor, F. B., Chiang, L. F., Oluoch, P. R., Mang'oli, V., Mogaka, M., Mwangi, M., ... & Mercy,
J. A. (2022). Changes in prevalence of violence and risk factors for violence and HIV
among children and young people in Kenya: a comparison of the 2010 and 2019 Kenya
Violence Against Children and Youth Surveys. The Lancet Global Health, 10(1), e124-e133.

Chamidah, N., Lestari, B., Budiantara, I. N., Saifudin, T., Rulaningtyas, R., Aryati, A., ... &
Aydin, D. (2022). Consistency and asymptotic normality of estimator for parameters in
multiresponse multipredictor semiparametric regression model. Symmetry, 14(2), 336.

Chicco, D., Warrens, M. J., & Jurman, G. (2021). The coefficient of determination R-squared
is more informative than SMAPE, MAE, MAPE, MSE, and RMSE in regression analysis
evaluation. PeerJ Computer Science, 7, e623.

Ciulla, G., & D'Amico, A. (2019). Building energy performance forecasting: A multiple
linear regression approach. Applied Energy, 253, 113500.

del Barrio, E., Gonzalez-Sanz, A., Loubes, J. M., & Niles-Weed, J. (2022). An improved
central limit theorem and fast convergence rates for entropic transportation costs. arXiv
preprint arXiv:2204.09105.

Dimitriadou, S., & Nikolakopoulos, K. G. (2022). Multiple linear regression models with
limited data for the prediction of reference evapotranspiration of the Peloponnese,
Greece. Hydrology, 9(7), 124.

Gachari, J. M., & Korir, J. K. (2020). Effect of fiscal policy on unemployment in Kenya.
Journal of Economics and Finance (IOSR-JEF), 11(1), 19-31.

Hart, R. G. (1957). A formula for the approximation of definite integrals of the normal
distribution function. Mathematical Tables and Other Aids to Computation, 11(60), 265.

Horel, E., & Giesecke, K. (2020). Significance tests for neural networks. The Journal of
Machine Learning Research, 21(1), 9291-9319.

International Labor Organization (ILO). (2019). Global Employment Trends for Youth 2019:
Towards decent work for all young people. Geneva, Switzerland: ILO.

Jenkins, D. G., & Quintana-Ascencio, P. F. (2020). A solution to minimum sample size for
regressions. PLoS ONE, 15(2), e0229345.

Kahn, R. F. (2022). Unemployment as seen by the Keynesians. In Richard F. Kahn: Collected
Economic Essays (pp. 225-239). Cham: Springer International Publishing.

Kariuki, J., & Ndirangu, M. (2015). Determinants of youth unemployment in Kenya: A panel
data analysis. Journal of Economics and Sustainable Development, 6(12), 129-140.

Kenya National Bureau of Statistics (KNBS). (2018). Labor Force Survey 2018. Nairobi,
Kenya: KNBS.

Kien, A., & Kariuki, S. (2018). The youth unemployment crisis in Kenya: Causes and options.
KEEI Working Paper 2018-04.

Kiiru, J. M., & Barasa, L. N. (2020). Securing Inclusive Growth: Mentorship and Youth
Employment in Kenya. Africa and the Sustainable Development Goals, 145-154.

Koskei, J. C., & Otinga, H. N. (2021). Influence of Internal Audit Standards on Financial
Sustainability in County Governments; A Case of Kericho County Government, Kenya.
IOSR Journal of Economics and Finance, 12(4-3), 49-56.

Maina, W. K., Ndegwa, Z. M., Njenga, E. W., & Muchemi, E. W. (2010). Knowledge,
attitude, and practices related to diabetes among community members in four provinces
in Kenya: a cross-sectional study. Pan African Medical Journal, 7(1).

Maul, D. (2020). The International Labor Organization (p. 301). de Gruyter.

Mbogo, M., & Ouma, B. (2019). The determinants of youth unemployment in Kenya: A
systematic review. Journal of Economics and Sustainable Development, 10(2), 160-176.

Meerasri, J., & Sothornvit, R. (2022). Artificial neural networks (ANNs) and multiple linear
regression (MLR) for prediction of moisture content for coated pineapple cubes. Case
Studies in Thermal Engineering, 33, 101942.

Mélard, G. (2022). An indirect proof for the asymptotic properties of VARMA model
estimators. Econometrics and Statistics, 21, 96-111.

Mohammedi, M., Bouzebda, S., & Laksaci, A. (2021). The consistency and asymptotic
normality of the kernel type expectile regression estimator for functional data. Journal
of Multivariate Analysis, 181, 104673.

Monari, F. N., Tinega, R., & Agasa, L. O. (2020). Modeling economic determinants of youth
unemployment using multiple linear regression: A case study of Kenya. International
Journal of Statistics and Applied Mathematics.

Muriithi, J. M., & Kimani, J. M. (2022). Determinants of youth unemployment in Kenya: A
multiple linear regression analysis. International Journal of Social Economics, 49(1),
115-130.

Mwangi, J. N., & Kamau, J. M. (2021). Asymptotic properties of multiple linear regression
model on determinants of youth unemployment in Kenya. International Journal of
Economics and Financial Issues, 11(4), 133-141.

Myung, I. J. (2003). Tutorial on maximum likelihood estimation. Journal of Mathematical
Psychology, 47(1), 90-100.

Njiru, G. (2020). Implementing Article 43 (1)(c) of the constitution; right to food in
Kenya (Doctoral dissertation, UoN).

Njoroge, C., & Owino, M. (2012). Determinants of youth unemployment in Kenya: A logistic
regression analysis. Journal of Economics and Sustainable Development, 3(11), 135-148.

Njuguna-Mungai, E., Sivi-Njonjo, K., & Nchanji, E. B. (2019). Gendered youth transitions to
adulthood in the Drylands: Implications for targeting.

Onyango, J. M., & Anyango, J. M. (2020). Determinants of youth unemployment in Kenya:
A multiple linear regression analysis. Journal of Economics and Sustainable
Development, 11(15), 17-27.

Ouko, K. O., Ogola, J. R. O., Ng’on’ga, C. A., & Wairimu, J. R. (2022). Youth involvement
in agripreneurship as Nexus for poverty reduction and rural employment in Kenya.
Cogent Social Sciences, 8(1), 2078527.

Dumicic, K., Bucevska, V., & Resic, E. (2015). On recent Impacts of Selected Development
Indicators on Unemployment Rate: Focusing the SEE Countries. Interdisciplinary
Description of Complex Systems: INDECS, 13(3), 420-433.

Shen, X., Jiang, C., Sakhanenko, L., & Lu, Q. (2023). Asymptotic properties of neural network
sieve estimators. Journal of Nonparametric Statistics, 1-30.
