108 Presentation

PREDICTIVE MODELING WITH MACHINE LEARNING:
CLASSIFYING HOUSEHOLD EMPLOYMENT STATUS
16 DECEMBER 2024

PRESENTED BY:
ANDREA ROSE L. PASTOR,
RYAN JAY V. VARRON,
MERREY JOY OCON
INTRODUCTION

Objective: To classify household employment status using machine learning (ML) models.

Challenges Addressed:
• Limited research on employment status classification using ML techniques.
• Potential insights for labor economics and policymaking.

Models Evaluated:
• Support Vector Machines (SVM)
• K-Nearest Neighbors (KNN)
• Naive Bayes (NB)
BACKGROUND OF THE STUDY

This study investigates the application of machine learning (ML) techniques to classify household employment status using demographic and socioeconomic data from a survey of Barangay Banza, Butuan City. The dataset includes variables such as age, sex, educational attainment, and health status. The authors compared three ML models: Support Vector Machines (SVM), K-Nearest Neighbors (KNN), and Naive Bayes (NB).
METHODOLOGY

• The study used a survey-based dataset from Barangay Banza, Butuan City, focusing on demographic and socioeconomic variables to classify employment status as Employed or Unemployed. Data preprocessing involved imputing missing values, encoding categorical features, and standardizing selected attributes. The dataset was split into 70% for training and 30% for testing, with five-fold cross-validation to ensure robustness.

• Three machine learning models were applied: SVM with a linear kernel, KNN with optimized neighbor selection, and Naive Bayes using a Gaussian classifier. Model performance was evaluated using metrics including accuracy and precision.
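The preprocessing and splitting steps described above can be sketched in plain Python. This is a minimal illustration, not the authors' code: the helper names (`impute_mean`, `one_hot`, `standardize`, `train_test_split_indices`, `k_fold_indices`), the choice of mean imputation and one-hot encoding, the random seed, and the sample values are all assumptions, since the slides do not say which imputation or encoding method was used.

```python
import random
from statistics import mean, pstdev

def impute_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]

def one_hot(values):
    """Encode a categorical column as one 0/1 indicator list per row."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]

def standardize(column):
    """Rescale a numeric column to zero mean and unit variance (z-scores)."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]
    return [(v - mu) / sigma for v in column]

def train_test_split_indices(n_rows, test_fraction=0.30, seed=0):
    """Shuffle row indices, then split them into train and test lists."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = round(n_rows * test_fraction)
    return indices[n_test:], indices[:n_test]

def k_fold_indices(indices, k=5):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield training, validation

# Made-up sample values, for illustration only (not the survey data).
ages = [25, None, 40, 31, None, 58, 19, 47, 36, 22]
sexes = ["M", "F", "F", "M", "N", "F", "M", "M", "F", "F"]

ages_clean = standardize(impute_mean(ages))
sex_encoded = one_hot(sexes)  # indicator columns ordered F, M, N

train_idx, test_idx = train_test_split_indices(len(ages_clean))  # 70% / 30%
folds = list(k_fold_indices(train_idx, k=5))  # five folds over the training rows
```

For brevity the sketch computes the imputation mean and the scaling statistics over all rows. A common precaution is to compute them on the training rows only and reuse them for the test rows, so that no information from the test set leaks into training.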
RESULTS

The results show that SVM achieved balanced and consistent performance with an accuracy of 73.31%, while KNN scored 100% across all metrics, indicating possible overfitting due to a lack of cross-validation. In contrast, Naive Bayes underperformed significantly, with an accuracy of 50%. These findings suggest SVM is the most reliable model, while KNN needs further validation and Naive Bayes is unsuitable for this task.
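One common way a KNN model ends up with perfect scores is being evaluated on the same rows it was fitted on: with k = 1, every training point is its own nearest neighbor. The slides do not state how the 100% figure arose, so the sketch below is only an illustration of that mechanism under assumed conditions. The data points, labels, and function names are made up.

```python
from collections import Counter
from math import dist

def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k training points closest to `query`."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision(y_true, y_pred, positive="Employed"):
    """Fraction of rows predicted `positive` that truly are `positive`."""
    truths_for_predicted_pos = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not truths_for_predicted_pos:
        return 0.0
    hits = sum(t == positive for t in truths_for_predicted_pos)
    return hits / len(truths_for_predicted_pos)

# Made-up, deliberately noisy 2-D points (not the survey data).
train_X = [(0.0, 0.0), (1.0, 0.2), (0.2, 1.0), (1.1, 1.0),
           (2.0, 2.1), (2.2, 1.0), (3.0, 2.0), (2.9, 3.1)]
train_y = ["Unemployed", "Unemployed", "Employed", "Unemployed",
           "Employed", "Unemployed", "Employed", "Employed"]
test_X = [(0.1, 0.9), (1.0, 1.1), (2.1, 1.1), (2.5, 2.5)]
test_y = ["Unemployed", "Unemployed", "Employed", "Employed"]

# Scoring on the training rows: each point finds itself at distance 0.
train_pred = [knn_predict(train_X, train_y, x, k=1) for x in train_X]
# Scoring on rows the model has never seen.
test_pred = [knn_predict(train_X, train_y, x, k=1) for x in test_X]

train_acc = accuracy(train_y, train_pred)
test_acc = accuracy(test_y, test_pred)
print("training accuracy:", train_acc, "precision:", precision(train_y, train_pred))
print("held-out accuracy:", test_acc, "precision:", precision(test_y, test_pred))
```

A large gap between the training score and the held-out score is the usual sign of overfitting, which is why a perfect score calls for a check on held-out or cross-validated data before it is trusted.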
Scatter plot compares school attainment and work status across sex categories. Males (M) dominate in both "Employed" and "Unemployed" categories across all school attainment levels. Females (F) are most noticeable in the "Unemployed" category, particularly at the elementary education level. There is no visible data for the neutral/unspecified (N) group, suggesting minimal representation for this category.
Histogram shows the frequency of household members by sex. Males (M) and females (F) have nearly equal representation, while the neutral/unspecified (N) group has a very small frequency. A KDE curve overlays the histogram, highlighting the dominance of males and females in the dataset.
Boxplot compares work status (Employed vs. Unemployed) by sex. Males (M) have higher representation in both "Employed" and "Unemployed" categories. Females (F) are more concentrated in the "Employed" category but show some overlap into unemployment. The neutral/unspecified (N) group has minimal presence in both work statuses.
• The violin plot illustrates the distribution of employment status across three sex categories: Male (M), Female (F), and Neutral (N). Males and females show balanced distributions, with males slightly more concentrated in employment. The Neutral category has minimal representation, appearing as a thin line with no density curve. Boxplots within the violin plots provide additional detail.
The bar plot compares educational attainment (High School and College) across three sex categories: Male (M), Female (F), and Neutral (N). Males and females show similar levels of attainment, while the Neutral group has minimal representation, indicated by a thin bar. The plot effectively highlights similarities and differences in school attainment among the groups.
CONCLUSION

The study applied three machine learning algorithms (SVM, KNN, and Naive Bayes) to classify employment status based on demographic and socioeconomic features. SVM performed well with an accuracy of 73.31%, demonstrating reliability. KNN showed perfect metrics (100%) but indicated potential overfitting, requiring further validation. Naive Bayes underperformed with a low accuracy of 50%. The study highlighted SVM as the most reliable model for employment classification. Future research may expand the dataset and explore advanced models to improve accuracy and address employment disparities.
THANK YOU!!
PROPONENTS

ANDREA ROSE L. PASTOR
RYAN JAY P. VARRON
MERREY JOY S. OCON
