Absenteeism Module

The document contains a Python script for a custom absenteeism prediction model using machine learning. It includes a CustomScaler class for data preprocessing and an absenteeism_model class that handles data loading, cleaning, and prediction. The script processes absenteeism data by transforming features and predicting probabilities and outputs based on the trained model.
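A minimal sketch of the idea behind the CustomScaler mentioned above: standardize only a chosen subset of columns and leave the others (for example dummy variables) untouched. It uses plain pandas arithmetic as a stand-in for scikit-learn's StandardScaler; the function name `scale_selected` and the toy data are hypothetical and do not come from the module.

```python
import pandas as pd


def scale_selected(df, columns):
    """Return a copy of df in which only `columns` are standardized."""
    out = df.copy()
    for col in columns:
        # population standard deviation (ddof=0), the convention StandardScaler uses
        out[col] = (df[col] - df[col].mean()) / df[col].std(ddof=0)
    return out


# toy data: one dummy column that must stay as-is, two numeric columns to scale
toy = pd.DataFrame({
    "Reason_1": [1, 0, 0, 1],
    "Age": [30, 40, 50, 60],
    "Children": [0, 1, 2, 3],
})

scaled = scale_selected(toy, ["Age", "Children"])
print(scaled)
```

The column order is preserved and the dummy column keeps its 0/1 values, which is the same guarantee the module's `transform` method aims for by re-indexing with the initial column order.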

Uploaded by

kmzdr1

2/16/2019 Dropbox - The 5 files rar - Simplify your life

# coding: utf-8

# import all libraries needed
import numpy as np
import pandas as pd
import pickle
from sklearn.preprocessing import StandardScaler
from sklearn.base import BaseEstimator, TransformerMixin


# the custom scaler class: standardizes only the listed columns
class CustomScaler(BaseEstimator, TransformerMixin):

    def __init__(self, columns, copy=True, with_mean=True, with_std=True):
        # keep the constructor arguments as attributes so that get_params() and repr() work
        self.columns = columns
        self.copy = copy
        self.with_mean = with_mean
        self.with_std = with_std
        # pass the options by keyword; recent scikit-learn versions reject positional arguments here
        self.scaler = StandardScaler(copy=copy, with_mean=with_mean, with_std=with_std)
        self.mean_ = None
        self.var_ = None

    def fit(self, X, y=None):
        self.scaler.fit(X[self.columns], y)
        # per-column statistics; the axis is explicit because np.mean on a DataFrame
        # may reduce over both axes in recent pandas versions
        self.mean_ = np.asarray(X[self.columns].mean(axis=0))
        self.var_ = np.asarray(X[self.columns].var(axis=0, ddof=0))
        return self

    def transform(self, X, y=None, copy=None):
        init_col_order = X.columns
        # keep the original index so that the concat below aligns row by row
        X_scaled = pd.DataFrame(self.scaler.transform(X[self.columns]),
                                columns=self.columns, index=X.index)
        X_not_scaled = X.loc[:, ~X.columns.isin(self.columns)]
        return pd.concat([X_not_scaled, X_scaled], axis=1)[init_col_order]


# create the special class that we are going to use from here on to predict new data
class absenteeism_model():

    def __init__(self, model_file, scaler_file):
        # read the 'model' and 'scaler' files which were saved
        # (the paths passed in are used; the original opened hard-coded file names instead)
        with open(model_file, 'rb') as model_f, open(scaler_file, 'rb') as scaler_f:
            self.reg = pickle.load(model_f)
            self.scaler = pickle.load(scaler_f)
            self.data = None

    # take a data file (*.csv) and preprocess it in the same way as in the lectures
    def load_and_clean_data(self, data_file):

        # import the data
        df = pd.read_csv(data_file, delimiter=',')
        # store the data in a new variable for later use
        self.df_with_predictions = df.copy()
        # drop the 'ID' column
        df = df.drop(['ID'], axis=1)
        # to preserve the code we've created in the previous section, we will add a column with 'NaN' strings
        df['Absenteeism Time in Hours'] = 'NaN'

        # create a separate dataframe, containing dummy values for ALL available reasons
        # dtype=int keeps the dummies as 0/1 (newer pandas versions return booleans by default)
        reason_columns = pd.get_dummies(df['Reason for Absence'], drop_first=True, dtype=int)

        # split reason_columns into 4 types
        reason_type_1 = reason_columns.loc[:, 1:14].max(axis=1)
        reason_type_2 = reason_columns.loc[:, 15:17].max(axis=1)
        reason_type_3 = reason_columns.loc[:, 18:21].max(axis=1)
        reason_type_4 = reason_columns.loc[:, 22:].max(axis=1)

        # to avoid multicollinearity, drop the 'Reason for Absence' column from df
        df = df.drop(['Reason for Absence'], axis=1)

        # concatenate df and the 4 types of reason for absence
        df = pd.concat([df, reason_type_1, reason_type_2, reason_type_3, reason_type_4], axis=1)

        # assign names to the 4 reason type columns
        # note: there is a more universal version of this code, however the following will best suit our current purposes
        column_names = ['Date', 'Transportation Expense', 'Distance to Work', 'Age',
                        'Daily Work Load Average', 'Body Mass Index', 'Education', 'Children',
                        'Pet', 'Absenteeism Time in Hours', 'Reason_1', 'Reason_2', 'Reason_3', 'Reason_4']
        df.columns = column_names

        # re-order the columns in df
        column_names_reordered = ['Reason_1', 'Reason_2', 'Reason_3', 'Reason_4', 'Date', 'Transportation Expense',
                                  'Distance to Work', 'Age', 'Daily Work Load Average', 'Body Mass Index', 'Education',
                                  'Children', 'Pet', 'Absenteeism Time in Hours']
        df = df[column_names_reordered]

        # convert the 'Date' column into datetime
        df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y')

        # extract the month from 'Date' into a new column called 'Month Value'
        # (the .dt accessor does not depend on the row labels, unlike a positional loop)
        df['Month Value'] = df['Date'].dt.month

        # create a new feature called 'Day of the Week' (Monday = 0)
        df['Day of the Week'] = df['Date'].dt.weekday

        # drop the 'Date' column from df
        df = df.drop(['Date'], axis=1)

        # re-order the columns in df
        column_names_upd = ['Reason_1', 'Reason_2', 'Reason_3', 'Reason_4', 'Month Value', 'Day of the Week',
                            'Transportation Expense', 'Distance to Work', 'Age',
                            'Daily Work Load Average', 'Body Mass Index', 'Education', 'Children',
                            'Pet', 'Absenteeism Time in Hours']
        df = df[column_names_upd]


        # map 'Education' variables; the result is a dummy
        df['Education'] = df['Education'].map({1: 0, 2: 1, 3: 1, 4: 1})

        # replace the NaN values
        df = df.fillna(value=0)

        # drop the original absenteeism time
        df = df.drop(['Absenteeism Time in Hours'], axis=1)

        # drop the variables we decide we don't need
        df = df.drop(['Day of the Week', 'Daily Work Load Average', 'Distance to Work'], axis=1)

        # we have included this line of code if you want to call the 'preprocessed data'
        self.preprocessed_data = df.copy()

        # we need this line so we can use it in the next functions
        self.data = self.scaler.transform(df)

    # a function which outputs the probability of a data point to be 1
    def predicted_probability(self):
        if (self.data is not None):
            pred = self.reg.predict_proba(self.data)[:, 1]
            return pred

    # a function which outputs 0 or 1 based on our model
    def predicted_output_category(self):
        if (self.data is not None):
            pred_outputs = self.reg.predict(self.data)
            return pred_outputs

    # predict the outputs and the probabilities and
    # add columns with these values at the end of the new data
    def predicted_outputs(self):
        if (self.data is not None):
            self.preprocessed_data['Probability'] = self.reg.predict_proba(self.data)[:, 1]
            self.preprocessed_data['Prediction'] = self.reg.predict(self.data)
            return self.preprocessed_data
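The reason-grouping step in `load_and_clean_data` can be hard to picture from the slices alone. The following is a small sketch, on made-up reason codes, of how the dummy columns collapse into the four `Reason_*` indicators; the sample values are assumptions chosen for illustration, not data from the course files.

```python
import pandas as pd

# hypothetical reason codes; 0 stands for "no reason given"
reasons = pd.Series([1, 16, 19, 23, 0, 7], name="Reason for Absence")

# one dummy column per code that occurs; drop_first removes the lowest code (0 here)
dummies = pd.get_dummies(reasons, drop_first=True, dtype=int)

# label-based slices over the sorted integer column labels, as in the module
groups = pd.DataFrame({
    "Reason_1": dummies.loc[:, 1:14].max(axis=1),
    "Reason_2": dummies.loc[:, 15:17].max(axis=1),
    "Reason_3": dummies.loc[:, 18:21].max(axis=1),
    "Reason_4": dummies.loc[:, 22:].max(axis=1),
})
print(groups)
```

Each row ends up with at most one group flag set, and a row with code 0 has all four flags at 0. Note that a slice matching no columns at all (for example, a file with no codes between 15 and 17) would not yield zeros automatically, so very small input files may need extra handling.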

https://www.dropbox.com/sh/t536dzy3h9dimjp/AAD-H2Av7myydXkFjMacBUMMa?dl=0&preview=absenteeism_module.py 2/2
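As a quick check of the date handling in the module, here is a short sketch of how the 'Month Value' and 'Day of the Week' features follow from day/month/year strings. The two sample dates are arbitrary and only serve as an illustration.

```python
import pandas as pd

# hypothetical dates in the day/month/year layout the module expects
dates = pd.Series(["07/07/2015", "14/02/2019"])
parsed = pd.to_datetime(dates, format="%d/%m/%Y")

features = pd.DataFrame({
    "Month Value": parsed.dt.month,
    "Day of the Week": parsed.dt.weekday,  # Monday = 0, Sunday = 6
})
print(features)
```

Passing an explicit `format` matters here: without it, a string such as "07/07/2015" is unambiguous, but "03/04/2015" could be read as either 3 April or 4 March.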
