ML Interpretability Assignment
Submitted by,
F20022 – G.L. Krishna Priya
F20077 – Ashwin.V
F20102 – Melvin Jose P G
F20117 – Souradeep Pal
F20139 – J R Dinesh
1. Which are the models that can explain the features better? Which models cannot explain them as well?
The models that can explain features better and are more interpretable are:
Linear Regression
Logistic Regression
Decision Trees
When the reasoning behind a model's predictions and decisions is readily understandable, the model is considered interpretable.
It also helps if the model is linear and monotonic, so that the output changes at a defined rate as the inputs change. These three methods are also more interpretable because each rests on a strong mathematical base: their predictions can be traced directly through explicit equations, aside from being readily understandable.
The models that cannot explain features as well (less interpretable) are:
XGBoost
Random Forests
Neural Networks
For random forests, the number of trees can grow very large, which makes the process far more complicated than a single decision tree. The same can be said for neural networks and deep neural networks, where the number of neurons (and hence weights) can be very high. XGBoost and gradient boosting suffer from the same problem, since they also combine many trees.
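To make this contrast concrete, here is a minimal sketch, assuming scikit-learn is available (an assumption; the assignment names no library): a small decision tree's entire decision logic can be printed as readable rules, whereas nothing comparable exists for a forest of hundreds of trees or a network of thousands of weights.

# A minimal sketch (assumed library: scikit-learn): print a small
# decision tree's full logic as human-readable if/else rules.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(data.data, data.target)

# Every split the model makes is visible as a rule a person can follow
print(export_text(tree, feature_names=list(data.feature_names)))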
2. Rank the Models based on which can explain the features better.
1. Linear Regression
2. Logistic Regression
3. Decision Tree
These models explain features well. Linear Regression is not only linear and monotone; its output also changes at a fixed, defined rate.
The calculation follows the y = mx + c form, so the coefficient m states directly how much the result (y) depends on a variable (illustrated in the sketch after this ranking). Logistic Regression uses the logit function and is likewise based on an explicit equation, which increases interpretability. A Decision Tree is easy to understand and follows a sum-of-products structure.
The cost of each option along the decision path can also be calculated in decision tree analysis.
4. Random Forest
Random Forest is ranked lower because the number of trees within the forest is not fixed, and its predictions cannot be traced through a single, simple equation.
5. Neural Network
A Neural Network is more complex still: the number of weights is very high, and the calculations inside the neurons add further complexity.
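To illustrate why Linear Regression tops this ranking, here is a minimal sketch, again assuming scikit-learn, showing that a fitted model exposes the m and c of y = mx + c directly; the data below is made up for illustration.

# A minimal sketch: the slope m and intercept c of y = mx + c are stored
# directly on a fitted LinearRegression, so its reasoning is fully visible.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])  # one feature, x
y = np.array([3.0, 5.0, 7.0, 9.0])          # generated as y = 2x + 1

model = LinearRegression().fit(X, y)
print("m =", model.coef_[0])    # how much y changes per unit of x
print("c =", model.intercept_)  # baseline value when x = 0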
a) Feature Importance:
We can use feature importance to see the contribution of each variable and identify which variables explain the accuracy best.
Often, it is found that around 20% of the variables explain the accuracy very well, as opposed to the remaining 80%. We can use this to select only the important variables, reduce unnecessary complexity, and increase interpretability by making the model more understandable.
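A minimal sketch of this idea, assuming scikit-learn (the assignment names no specific library): a random forest's built-in feature importances can be sorted so that only the top few variables are kept.

# A minimal sketch: rank variables by a random forest's built-in
# feature importances, then keep only the most informative ones.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

data = load_iris()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

# Sort features from most to least important
ranked = sorted(zip(data.feature_names, model.feature_importances_),
                key=lambda pair: pair[1], reverse=True)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
# Keeping only the top-scoring features cuts complexity with little loss
# of accuracy, which is exactly the 20%/80% point made above.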
b) LIME Method: We can also use the LIME method (Local Interpretable Model-agnostic Explanations). This is a method used for black-box models (such as the lower-ranked ones above). Instead of explaining the whole model at once, it looks at a small neighbourhood of the data around one prediction and evaluates a simple surrogate model there. LIME explains an individual prediction so that even non-experts can compare models and improve them.
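Here is a minimal sketch of LIME on tabular data, assuming the lime and scikit-learn packages are installed (both are assumptions; the assignment does not prescribe a toolkit). It explains one prediction of a black-box random forest.

# A minimal sketch (assumed packages: lime, scikit-learn): explain a
# single prediction of a black-box model with a local surrogate.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

data = load_iris()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data,
    feature_names=data.feature_names,
    class_names=list(data.target_names),
    mode="classification",
)

# Perturb the neighbourhood of one row and fit a simple local model there
exp = explainer.explain_instance(data.data[0], model.predict_proba, num_features=4)
print(exp.as_list())  # (feature condition, local weight) pairs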
c) Shapley Value: We can use the Shapley value, a concept from game theory. It is the average expected marginal contribution of a single player after every possible combination (coalition) of players is taken into account. It determines a fair payoff for each player when some may have contributed more or less than the others. Applied to a model, each feature is treated as a player, and its Shapley value is its contribution to the prediction.
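A minimal sketch of that definition, computed by brute force for a made-up three-player game (the coalition worths below are invented for illustration); in practice, libraries such as SHAP apply the same idea with features as the players.

# A minimal sketch: brute-force Shapley values for a tiny 3-player game.
# The coalition worths in v() are made-up example numbers.
from itertools import combinations
from math import factorial

players = ("A", "B", "C")

worth = {
    (): 0, ("A",): 10, ("B",): 20, ("C",): 30,
    ("A", "B"): 40, ("A", "C"): 50, ("B", "C"): 60,
    ("A", "B", "C"): 90,
}

def v(coalition):
    return worth[tuple(sorted(coalition))]

def shapley(player):
    n = len(players)
    others = [p for p in players if p != player]
    total = 0.0
    # Average the player's marginal contribution over every coalition S
    # not containing the player, weighted by |S|! * (n - |S| - 1)! / n!
    for size in range(n):
        for S in combinations(others, size):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            total += weight * (v(S + (player,)) - v(S))
    return total

for p in players:
    print(p, shapley(p))
# Prints roughly A 20.0, B 30.0, C 40.0 (up to float rounding);
# the values sum to v(all players) = 90, as the theory requires.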
So first the model is built; then, after each iteration, feature importance is assigned. The features will be explained better, and biases in the model (systematically prejudiced results due to wrongful assumptions) can be detected and removed.