|
678 | 678 | "ans_dist = enumeration_ask('Burglary', {'JohnCalls': True, 'MaryCalls': True}, burglary)\n",
|
679 | 679 | "ans_dist[True]"
|
680 | 680 | ]
|
| 681 | + }, |
| 682 | + { |
| 683 | + "cell_type": "markdown", |
| 684 | + "metadata": {}, |
| 685 | + "source": [ |
| 686 | + "### Variable Elimination\n", |
| 687 | + "\n", |
| 688 | + "The enumeration algorithm can be improved substantially by eliminating repeated calculations. Enumeration builds the joint distribution over all hidden variables, which is exponential in size in the number of hidden variables. Variable elimination instead interleaves joining and marginalization.\n", |
| 689 | + "\n", |
| 690 | + "Before we look into the implementation of Variable Elimination we must first familiarize ourselves with Factors. \n", |
| 691 | + "\n", |
| 692 | + "In general we call a multidimensional array of type P(Y1 ... Yn | X1 ... Xm) a factor, where some of the Xs and Ys may be assigned values. Factors are implemented in the probability module as the class **Factor**. They take as input **variables** and **cpt**.\n", |
| 693 | + "\n", |
| 694 | + "\n", |
| 695 | + "#### Helper Functions\n", |
| 696 | + "\n", |
| 697 | + "Certain helper functions help create the **cpt** for the Factor given the evidence. Let us explore them one by one." |
| 698 | + ] |
| 699 | + }, |
| 700 | + { |
| 701 | + "cell_type": "code", |
| 702 | + "execution_count": null, |
| 703 | + "metadata": { |
| 704 | + "collapsed": true |
| 705 | + }, |
| 706 | + "outputs": [], |
| 707 | + "source": [ |
| 708 | + "%psource make_factor" |
| 709 | + ] |
| 710 | + }, |
| 711 | + { |
| 712 | + "cell_type": "markdown", |
| 713 | + "metadata": {}, |
| 714 | + "source": [ |
| 715 | + "**make_factor** is used to create the **cpt** and **variables** that will be passed to the constructor of **Factor**. We use **make_factor** for each variable. It takes three arguments: **var**, the particular variable; **e**, the evidence we want to do inference on; and **bn**, the Bayes network.\n", |
| 716 | + "\n", |
| 717 | + "Here **variables** for each node refers to a list consisting of the variable itself and its parents, minus any variables that are part of the evidence. This is created by taking the variable together with **node.parents** and keeping only those that are not part of the evidence.\n", |
| 718 | + "\n", |
| 719 | + "The resulting **cpt** is similar to the original **cpt** of the node, restricted to the rows that agree with the evidence." |
| 720 | + ] |
| 721 | + }, |
| 722 | + { |
| 723 | + "cell_type": "code", |
| 724 | + "execution_count": null, |
| 725 | + "metadata": { |
| 726 | + "collapsed": true |
| 727 | + }, |
| 728 | + "outputs": [], |
| 729 | + "source": [ |
| 730 | + "%psource all_events" |
| 731 | + ] |
| 732 | + }, |
| 733 | + { |
| 734 | + "cell_type": "markdown", |
| 735 | + "metadata": {}, |
| 736 | + "source": [ |
| 737 | + "The **all_events** function is a recursive generator that yields events, each of which is used to form a key for the original **cpt** of the node. It works by extending the evidence with values for the node's variables, so every event produced by **all_events** is consistent with the evidence. Because **all_events** is a generator function, one such event is produced on each iteration.\n", |
| 738 | + "\n", |
| 739 | + "We can try this out using the example on **Page 524** of the book. We will make **f**<sub>5</sub>(A) = P(m | A)" |
| 740 | + ] |
| 741 | + }, |
| 742 | + { |
| 743 | + "cell_type": "code", |
| 744 | + "execution_count": null, |
| 745 | + "metadata": { |
| 746 | + "collapsed": false |
| 747 | + }, |
| 748 | + "outputs": [], |
| 749 | + "source": [ |
| 750 | + "f5 = make_factor('MaryCalls', {'JohnCalls': True, 'MaryCalls': True}, burglary)" |
| 751 | + ] |
| 752 | + }, |
| 753 | + { |
| 754 | + "cell_type": "code", |
| 755 | + "execution_count": null, |
| 756 | + "metadata": { |
| 757 | + "collapsed": false |
| 758 | + }, |
| 759 | + "outputs": [], |
| 760 | + "source": [ |
| 761 | + "f5" |
| 762 | + ] |
| 763 | + }, |
| 764 | + { |
| 765 | + "cell_type": "code", |
| 766 | + "execution_count": null, |
| 767 | + "metadata": { |
| 768 | + "collapsed": false |
| 769 | + }, |
| 770 | + "outputs": [], |
| 771 | + "source": [ |
| 772 | + "f5.cpt" |
| 773 | + ] |
| 774 | + }, |
| 775 | + { |
| 776 | + "cell_type": "code", |
| 777 | + "execution_count": null, |
| 778 | + "metadata": { |
| 779 | + "collapsed": false |
| 780 | + }, |
| 781 | + "outputs": [], |
| 782 | + "source": [ |
| 783 | + "f5.variables" |
| 784 | + ] |
| 785 | + }, |
| 786 | + { |
| 787 | + "cell_type": "markdown", |
| 788 | + "metadata": {}, |
| 789 | + "source": [ |
| 790 | + "Here the False key of **f5.cpt** gives the probability **P(MaryCalls=True | Alarm = False)**. Because our representation stores probabilities only for the case where the node variable is True, this is the same as the **cpt** of the BayesNode. Let us try a somewhat different example from the book, where the evidence is that Alarm = True." |
| 791 | + ] |
| 792 | + }, |
| 793 | + { |
| 794 | + "cell_type": "code", |
| 795 | + "execution_count": null, |
| 796 | + "metadata": { |
| 797 | + "collapsed": true |
| 798 | + }, |
| 799 | + "outputs": [], |
| 800 | + "source": [ |
| 801 | + "new_factor = make_factor('MaryCalls', {'Alarm': True}, burglary)" |
| 802 | + ] |
| 803 | + }, |
| 804 | + { |
| 805 | + "cell_type": "code", |
| 806 | + "execution_count": null, |
| 807 | + "metadata": { |
| 808 | + "collapsed": false |
| 809 | + }, |
| 810 | + "outputs": [], |
| 811 | + "source": [ |
| 812 | + "new_factor.cpt" |
| 813 | + ] |
| 814 | + }, |
| 815 | + { |
| 816 | + "cell_type": "markdown", |
| 817 | + "metadata": {}, |
| 818 | + "source": [ |
| 819 | + "Here the **cpt** is for **P(MaryCalls | Alarm = True)**. Therefore the probabilities for True and False sum to one. Note the difference between the two cases. Again, the only rows included are those consistent with the evidence.\n", |
| 820 | + "\n", |
| 821 | + "#### Operations on Factors\n", |
| 822 | + "\n", |
| 823 | + "We are interested in two kinds of operations on factors: **Pointwise Product**, which is used to create joint distributions, and **Summing Out**, which is used for marginalization." |
| 824 | + ] |
| 825 | + }, |
| 826 | + { |
| 827 | + "cell_type": "code", |
| 828 | + "execution_count": null, |
| 829 | + "metadata": { |
| 830 | + "collapsed": true |
| 831 | + }, |
| 832 | + "outputs": [], |
| 833 | + "source": [ |
| 834 | + "%psource Factor.pointwise_product" |
| 835 | + ] |
| 836 | + }, |
| 837 | + { |
| 838 | + "cell_type": "markdown", |
| 839 | + "metadata": {}, |
| 840 | + "source": [ |
| 841 | + "**Factor.pointwise_product** creates a joint by combining two factors. We take the union of the **variables** of both factors and then generate the **cpt** for the new factor using the **all_events** function. Note that we have already eliminated rows that are not consistent with the evidence. Pointwise product assigns new probabilities by multiplying matching rows, similar to a database join." |
| 842 | + ] |
| 843 | + }, |
| 844 | + { |
| 845 | + "cell_type": "code", |
| 846 | + "execution_count": null, |
| 847 | + "metadata": { |
| 848 | + "collapsed": true |
| 849 | + }, |
| 850 | + "outputs": [], |
| 851 | + "source": [ |
| 852 | + "%psource pointwise_product" |
| 853 | + ] |
| 854 | + }, |
| 855 | + { |
| 856 | + "cell_type": "markdown", |
| 857 | + "metadata": {}, |
| 858 | + "source": [ |
| 859 | + "**pointwise_product** extends this operation to more than two operands by applying it sequentially, two factors at a time." |
| 860 | + ] |
| 861 | + }, |
| 862 | + { |
| 863 | + "cell_type": "code", |
| 864 | + "execution_count": null, |
| 865 | + "metadata": { |
| 866 | + "collapsed": true |
| 867 | + }, |
| 868 | + "outputs": [], |
| 869 | + "source": [ |
| 870 | + "%psource Factor.sum_out" |
| 871 | + ] |
| 872 | + }, |
| 873 | + { |
| 874 | + "cell_type": "markdown", |
| 875 | + "metadata": {}, |
| 876 | + "source": [ |
| 877 | + "**Factor.sum_out** makes a factor that eliminates a variable by summing over its values. Again **all_events** is used to generate combinations for the rest of the variables." |
| 878 | + ] |
| 879 | + }, |
| 880 | + { |
| 881 | + "cell_type": "code", |
| 882 | + "execution_count": null, |
| 883 | + "metadata": { |
| 884 | + "collapsed": true |
| 885 | + }, |
| 886 | + "outputs": [], |
| 887 | + "source": [ |
| 888 | + "%psource sum_out" |
| 889 | + ] |
| 890 | + }, |
| 891 | + { |
| 892 | + "cell_type": "markdown", |
| 893 | + "metadata": {}, |
| 894 | + "source": [ |
| 895 | + "**sum_out** uses both **Factor.sum_out** and **pointwise_product** to finally eliminate a particular variable from all factors by summing over its values." |
| 896 | + ] |
| 897 | + }, |
| 898 | + { |
| 899 | + "cell_type": "markdown", |
| 900 | + "metadata": {}, |
| 901 | + "source": [ |
| 902 | + "#### Elimination Ask\n", |
| 903 | + "\n", |
| 904 | + "The algorithm described in **Figure 14.11** of the book is implemented by the function **elimination_ask**. We use this for inference. The key idea is that we eliminate the hidden variables by interleaving joining and marginalization. It takes three arguments: **X**, the query variable; **e**, the evidence; and **bn**, the Bayes network.\n", |
| 905 | + "\n", |
| 906 | + "The algorithm creates factors out of Bayes Nodes in reverse order and eliminates hidden variables using **sum_out**. Finally it takes a pointwise product of all factors and normalizes. Let us now solve the problem of inferring\n", |
| 907 | + "\n", |
| 908 | + "**P(Burglary=True | JohnCalls=True, MaryCalls=True)** using variable elimination." |
| 909 | + ] |
| 910 | + }, |
| 911 | + { |
| 912 | + "cell_type": "code", |
| 913 | + "execution_count": null, |
| 914 | + "metadata": { |
| 915 | + "collapsed": true |
| 916 | + }, |
| 917 | + "outputs": [], |
| 918 | + "source": [ |
| 919 | + "%psource elimination_ask" |
| 920 | + ] |
| 921 | + }, |
| 922 | + { |
| 923 | + "cell_type": "code", |
| 924 | + "execution_count": null, |
| 925 | + "metadata": { |
| 926 | + "collapsed": false |
| 927 | + }, |
| 928 | + "outputs": [], |
| 929 | + "source": [ |
| 930 | + "elimination_ask('Burglary', dict(JohnCalls=True, MaryCalls=True), burglary).show_approx()" |
| 931 | + ] |
681 | 932 | }
|
682 | 933 | ],
|
683 | 934 | "metadata": {
|