Skip to main content

probability or statistics - Problems encoding a Bayesian Network with just five nodes using ProbabilityDistribution


Question summary


I had recently asked this question where problems encoding a Bayesian Network were linked to the use of MultinomialDistribution. While that problem can be avoided using EmpiricalDistribution, there remains an issue with using ProbabilityDistribution for larger networks as it seems: While Probability can be used for inference with 4 nodes, it will not evaluate for the "full" example network of 5 nodes -- which still is far removed from real application demands. Why is this so? What can be done about it?


Bayesian Network Example


Again I would like to use the (simple) example that is given on page 53 in Probabilistic Graphical Models (2009), by Daphne Koller and Neir Friedman:


BayesianNetwork



The network has five nodes (random variables):



  • Difficulty of a class taken by a student (0 = easy, 1 = hard)

  • Intelligence of the student (0 = low, 1 = high)

  • Grade achieved by the student (1 = A, 2 = B, 3 = C)

  • SAT score of the student (0 = low, 1 = high)

  • Letter of recommendation by the teacher (0 = False, 1 = True)


We would like to use this network to do probabilistic inference (causal or evidential) like: "What is the probability of the student achieving an A, given that he is intelligent?"


Encoding the Bayesian Network in Mathematica



Essentially the Bayesian Network is a sparse way to define the joint probability distribution function for the random variables using the chain rule of probability theory:


P(I,D,G,S,L)=P(I)×P(D)×P(G|I,D)×P(S|I)×P(L|G)


I am encoding this in Mathematica as follows:


(* nodes without parents *)
distI = BernoulliDistribution[ 0.3 ]; (* prior probability of high intelligence *)
distD = BernoulliDistribution[ 0.4 ]; (* prior probability of hard class *)

(* nodes with parents = conditional probability distributions *)
(* conditional distribution of the grade *)
cpdG = Function[ { i, d },

With[
{
p = Piecewise[
{
{ { 0.3, 0.4, 0.3 }, i == 0 && d == 0 },
{ { 0.05, 0.25, 0.7 }, i == 0 && d == 1 },
{ { 0.9, 0.08, 0.02 }, i == 1 && d == 0 },
{ { 0.5, 0.3, 0.2 }, i == 1 && d == 1 }
}
]

},
EmpiricalDistribution[ p -> Range[3] ]
]
];

(* conditional distribution for the SAT score *)
cpdS = Function[ i,
With[
{
θ = Piecewise[

{
{ 0.05, i == 0 },
{ 0.8, i == 1 }
}
] (* probability of a high SAT score *)
},
BernoulliDistribution[θ]
]
];


(* conditional probability function for the Letter *)
cpdL = Function[ g,
With[
{
θ = Piecewise[
{
{ 0.9, g == 1 },
{ 0.6, g == 2 },
{ 0.01, g == 3 }
}

]
},
BernoulliDistribution[θ]
]
];

(* BayesNetwork = Joint Probability Distribution Function *)
(* B4 = P(I,D,G,L) *)
distB4 = ProbabilityDistribution[
PDF[ distI, i] PDF[ distD, d] PDF[ cpdG[i,d], g] PDF[ cpdL[g], l],

{i, 0, 1, 1},
{d, 0, 1, 1},
{g, 1, 3, 1},
{l, 0, 1, 1}
];

(* B5 = P(I,D,G,S,L) *)
distB5 = ProbabilityDistribution[
PDF[ distI, i] PDF[ distD, d] PDF[ cpdG[i,d], g] PDF[ cpdS[i], s] PDF[ cpdL[g], l],
{i, 0, 1, 1},

{d, 0, 1, 1},
{g, 1, 3, 1},
{s, 0, 1, 1},
{l, 0, 1, 1}
];

Doing Inference


Now we would like to ask the question as stated above:


Probability[ g == 1 \[Conditioned] i == 1, {i,d,g,l} \[Distributed] distB4 ]



0.74



Probability[ g == 1 \[Conditioned] i == 1, {i,d,g,s,l} \[Distributed] distB5 ]


Probability[ ] is returned unevaluted.



Why is this the case? What can be done about it - after all 5 nodes should not be too far a stretch?



Answer





As indicated in the answer given by WRI here, the interplay of Piecewise and ProbabilityDistribution is tricky and -- so my temporary verdict -- is best avoided.


Indeed, using indicator functions, e.g. Boole, as a replacement for Piecewise solves the issue:


(* nodes without parents remain unchanged *)
(* CPDs are redefined using Boole instead of Piecewise *)

(* conditional distribution of the grade *)
cpdG = Function[ {i,d},
With[
{

p = Plus[
{0.3 , 0.4 , 0.3 } Boole[ i == 0 && d == 0 ],
{0.05, 0.25, 0.7 } Boole[ i == 0 && d == 1 ],
{0.9 , 0.08, 0.02} Boole[ i == 1 && d == 0 ],
{0.5 , 0.3 , 0.2 } Boole[ i == 1 && d == 1 ]
]
},
EmpiricalDistribution[ p -> Range[3] ]
]
];


(*conditional distribution for the SAT score*)
cpdS = Function[ i,
With[
{
θ = Plus[
0.05 Boole[ i == 0 ],
0.8 Boole[ i == 1 ]
] (*probability of a high SAT score*)
},

BernoulliDistribution[θ]
]
];

(*conditional probability function for the Letter*)
cpdL = Function[ g,
With[
{
θ = Plus[
0.9 Boole[ g == 1 ],

0.6 Boole[ g == 2 ],
0.01 Boole[ g == 3 ]
]
},
BernoulliDistribution[θ]
]
];

(* B5 = P(I,D,G,S,L) complete BN *)
distB5 = ProbabilityDistribution[

PDF[ distI, i] PDF[ distD, d] PDF[ cpdG[i,d], g] PDF[ cpdS[i], s] PDF[ cpdL[g], l],
{i, 0, 1, 1}, {d, 0, 1, 1}, {g, 1, 3, 1}, {s, 0, 1, 1}, {l, 0, 1, 1}
];

Now doing inference for the complete joint probability distribution as specified by the Bayesian Network works out fine:


Probability[ g == 1\[Conditioned] i == 1, {i,d,g,s,l} \[Distributed] distB5 ]


0.74




Comments

Popular posts from this blog

functions - Get leading series expansion term?

Given a function f[x] , I would like to have a function leadingSeries that returns just the leading term in the series around x=0 . For example: leadingSeries[(1/x + 2)/(4 + 1/x^2 + x)] x and leadingSeries[(1/x + 2 + (1 - 1/x^3)/4)/(4 + x)] -(1/(16 x^3)) Is there such a function in Mathematica? Or maybe one can implement it efficiently? EDIT I finally went with the following implementation, based on Carl Woll 's answer: lds[ex_,x_]:=( (ex/.x->(x+O[x]^2))/.SeriesData[U_,Z_,L_List,Mi_,Ma_,De_]:>SeriesData[U,Z,{L[[1]]},Mi,Mi+1,De]//Quiet//Normal) The advantage is, that this one also properly works with functions whose leading term is a constant: lds[Exp[x],x] 1 Answer Update 1 Updated to eliminate SeriesData and to not return additional terms Perhaps you could use: leadingSeries[expr_, x_] := Normal[expr /. x->(x+O[x]^2) /. a_List :> Take[a, 1]] Then for your examples: leadingSeries[(1/x + 2)/(4 + 1/x^2 + x), x] leadingSeries[Exp[x], x] leadingSeries[(1/x + 2 + (1 - 1/x...

mathematical optimization - Minimizing using indices, error: Part::pkspec1: The expression cannot be used as a part specification

I want to use Minimize where the variables to minimize are indices pointing into an array. Here a MWE that hopefully shows what my problem is. vars = u@# & /@ Range[3]; cons = Flatten@ { Table[(u[j] != #) & /@ vars[[j + 1 ;; -1]], {j, 1, 3 - 1}], 1 vec1 = {1, 2, 3}; vec2 = {1, 2, 3}; Minimize[{Total@((vec1[[#]] - vec2[[u[#]]])^2 & /@ Range[1, 3]), cons}, vars, Integers] The error I get: Part::pkspec1: The expression u[1] cannot be used as a part specification. >> Answer Ok, it seems that one can get around Mathematica trying to evaluate vec2[[u[1]]] too early by using the function Indexed[vec2,u[1]] . The working MWE would then look like the following: vars = u@# & /@ Range[3]; cons = Flatten@{ Table[(u[j] != #) & /@ vars[[j + 1 ;; -1]], {j, 1, 3 - 1}], 1 vec1 = {1, 2, 3}; vec2 = {1, 2, 3}; NMinimize[ {Total@((vec1[[#]] - Indexed[vec2, u[#]])^2 & /@ R...

plotting - Plot 4D data with color as 4th dimension

I have a list of 4D data (x position, y position, amplitude, wavelength). I want to plot x, y, and amplitude on a 3D plot and have the color of the points correspond to the wavelength. I have seen many examples using functions to define color but my wavelength cannot be expressed by an analytic function. Is there a simple way to do this? Answer Here a another possible way to visualize 4D data: data = Flatten[Table[{x, y, x^2 + y^2, Sin[x - y]}, {x, -Pi, Pi,Pi/10}, {y,-Pi,Pi, Pi/10}], 1]; You can use the function Point along with VertexColors . Now the points are places using the first three elements and the color is determined by the fourth. In this case I used Hue, but you can use whatever you prefer. Graphics3D[ Point[data[[All, 1 ;; 3]], VertexColors -> Hue /@ data[[All, 4]]], Axes -> True, BoxRatios -> {1, 1, 1/GoldenRatio}]