# Graphical Models: D-Separation

**5 min read** • Published: **November 29, 2018**

This article is a brief overview of conditional independence in graphical models and the related concept of d-separation. Let us begin with a definition.

For three random variables $X$, $Y$ and $Z$, we say $X$ is conditionally independent of $Y$ given $Z$ iff

$p(X, Y | Z) = p(X | Z) p(Y | Z).$

We can use a shorthand notation

$X \bigci Y | Z.$

Before we can define d-separation, let us first look at three basic types of graphs. Consider the same three variables as before. In each case we are interested in whether $X$ and $Y$ are conditionally independent once we observe $Z$.

## Tail-tail

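The first case is called *tail-tail*: $Z$ is a common parent of $X$ and $Y$, that is $X \leftarrow Z \rightarrow Y$, so both edges have their tails at $Z$.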

We can factor the joint distribution to get

$p(X, Y, Z) = p(X | Z) p(Y | Z) p(Z)$

and conditioning on the value of $Z$ we get (using the definition of conditional probability)

$p(X, Y | Z) = \frac{p(X, Y, Z)}{p(Z)} = \frac{p(X | Z) p(Y | Z) p(Z)}{p(Z)} = p(X | Z) p(Y | Z).$

From this we can immediately see that conditioning on $Z$ in the *tail-tail* case makes $X$ and $Y$ independent, that is $X \bigci Y | Z$.
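To make the tail-tail result concrete, here is a minimal numerical sketch. The probability tables are made-up values for binary variables (they are not part of the article), and exact fractions are used so the equality can be tested without rounding issues.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical tables for binary X, Y, Z in the graph X <- Z -> Y (made-up numbers).
p_z = {0: F(3, 10), 1: F(7, 10)}
p_x_given_z = {0: {0: F(1, 5), 1: F(4, 5)}, 1: {0: F(9, 10), 1: F(1, 10)}}  # [z][x]
p_y_given_z = {0: {0: F(1, 2), 1: F(1, 2)}, 1: {0: F(1, 4), 1: F(3, 4)}}    # [z][y]

# Joint distribution p(x, y, z) = p(x | z) p(y | z) p(z).
joint = {
    (x, y, z): p_x_given_z[z][x] * p_y_given_z[z][y] * p_z[z]
    for x, y, z in product((0, 1), repeat=3)
}

def conditional(joint, x, y, z):
    """Return p(x, y | z), p(x | z) and p(y | z), computed from the joint alone."""
    pz = sum(p for (_, _, zz), p in joint.items() if zz == z)
    pxz = sum(p for (xx, _, zz), p in joint.items() if (xx, zz) == (x, z))
    pyz = sum(p for (_, yy, zz), p in joint.items() if (yy, zz) == (y, z))
    return joint[x, y, z] / pz, pxz / pz, pyz / pz

# Check p(x, y | z) == p(x | z) p(y | z) for every assignment.
factorizes = all(
    pxy == px * py
    for pxy, px, py in (conditional(joint, x, y, z) for x, y, z in product((0, 1), repeat=3))
)
print(factorizes)  # True
```

Any other valid choice of tables should give the same answer, since the check only relies on the factorization itself.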

## Head-tail

The second case is called *head-tail*: the graph is the chain $X \rightarrow Z \rightarrow Y$, so one edge has its head at $Z$ and the other has its tail at $Z$.

We can again write the joint distribution for the graph

$p(X, Y, Z) = p(X) p(Z | X) p(Y | Z)$

and again conditioning on $Z$ we get (using rules of conditional probability)

$\begin{aligned} p(X, Y | Z) &= \frac{p(X, Y, Z)}{p(Z)} \\ &= \frac{p(X) p(Z | X) p(Y | Z)}{p(Z)} \\ &= \frac{p(X, Z) p(Y | Z)}{p(Z)} \\ &= \frac{p(X | Z) p(Z) p(Y | Z)}{p(Z)} \\ &= p(X | Z) p(Y | Z) \end{aligned}$

and so again, $X$ and $Y$ are conditionally independent given $Z$, that is $X \bigci Y | Z$.

#### Checking marginal independence

For completeness, we can also check whether $X$ and $Y$ are marginally independent. Conditional independence given $Z$ tells us nothing about this by itself, so we have to check it separately. We start from the same factorization

$p(X, Y, Z) = p(X) p(Z | X) p(Y | Z)$

which gives us the following when marginalizing over $Z$

$p(X, Y) = \sum_Z p(X, Y, Z) = p(X) \sum_Z p(Z | X) p(Y | Z) = p(X) \sum_Z p(Y, Z | X) = p(X) p(Y | X)$

where the third equality uses $p(Y | Z) = p(Y | Z, X)$, which holds because $X \bigci Y | Z$. The result does not factorize into $p(X) p(Y)$ in the general case, and thus $X$ and $Y$ are not marginally independent.
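Both head-tail claims (conditionally independent given $Z$, but not marginally independent in general) can be checked numerically. The sketch below assumes binary variables and made-up probability tables that are not part of the article.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical tables for binary X, Y, Z in the chain X -> Z -> Y (made-up numbers).
p_x = {0: F(2, 5), 1: F(3, 5)}
p_z_given_x = {0: {0: F(9, 10), 1: F(1, 10)},   # p_z_given_x[x][z]
               1: {0: F(1, 5), 1: F(4, 5)}}
p_y_given_z = {0: {0: F(3, 4), 1: F(1, 4)},     # p_y_given_z[z][y]
               1: {0: F(1, 10), 1: F(9, 10)}}

# Joint distribution p(x, y, z) = p(x) p(z | x) p(y | z).
joint = {
    (x, y, z): p_x[x] * p_z_given_x[x][z] * p_y_given_z[z][y]
    for x, y, z in product((0, 1), repeat=3)
}

# Marginals obtained by summing out the remaining variables.
p_xy = {(x, y): sum(joint[x, y, z] for z in (0, 1)) for x, y in product((0, 1), repeat=2)}
p_xz = {(x, z): sum(joint[x, y, z] for y in (0, 1)) for x, z in product((0, 1), repeat=2)}
p_yz = {(y, z): sum(joint[x, y, z] for x in (0, 1)) for y, z in product((0, 1), repeat=2)}
p_y = {y: sum(p_xy[x, y] for x in (0, 1)) for y in (0, 1)}
p_z = {z: sum(p_xz[x, z] for x in (0, 1)) for z in (0, 1)}

# p(x, y | z) = p(x | z) p(y | z), multiplied through by p(z)^2 to avoid division.
conditionally_independent = all(
    joint[x, y, z] * p_z[z] == p_xz[x, z] * p_yz[y, z]
    for x, y, z in product((0, 1), repeat=3)
)
# Marginal independence would need p(x, y) = p(x) p(y) for every x and y.
marginally_independent = all(
    p_xy[x, y] == p_x[x] * p_y[y] for x, y in product((0, 1), repeat=2)
)

print(conditionally_independent, marginally_independent)  # True False
```

With tables in which $Z$ ignores $X$ (identical rows in `p_z_given_x`), the second value would become `True`, which is why the text says "in the general case".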

## Head-head

The last case is called *head-head* and is a little bit tricky: both edges point into $Z$, that is $X \rightarrow Z \leftarrow Y$.

We can again write out the joint distribution

$p(X, Y, Z) = p(X) p(Y) p(Z | X, Y),$

but this does not immediately help us when we try to condition on $Z$. We would want

$p(X, Y | Z) = \frac{p(X, Y, Z)}{p(Z)} \stackrel{?}{=} p(X|Z) p(Y|Z)$

which does not hold in general. For example, consider independent $X, Y \sim \mathrm{Bernoulli}(0.5)$, with $Z = 1$ if $X = Y$ and $Z = 0$ otherwise. In this case, if we know $Z$ and observe $X$, it immediately tells us the value of $Y$, hence $X$ and $Y$ are not conditionally independent given $Z$.

We can, however, do a little trick and write $p(X, Y)$ as a marginalization over $Z$, that is

$p(X, Y) = \sum_Z p(X, Y, Z) = \sum_Z p(X) p(Y) p(Z | X, Y) = p(X) p(Y)$

since $\sum_Z p(Z | X, Y) = 1$. As a result, in the head-head case we have marginal independence between $X$ and $Y$, that is $X \bigci Y$.
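The Bernoulli example from this section is small enough to enumerate directly. The following sketch builds its joint distribution and compares both sides of the marginal and the conditional factorization for $X = 1$, $Y = 1$, $Z = 1$; the helper `prob` is just an illustrative name.

```python
from fractions import Fraction as F
from itertools import product

# X, Y ~ Bernoulli(0.5) independently; Z = 1 if X == Y, else 0.
# Keys are (x, y, z); only the four consistent assignments have non-zero mass.
joint = {(x, y, int(x == y)): F(1, 4) for x, y in product((0, 1), repeat=2)}

def prob(x=None, y=None, z=None):
    """Probability of the event that fixes whichever of x, y, z are given."""
    query = (x, y, z)
    return sum(
        p for values, p in joint.items()
        if all(q is None or q == v for q, v in zip(query, values))
    )

# Marginally: p(X=1, Y=1) equals p(X=1) p(Y=1).
marginal_lhs = prob(x=1, y=1)
marginal_rhs = prob(x=1) * prob(y=1)

# Given Z=1: p(X=1, Y=1 | Z=1) differs from p(X=1 | Z=1) p(Y=1 | Z=1).
cond_lhs = prob(x=1, y=1, z=1) / prob(z=1)
cond_rhs = (prob(x=1, z=1) / prob(z=1)) * (prob(y=1, z=1) / prob(z=1))

print(marginal_lhs, marginal_rhs)  # 1/4 1/4
print(cond_lhs, cond_rhs)          # 1/2 1/4
```

Observing $Z$ couples $X$ and $Y$ even though they were generated independently, which is exactly the head-head behaviour described above.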

## D-separation

Having shown the three cases, we can finally define d-separation. Let $G$ be a DAG, and let $A, B, C$ be disjoint subsets of vertices.

A path between two vertices (ignoring edge directions) is **blocked** with respect to $C$ if it passes through a vertex $v$ such that either:

- the edges of the path meet head-tail or tail-tail at $v$, and $v \in C$, or
- the edges of the path meet head-head at $v$, and neither $v$ nor any of its descendants is in $C$.

We say that $A$ and $B$ are **d-separated** by $C$ if all paths from a vertex of $A$ to a vertex of $B$ are blocked with respect to $C$. And now comes the important part: **if $A$ and $B$ are d-separated by $C$, then $A \bigci B\ |\ C$** in every distribution that factorizes according to $G$.
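The definition translates almost literally into code. Below is a minimal sketch, not an efficient algorithm: it enumerates every simple path, which is only practical for small graphs. The function names and the edge-list representation (pairs `(parent, child)`) are assumptions made for this illustration.

```python
from itertools import product

def descendants(children, v):
    """All vertices reachable from v along directed edges (excluding v itself)."""
    seen, stack = set(), list(children.get(v, ()))
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack.extend(children.get(u, ()))
    return seen

def simple_paths(neighbors, start, goal, path=None):
    """Yield every simple path from start to goal, ignoring edge directions."""
    path = [start] if path is None else path
    if start == goal:
        yield list(path)
        return
    for n in neighbors.get(start, ()):
        if n not in path:
            path.append(n)
            yield from simple_paths(neighbors, n, goal, path)
            path.pop()

def is_blocked(path, edges, children, C):
    """Apply the two blocking rules to every interior vertex of the path."""
    for prev, v, nxt in zip(path, path[1:], path[2:]):
        head_head = (prev, v) in edges and (nxt, v) in edges
        if head_head:
            # Blocks only if neither v nor any of its descendants is observed.
            if v not in C and not (descendants(children, v) & C):
                return True
        elif v in C:
            # Head-tail or tail-tail vertex that is observed.
            return True
    return False

def d_separated(edges, A, B, C):
    """True if every path between a vertex of A and a vertex of B is blocked by C."""
    edges, A, B, C = set(edges), set(A), set(B), set(C)
    children, neighbors = {}, {}
    for u, v in edges:
        children.setdefault(u, set()).add(v)
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    return all(
        is_blocked(p, edges, children, C)
        for a, b in product(A, B)
        for p in simple_paths(neighbors, a, b)
    )

# The three basic cases from this article.
print(d_separated([("Z", "X"), ("Z", "Y")], {"X"}, {"Y"}, {"Z"}))  # tail-tail: True
print(d_separated([("X", "Z"), ("Z", "Y")], {"X"}, {"Y"}, {"Z"}))  # head-tail: True
print(d_separated([("X", "Z"), ("Y", "Z")], {"X"}, {"Y"}, {"Z"}))  # head-head: False
```

For larger graphs a dedicated library routine would be a better choice, since the number of simple paths can grow exponentially with the size of the graph.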

This might all look very complicated, but this property of directed graphical models is actually extremely useful, and checking it becomes quick after seeing just a few examples.

## Examples

To get a feel for d-separation, let us look at the following example ($B$ is observed).

We can immediately see that $A \bigci D | B$ since this is the *head-tail* case. We can also see that $A \not{\bigci} E | B$ (not conditionally independent), because while the path through $B$ is blocked, the path through $C$ is not.
