## What is the relationship between Markov Random Fields and Conditional Random Fields?


From "Neural networks [3.8] : Conditional random fields - Markov network" by Hugo Larochelle, it seems to me that a Markov Random Field is a special case of a CRF.

However, the Wikipedia article "Markov random field" says:

One notable variant of a Markov random field is a conditional random field, in which each random variable may also be conditioned upon a set of global observations o.

This would mean that CRFs are a special case of MRFs.

## Definitions

### Markov Random Field

Again, according to Wikipedia:

Given an undirected graph $G=(V,E)$, a set of random variables $X = (X_v)_{v\in V}$ indexed by $V$ form a Markov random field with respect to $G$ if they satisfy the local Markov properties:

Pairwise Markov property: Any two non-adjacent variables are conditionally independent given all other variables: $X_u \perp\!\!\!\perp X_v \mid X_{V \setminus \{u,v\}} \quad \text{if } \{u,v\} \notin E$

Local Markov property: A variable is conditionally independent of all other variables given its neighbors: $X_v \perp\!\!\!\perp X_{V\setminus \operatorname{cl}(v)} \mid X_{\operatorname{ne}(v)}$ where $\operatorname{ne}(v)$ is the set of neighbors of $v$, and $\operatorname{cl}(v) = \{v\} \cup \operatorname{ne}(v)$ is the closed neighbourhood of $v$.

Global Markov property: Any two subsets of variables are conditionally independent given a separating subset: $X_A \perp\!\!\!\perp X_B \mid X_S$ where every path from a node in $A$ to a node in $B$ passes through $S$.
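To make the pairwise property concrete, here is a minimal sketch that checks it numerically on a three-node chain $A - B - C$, where $A$ and $C$ are non-adjacent and should therefore be conditionally independent given $B$. The potential tables and the function names (`chain_joint`, `a_indep_c_given_b`) are made up for illustration and do not come from any of the quoted sources.

```python
from itertools import product

# Hypothetical pairwise potentials on the chain A - B - C (binary variables).
# The values are arbitrary positive numbers chosen for illustration.
PSI_AB = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 0.5, (1, 1): 3.0}
PSI_BC = {(0, 0): 1.5, (0, 1): 0.2, (1, 0): 1.0, (1, 1): 4.0}


def chain_joint(psi_ac=None):
    """Normalized joint p(a, b, c); psi_ac optionally adds an extra A - C edge."""
    weights = {}
    for a, b, c in product((0, 1), repeat=3):
        w = PSI_AB[a, b] * PSI_BC[b, c]
        if psi_ac is not None:
            w *= psi_ac[a, c]
        weights[a, b, c] = w
    z = sum(weights.values())  # partition function
    return {k: w / z for k, w in weights.items()}


def a_indep_c_given_b(p, tol=1e-9):
    """Check p(a, c | b) == p(a | b) * p(c | b) for all values of a, b, c."""
    for b in (0, 1):
        p_b = sum(p[a, b, c] for a in (0, 1) for c in (0, 1))
        for a, c in product((0, 1), repeat=2):
            p_ab = sum(p[a, b, cc] for cc in (0, 1))
            p_bc = sum(p[aa, b, c] for aa in (0, 1))
            # Equivalent form without division: p(a,b,c) p(b) == p(a,b) p(b,c)
            if abs(p[a, b, c] * p_b - p_ab * p_bc) > tol:
                return False
    return True


# Chain graph: A and C are not adjacent.
print(a_indep_c_given_b(chain_joint()))

# Same model with an added A - C potential: the graph is now a triangle.
extra_edge = {(0, 0): 3.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0}
print(a_indep_c_given_b(chain_joint(extra_edge)))
```

With only the two chain potentials the check should succeed, and once the extra $A - C$ potential is added it should fail, which matches the requirement that the property holds only for pairs $\{u,v\} \notin E$.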

Please leave a comment if you know a citable source that gives a good definition.

### Conditional Random Fields

According to Wikipedia:

Lafferty, McCallum and Pereira define a CRF on observations $\boldsymbol{X}$ and random variables $\boldsymbol{Y}$ as follows:

Let $G = (V, E)$ be a graph such that $\boldsymbol{Y} = (\boldsymbol{Y}_v)_{v\in V}$, so that $\boldsymbol{Y}$ is indexed by the vertices of $G$. Then $(\boldsymbol{X}, \boldsymbol{Y})$ is a conditional random field when the random variables $\boldsymbol{Y}_v$, conditioned on $\boldsymbol{X}$, obey the Markov property with respect to the graph: $$p(\boldsymbol{Y}_v |\boldsymbol{X}, \boldsymbol{Y}_w, w \neq v) = p(\boldsymbol{Y}_v |\boldsymbol{X}, \boldsymbol{Y}_w, w \sim v)$$ where $w \sim v$ means that $w$ and $v$ are neighbors in $G$.

What this means is that a CRF is an undirected graphical model whose nodes can be divided into exactly two disjoint sets $\boldsymbol{X}$ and $\boldsymbol{Y}$, the observed and output variables, respectively; the conditional distribution $p(\boldsymbol{Y}|\boldsymbol{X})$ is then modeled.
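As a rough illustration of "the conditional distribution $p(\boldsymbol{Y}|\boldsymbol{X})$ is then modeled", the sketch below builds a tiny chain CRF by brute-force enumeration. The potentials (`node_potential`, `edge_potential`) and their weights are hypothetical choices, not part of the Lafferty et al. definition; the point is only that normalization runs over the labels $\boldsymbol{Y}$ alone, so the partition function depends on the observed $\boldsymbol{X}$.

```python
from itertools import product
from math import exp

LABELS = (0, 1)


def node_potential(y, x_i):
    """Hypothetical unary potential: label 1 is favoured when x_i is positive."""
    return exp(x_i if y == 1 else -x_i)


def edge_potential(y_i, y_j):
    """Hypothetical pairwise potential favouring equal neighbouring labels."""
    return exp(0.5 if y_i == y_j else -0.5)


def crf_conditional(x):
    """Return p(y | x) for a chain with one label node per entry of x."""
    scores = {}
    for y in product(LABELS, repeat=len(x)):
        s = 1.0
        for i, x_i in enumerate(x):
            s *= node_potential(y[i], x_i)
        for i in range(len(x) - 1):
            s *= edge_potential(y[i], y[i + 1])
        scores[y] = s
    z_x = sum(scores.values())  # Z(x): sums over y only, never over x
    return {y: s / z_x for y, s in scores.items()}


p = crf_conditional((2.0, 2.0))
print(max(p, key=p.get))
```

Nothing in this sketch assigns a probability to `x` itself. Enumeration is exponential in the chain length and is used here only to keep the example short.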

## Question

What is the relationship between Markov Random Fields and Conditional Random Fields?

– Martin Thoma – 2016-01-08T21:52:14.980


Conditional Random Fields (CRFs) are a special case of Markov Random Fields (MRFs).

1.5.4 Conditional Random Field

A Conditional Random Field (CRF) is a form of MRF that defines a posterior for variables x given data z, as with the hidden MRF above. Unlike the hidden MRF, however, the factorization into the data distribution P(x|z) and the prior P(x) is not made explicit. This allows complex dependencies of x on z to be written directly in the posterior distribution, without the factorization being made explicit. (Given P(x|z), such factorizations always exist, however—infinitely many of them, in fact—so there is no suggestion that the CRF is more general than the hidden MRF, only that it may be more convenient to deal with.)

Source: Blake, Kohli and Rother: Markov random fields for vision and image processing. 2011.

A conditional random field or CRF (Lafferty et al. 2001), sometimes a discriminative random field (Kumar and Hebert 2003), is just a version of an MRF where all the clique potentials are conditioned on input features: [...]

The advantage of a CRF over an MRF is analogous to the advantage of a discriminative classifier over a generative classifier (see Section 8.6), namely, we don’t need to “waste resources” modeling things that we always observe. [...]

The disadvantage of CRFs over MRFs is that they require labeled training data, and they are slower to train [...]

Source: Kevin P. Murphy: Machine Learning: A Probabilistic Perspective
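The remark about not needing to "waste resources" on observed variables can be checked on a toy model. The sketch below is only an illustration under assumed potentials (`PHI_XY`, `PHI_YY` and the two `phi_x` tables are invented): it builds a joint MRF over one observed variable $x$ and two labels $y_1, y_2$, conditions it on $x$, and compares the result with a conditional model that is normalized per value of $x$ and never uses the potential on $x$ alone.

```python
from itertools import product

VALUES = (0, 1)

# Hypothetical potentials; the numbers are arbitrary positive values.
PHI_XY = {(0, 0): 2.0, (0, 1): 0.5, (1, 0): 0.3, (1, 1): 1.7}  # phi(x, y_i)
PHI_YY = {(0, 0): 1.2, (0, 1): 0.4, (1, 0): 0.4, (1, 1): 1.2}  # phi(y1, y2)


def joint_mrf(phi_x):
    """Joint p(x, y1, y2) of an MRF with cliques {x}, {x,y1}, {x,y2}, {y1,y2}."""
    w = {(x, y1, y2): phi_x[x] * PHI_XY[x, y1] * PHI_XY[x, y2] * PHI_YY[y1, y2]
         for x, y1, y2 in product(VALUES, repeat=3)}
    z = sum(w.values())
    return {k: v / z for k, v in w.items()}


def condition_on_x(p, x):
    """p(y1, y2 | x) computed from the joint distribution."""
    p_x = sum(p[x, y1, y2] for y1, y2 in product(VALUES, repeat=2))
    return {(y1, y2): p[x, y1, y2] / p_x
            for y1, y2 in product(VALUES, repeat=2)}


def crf_direct(x):
    """p(y1, y2 | x) defined directly, without any model of x."""
    w = {(y1, y2): PHI_XY[x, y1] * PHI_XY[x, y2] * PHI_YY[y1, y2]
         for y1, y2 in product(VALUES, repeat=2)}
    z_x = sum(w.values())  # per-input partition function
    return {k: v / z_x for k, v in w.items()}


def max_abs_diff(p, q):
    """Largest absolute difference between two distributions on the same keys."""
    return max(abs(p[k] - q[k]) for k in p)


# Two very different "priors" on x lead to the same conditional over y.
for phi_x in ({0: 1.0, 1: 1.0}, {0: 9.0, 1: 0.1}):
    joint = joint_mrf(phi_x)
    print(max(max_abs_diff(condition_on_x(joint, x), crf_direct(x))
              for x in VALUES))
```

Both printed differences should be at floating-point noise level: whatever potential is placed on $x$ alone cancels when conditioning, so the conditional model never has to represent it.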