What is the difference between a (dynamic) Bayes network and a HMM?



I have read that HMMs, particle filters and Kalman filters are special cases of dynamic Bayes networks. However, I only know HMMs, and I don't see how they differ from dynamic Bayes networks.

Could somebody please explain?

It would be nice if your answer could be similar to the following, but for Bayes networks:

Hidden Markov Models

A Hidden Markov Model (HMM) is a 5-tuple $\lambda = (S, O, A, B, \Pi)$:

  • $S \neq \emptyset$: A set of states (e.g. "beginning of phoneme", "middle of phoneme", "end of phoneme")
  • $O \neq \emptyset$: A set of possible observations (audio signals)
  • $A \in \mathbb{R}^{|S| \times |S|}$: A stochastic matrix whose entry $a_{ij}$ is the probability of moving from state $i$ to state $j$.
  • $B \in \mathbb{R}^{|S| \times |O|}$: A stochastic matrix whose entry $b_{kl}$ is the probability of observing $l$ while in state $k$.
  • $\Pi \in \mathbb{R}^{|S|}$: The initial distribution over the states.
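
As a concrete illustration of the tuple above, here is a minimal Python sketch. The state names, observation names and all probabilities are made up for illustration, and `is_valid_hmm` is a hypothetical helper, not part of any library. It only checks that the shapes match and that $\Pi$ and every row of $A$ and $B$ are probability distributions.

```python
import math

# Hypothetical toy HMM; all names and numbers are invented for illustration.
S = ["begin", "middle", "end"]   # states
O = ["low", "high"]              # possible observations
A = [[0.6, 0.4, 0.0],            # A[i][j] = P(next state j | current state i)
     [0.0, 0.7, 0.3],            # a zero entry means "no edge" in the topology
     [0.0, 0.0, 1.0]]
B = [[0.9, 0.1],                 # B[k][l] = P(observation l | state k)
     [0.5, 0.5],
     [0.2, 0.8]]
Pi = [1.0, 0.0, 0.0]             # Pi[i] = P(first state is i)


def is_distribution(p):
    """True if p is non-negative and sums to 1 (up to rounding)."""
    return all(x >= 0 for x in p) and math.isclose(sum(p), 1.0)


def is_valid_hmm(S, O, A, B, Pi):
    """Check the shapes and the stochasticity of A, B and Pi."""
    n, m = len(S), len(O)
    return (n > 0 and m > 0
            and len(A) == n and all(len(row) == n for row in A)
            and len(B) == n and all(len(row) == m for row in B)
            and len(Pi) == n
            and all(is_distribution(row) for row in A)
            and all(is_distribution(row) for row in B)
            and is_distribution(Pi))
```

Calling `is_valid_hmm(S, O, A, B, Pi)` on the values above should succeed, while a matrix with a row that does not sum to 1 would be rejected.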

It is usually displayed as a directed graph, where each node corresponds to one state $s \in S$ and the transition probabilities are denoted on the edges.

Hidden Markov Models are called "hidden", because the current state is hidden. The algorithms have to guess it from the observations and the model itself. They are called "Markov", because for the next state only the current state matters.

For HMMs, you fix a topology (number of states, possible edges). Then there are three possible tasks:

  • Evaluation: given a HMM $\lambda$, how likely is it to get observations $o_1, \dots, o_t$? (Forward algorithm)
  • Decoding: given a HMM $\lambda$ and observations $o_1, \dots, o_t$, what is the most likely sequence of states $s_1, \dots, s_t$? (Viterbi algorithm)
  • Learning: estimate $A, B, \Pi$ from observations (Baum-Welch algorithm, which is a special case of expectation maximization).
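
The evaluation and decoding tasks can be sketched in a few lines of plain Python. This is a toy version under stated assumptions: states and observations are integer indices, the two-state model and its numbers are invented, and probabilities are multiplied directly, so long sequences would underflow (practical implementations usually rescale or work in log space). Baum-Welch is left out.

```python
# Hypothetical 2-state, 2-symbol HMM; the numbers are made up.
A = [[0.7, 0.3], [0.4, 0.6]]   # A[i][j] = P(state j at t+1 | state i at t)
B = [[0.9, 0.1], [0.2, 0.8]]   # B[k][l] = P(observation l | state k)
Pi = [0.5, 0.5]                # initial state distribution


def forward(A, B, Pi, obs):
    """Evaluation: P(o_1, ..., o_t) via the forward recursion."""
    n = len(Pi)
    # alpha[i] = P(o_1..o_k, state_k = i)
    alpha = [Pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)


def viterbi(A, B, Pi, obs):
    """Decoding: one most likely state sequence for the observations."""
    n = len(Pi)
    # delta[i] = probability of the best path that ends in state i
    delta = [Pi[i] * B[i][obs[0]] for i in range(n)]
    backpointers = []
    for o in obs[1:]:
        prev = delta
        best_prev = [max(range(n), key=lambda i: prev[i] * A[i][j])
                     for j in range(n)]
        delta = [prev[best_prev[j]] * A[best_prev[j]][j] * B[j][o]
                 for j in range(n)]
        backpointers.append(best_prev)
    # Follow the backpointers from the best final state.
    state = max(range(n), key=lambda i: delta[i])
    path = [state]
    for best_prev in reversed(backpointers):
        state = best_prev[state]
        path.append(state)
    return path[::-1]
```

A quick sanity check for `forward` is that the likelihoods of all observation sequences of a fixed length sum to 1. The only difference between the two functions is that `forward` sums over predecessor states, whereas `viterbi` takes the maximum and remembers which predecessor achieved it.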

Bayes networks

Bayes networks are directed acyclic graphs (DAGs) $G = (\mathcal{X}, \mathcal{E})$. The nodes represent random variables $X \in \mathcal{X}$. For every $X$, there is a probability distribution which is conditioned on the parents of $X$:

$$P(X \mid \text{parents}(X))$$
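
Taken together, these local distributions define the joint distribution over all variables as a product (the chain rule for Bayes networks). Writing the variables as $X_1, \dots, X_n$, which is notation introduced here for illustration:

$$P(X_1, \dots, X_n) = \prod_{i=1}^{n} P(X_i \mid \text{parents}(X_i))$$

For example, in a hypothetical three-node network with edges $X_1 \to X_3$ and $X_2 \to X_3$, this gives $P(X_1, X_2, X_3) = P(X_1) \, P(X_2) \, P(X_3 \mid X_1, X_2)$.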


There seem to be (please clarify) two tasks:

  • Inference: Given some variables, get the most likely values of the other variables. Exact inference is NP-hard in general; for approximate inference, you can use MCMC.
  • Learning: How you learn those distributions depends on the exact problem (source):

    • known structure, fully observable: maximum likelihood estimation (MLE)
    • known structure, partially observable: Expectation Maximization (EM) or Markov Chain Monte Carlo (MCMC)
    • unknown structure, fully observable: search through model space
    • unknown structure, partially observable: EM + search through model space
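
To make the two tasks more tangible, here is a small sketch in plain Python. Everything in it is an assumption made for illustration: the three binary variables, the network Rain → WetGrass ← Sprinkler, the numbers, and the helper names `joint`, `posterior` and `mle_cpt`. Inference is done by brute-force enumeration, which is exact but exponential in the number of hidden variables, and learning covers only the simplest case (known structure, fully observable), where the maximum likelihood estimate is a relative frequency.

```python
from itertools import product

# Hypothetical network: Rain -> WetGrass <- Sprinkler, all variables binary.
parents = {"Rain": (), "Sprinkler": (), "WetGrass": ("Rain", "Sprinkler")}
# cpt[X][values of parents(X)] = P(X = True | parents(X)); numbers are made up.
cpt = {
    "Rain": {(): 0.2},
    "Sprinkler": {(): 0.1},
    "WetGrass": {(False, False): 0.0, (False, True): 0.9,
                 (True, False): 0.8, (True, True): 0.99},
}


def joint(assignment):
    """P(assignment) as the product of P(X | parents(X)) over all nodes."""
    p = 1.0
    for var, pa in parents.items():
        p_true = cpt[var][tuple(assignment[q] for q in pa)]
        p *= p_true if assignment[var] else 1.0 - p_true
    return p


def posterior(query, evidence):
    """P(query = True | evidence), summing the joint over hidden variables."""
    hidden = [v for v in parents if v != query and v not in evidence]
    weight = {True: 0.0, False: 0.0}
    for q in (True, False):
        for values in product((True, False), repeat=len(hidden)):
            assignment = {**evidence, query: q, **dict(zip(hidden, values))}
            weight[q] += joint(assignment)
    return weight[True] / (weight[True] + weight[False])


def mle_cpt(var, data):
    """MLE of P(var = True | parents) from fully observed samples."""
    counts = {}
    for row in data:
        key = tuple(row[q] for q in parents[var])
        n_true, n_all = counts.get(key, (0, 0))
        counts[key] = (n_true + row[var], n_all + 1)
    return {key: n_true / n_all for key, (n_true, n_all) in counts.items()}
```

For instance, `posterior("Rain", {"WetGrass": True})` asks how likely rain is given that the grass is wet, and `mle_cpt("Rain", data)` estimates the prior of `Rain` from a list of fully observed samples given as dicts.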

Dynamic Bayes networks

I guess dynamic Bayes networks (DBNs) are also directed probabilistic graphical models. The variability seems to come from the network changing over time. However, it seems to me that this is equivalent to simply copying the same network and connecting each node at time $t$ with the corresponding node at time $t+1$. Is that the case?
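
The copying idea can be sketched as follows. This is only an illustration under assumptions: the function `unroll` and the node names are hypothetical, and the sketch handles the graph structure only, not the probability tables. A template lists the edges inside one time slice and the edges from slice $t$ to slice $t+1$; unrolling it for $T$ steps yields an ordinary Bayes network.

```python
def unroll(intra_edges, inter_edges, T):
    """Unroll a two-slice template into a Bayes network over T time slices.

    Nodes of the unrolled network are (name, t) pairs.
    intra_edges: edges (u, v) inside a slice, i.e. (u, t) -> (v, t)
    inter_edges: edges (u, v) across slices, i.e. (u, t) -> (v, t + 1)
    """
    edges = []
    for t in range(T):
        edges += [((u, t), (v, t)) for u, v in intra_edges]
        if t + 1 < T:
            edges += [((u, t), (v, t + 1)) for u, v in inter_edges]
    return edges


# An HMM as a template: one hidden state that emits one observation per
# slice; only the hidden state is chained over time.
hmm_edges = unroll(intra_edges=[("state", "obs")],
                   inter_edges=[("state", "state")],
                   T=3)

# A hypothetical richer template with two hidden variables per slice.
tracker_edges = unroll(
    intra_edges=[("position", "measurement")],
    inter_edges=[("position", "position"),
                 ("velocity", "position"),
                 ("velocity", "velocity")],
    T=3)
```

In the HMM template the observation node has no edge into the next slice, so not every node needs to be connected over time.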

Martin Thoma

Posted 2016-01-27T19:58:00.083

Reputation: 15 590

I asked someone about this and they said: "HMMs are just special cases of dynamic Bayes nets, with each time slice containing one latent variable, dependent on the previous one to give a Markov chain, and one observation dependent on each latent variable. DBNs can have any structure that evolves over time." – ashley – 2017-11-27T22:12:34.983


  • You can also learn the topology of an HMM.
  • When doing inference with BNs, besides asking for maximum likelihood estimates, you can also sample from the distributions, estimate the probabilities, or do whatever else probability theory lets you.
  • A DBN is just a BN copied over time, with some (not necessarily all) nodes chained from past to the future.
  • In this sense, a HMM is a simple DBN with just two nodes in each time-slice and one of the nodes chained over time. – KT. – 2016-02-03T09:25:34.987



    The following answer by @jerad comes from a similar Cross Validated question:

    HMMs are not equivalent to DBNs, rather they are a special case of DBNs in which the entire state of the world is represented by a single hidden state variable. Other models within the DBN framework generalize the basic HMM, allowing for more hidden state variables (see the second paper above for the many varieties).

    Finally, no, DBNs are not always discrete. For example, linear Gaussian state models (Kalman Filters) can be conceived of as continuous valued HMMs, often used to track objects in space.

    I'd recommend looking through these two excellent review papers:


    Posted 2016-01-27T19:58:00.083

    Reputation: 338