Reading Teh’s Slides on Dirichlet Processes

Gaussian process (GP): Each draw G from a GP is a function over the real axis. Any finite-dimensional marginal of G, i.e. its values (G(x_1), …, G(x_n)) at finitely many points, is jointly Gaussian distributed.
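Written out, the finite-marginal statement looks roughly as follows. The mean function m and the covariance function k are names assumed here for illustration; the notes do not name them.

```latex
\bigl(G(x_1), \dots, G(x_n)\bigr) \sim \mathcal{N}(\mu, \Sigma),
\qquad \mu_i = m(x_i), \quad \Sigma_{ij} = k(x_i, x_j).
```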

Finite mixture model: \pi ~ Dir(\alpha/K, …, \alpha/K). Here \alpha is the total pseudo-count, i.e. the number of observations the prior is worth; dividing by K spreads it evenly over the K slots, and (1/K, …, 1/K) is the base distribution.
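A minimal sketch of drawing the mixing weights from this symmetric prior, using only the Python standard library. The values alpha = 2.0 and K = 4 and the helper names are assumptions chosen for illustration; the Dirichlet draw uses the standard construction of normalizing independent Gamma variables.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw from Dir(alphas) by normalizing independent Gamma(alpha_i, 1) draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]

def sample_symmetric_weights(alpha, K, rng):
    """Mixing weights pi ~ Dir(alpha/K, ..., alpha/K)."""
    return sample_dirichlet([alpha / K] * K, rng)

rng = random.Random(0)
alpha, K = 2.0, 4  # illustrative values, not from the slides
draws = [sample_symmetric_weights(alpha, K, rng) for _ in range(20000)]

# The average of each weight should be close to the base distribution 1/K.
mean_pi = [sum(d[k] for d in draws) / len(draws) for k in range(K)]
```

Each draw sums to one, and the entries of mean_pi should sit near 1/K, which matches reading (1/K, …, 1/K) as the base distribution.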

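Agglomerative property of Dirichlet distributions: imagine the slots drawn as areas; merging two slots sums their areas. If (\pi_1, …, \pi_K) ~ Dir(\alpha_1, …, \alpha_K), then (\pi_1 + \pi_2, \pi_3, …, \pi_K) ~ Dir(\alpha_1 + \alpha_2, \alpha_3, …, \alpha_K).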
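The merging rule can be checked numerically. The sketch below, with illustrative parameters (1, 2, 3), merges the first two slots of a three-slot Dirichlet draw and compares the mean and variance of the merged weight against Beta(\alpha_1 + \alpha_2, \alpha_3), which is what the agglomerative property predicts for that component.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw from Dir(alphas) by normalizing independent Gamma(alpha_i, 1) draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]

rng = random.Random(1)
alphas = [1.0, 2.0, 3.0]  # illustrative parameters
n = 20000

# Merge the first two slots of each draw.
merged = []
for _ in range(n):
    pi = sample_dirichlet(alphas, rng)
    merged.append(pi[0] + pi[1])

# Predicted distribution of the merged slot: Beta(a, b).
a, b = alphas[0] + alphas[1], alphas[2]
expected_mean = a / (a + b)
expected_var = a * b / ((a + b) ** 2 * (a + b + 1))

empirical_mean = sum(merged) / n
empirical_var = sum((m - empirical_mean) ** 2 for m in merged) / n
```

Matching the first two moments does not prove equality in distribution, but it is a quick sanity check of the rule.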

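Decimative property of Dirichlet distributions: splitting a slot with parameter \alpha_1 by (\beta, 1-\beta) divides its area in the same ratio, giving two slots with parameters \alpha_1\beta and \alpha_1(1-\beta). The probability mass \pi_1 itself is split by a random ratio (\tau, 1-\tau) ~ Dir(\alpha_1\beta, \alpha_1(1-\beta)), whose mean is (\beta, 1-\beta).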
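A small simulation of the splitting rule, under assumed illustrative values \alpha = (2, 3) and \beta = 0.25. It splits the first slot's mass with an independent Dirichlet-distributed ratio and checks the first piece against Beta(\alpha_1\beta, \alpha_1(1-\beta) + \alpha_2), the marginal implied by Dir(\alpha_1\beta, \alpha_1(1-\beta), \alpha_2).

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw from Dir(alphas) by normalizing independent Gamma(alpha_i, 1) draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]

rng = random.Random(2)
alphas = [2.0, 3.0]  # illustrative parameters
beta = 0.25          # illustrative split ratio
n = 20000

first_piece = []
for _ in range(n):
    pi = sample_dirichlet(alphas, rng)
    # Random split ratio for slot 1, independent of pi.
    tau = sample_dirichlet([alphas[0] * beta, alphas[0] * (1 - beta)], rng)
    split = [pi[0] * tau[0], pi[0] * tau[1], pi[1]]
    first_piece.append(split[0])

# Predicted marginal of the first piece: Beta(a, b).
a = alphas[0] * beta
b = sum(alphas) - a
expected_mean = a / (a + b)
expected_var = a * b / ((a + b) ** 2 * (a + b + 1))

empirical_mean = sum(first_piece) / n
empirical_var = sum((x - empirical_mean) ** 2 for x in first_piece) / n
```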

Conjugacy between Dirichlet and Multinomial: if \pi ~ Dir(\alpha) and z | \pi ~ Disc(\pi), then marginally z ~ Disc(….). Why? (P38/80)
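One way to approach the "why" is to integrate \pi out directly. The following is the standard textbook calculation, not a transcription of the slide: the marginal probability of z = k is the prior mean of \pi_k, and multiplying the prior density by the likelihood \pi_k adds one to the k-th parameter.

```latex
\begin{align*}
P(z = k) &= \int P(z = k \mid \pi)\, p(\pi)\, d\pi
          = \mathbb{E}[\pi_k]
          = \frac{\alpha_k}{\sum_{j} \alpha_j}, \\
p(\pi \mid z = k) &\propto \pi_k \prod_{j} \pi_j^{\alpha_j - 1}
  \;\Longrightarrow\;
  \pi \mid z = k \sim \mathrm{Dir}(\alpha_1, \dots, \alpha_k + 1, \dots, \alpha_K).
\end{align*}
```

The posterior is again a Dirichlet distribution, which is what conjugacy means here.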