Gaussian process (GP): Each draw G from a GP is a function over the real axis. The values of G at any finite set of input points are jointly Gaussian distributed.
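A minimal sketch of the finite-marginal statement, assuming a zero mean and a squared-exponential kernel (both illustrative choices, not taken from the notes; `rbf_kernel` is a hypothetical helper). Restricted to five input points, the process reduces to an ordinary five-dimensional Gaussian, so the empirical covariance of many draws should match the kernel matrix.

```python
import numpy as np


def rbf_kernel(x, y, length_scale=1.0):
    # Squared-exponential covariance; an assumed kernel, chosen only for illustration.
    diff = x[:, None] - y[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 5)                    # finite set of input points
cov = rbf_kernel(x, x) + 1e-9 * np.eye(len(x))   # small jitter for numerical stability
mean = np.zeros(len(x))                          # assumed zero mean function

# Each row is one draw of G evaluated at the five points in x.
draws = rng.multivariate_normal(mean, cov, size=20000)
emp_cov = np.cov(draws, rowvar=False)

print(np.round(emp_cov, 2))
print(np.round(cov, 2))
```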
Finite mixture model: \pi ~ Dir(\alpha/K, …, \alpha/K). \alpha is the total number of pseudo-observations in the prior, dividing by K spreads them evenly over the K slots, and (1/K, …, 1/K) is the base distribution.
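A small sketch of this prior, assuming \alpha = 2 and K = 4 (arbitrary values picked for illustration). The average of many draws of \pi should be close to the base distribution (1/K, …, 1/K), and one draw of \pi can then be used to sample a component label z.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, K = 2.0, 4                       # assumed values, for illustration only

# Symmetric Dirichlet prior over mixing weights: Dir(alpha/K, ..., alpha/K).
pis = rng.dirichlet(np.full(K, alpha / K), size=50000)
mean_pi = pis.mean(axis=0)              # should be close to the base distribution 1/K

# One component label drawn from the first sampled weight vector.
z = rng.choice(K, p=pis[0])

print(np.round(mean_pi, 3))
```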
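Agglomerative property of Dirichlet distributions: imagine the slots as areas; merging two slots sums their areas. If (\pi_1, …, \pi_K) ~ Dir(\alpha_1, …, \alpha_K), then (\pi_1 + \pi_2, \pi_3, …, \pi_K) ~ Dir(\alpha_1 + \alpha_2, \alpha_3, …, \alpha_K).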
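A Monte Carlo sanity check of the merging rule, assuming parameters (1, 2, 3, 4) chosen only for illustration. Merging the first two slots of each draw should give samples whose mean and variance match those of Dir(3, 3, 4); matching two moments is evidence for the property, not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = np.array([1.0, 2.0, 3.0, 4.0])          # assumed parameters
pis = rng.dirichlet(alpha, size=100000)

# Merge slots 1 and 2 of every draw by adding their masses.
merged = np.column_stack([pis[:, 0] + pis[:, 1], pis[:, 2], pis[:, 3]])

# Parameters predicted for the merged vector: the two parameters add up.
merged_alpha = np.array([alpha[0] + alpha[1], alpha[2], alpha[3]])
a0 = merged_alpha.sum()
expected_mean = merged_alpha / a0
expected_var = merged_alpha * (a0 - merged_alpha) / (a0 ** 2 * (a0 + 1.0))

print(np.round(merged.mean(axis=0), 3), np.round(expected_mean, 3))
print(np.round(merged.var(axis=0), 4), np.round(expected_var, 4))
```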
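Decimative property of Dirichlet distributions: splitting a slot in the ratio (\beta, 1-\beta) splits its parameter in the same ratio. If (\pi_1, …, \pi_K) ~ Dir(\alpha_1, …, \alpha_K) and, independently, \tau ~ Beta(\alpha_1\beta, \alpha_1(1-\beta)), then (\pi_1\tau, \pi_1(1-\tau), \pi_2, …, \pi_K) ~ Dir(\alpha_1\beta, \alpha_1(1-\beta), \alpha_2, …, \alpha_K).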
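A Monte Carlo sanity check of the splitting rule, assuming \pi ~ Dir(2, 3) and \beta = 0.25 (illustrative values). The first slot's mass is split by an independent \tau ~ Beta(\alpha_1\beta, \alpha_1(1-\beta)), and the result is compared against the first two moments of Dir(0.5, 1.5, 3).

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = np.array([2.0, 3.0])     # assumed parameters
beta = 0.25                      # assumed split ratio
n = 200000

pis = rng.dirichlet(alpha, size=n)
# Independent random split of the first slot.
tau = rng.beta(alpha[0] * beta, alpha[0] * (1.0 - beta), size=n)
split = np.column_stack([pis[:, 0] * tau, pis[:, 0] * (1.0 - tau), pis[:, 1]])

# Parameters predicted for the split vector: alpha_1 is divided in the ratio (beta, 1 - beta).
split_alpha = np.array([alpha[0] * beta, alpha[0] * (1.0 - beta), alpha[1]])
a0 = split_alpha.sum()
expected_mean = split_alpha / a0
expected_var = split_alpha * (a0 - split_alpha) / (a0 ** 2 * (a0 + 1.0))

print(np.round(split.mean(axis=0), 3), np.round(expected_mean, 3))
print(np.round(split.var(axis=0), 4), np.round(expected_var, 4))
```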
Conjugacy between Dirichlet and multinomial: if \pi ~ Dir(\alpha) and z ~ Disc(\pi), then z ~ Disc(….). Why? (P38/80)
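One way to see the conjugacy numerically is the standard posterior update: after observing z = k, the posterior over \pi is again Dirichlet, with \alpha_k increased by one. The sketch below assumes \alpha = (1, 2, 3) for illustration, draws (\pi, z) pairs jointly, keeps the pairs with z = k, and compares the mean of the retained \pi against the mean of Dir(\alpha + e_k). It checks only the posterior mean, so it is a consistency check rather than a derivation.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = np.array([1.0, 2.0, 3.0])    # assumed prior parameters
n = 200000

pis = rng.dirichlet(alpha, size=n)

# Draw z_i ~ Disc(pi_i) by inverting the cumulative distribution of each row.
u = rng.random(n)
z = (u[:, None] > np.cumsum(pis, axis=1)).sum(axis=1)
z = np.minimum(z, len(alpha) - 1)    # guard against rounding in the cumulative sum

k = 0                                # condition on observing component k
posterior_alpha = alpha.copy()
posterior_alpha[k] += 1.0            # conjugate update: add one count to slot k
expected_mean = posterior_alpha / posterior_alpha.sum()

# Empirical posterior mean: average pi over the draws where z == k.
empirical_mean = pis[z == k].mean(axis=0)

print(np.round(empirical_mean, 3), np.round(expected_mean, 3))
```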