In my previous post, I outlined the general derivation of confidence intervals for Gaussian parameters. This post looks specifically at the proof of the joint probability distribution of the Gaussian parameters on p.29 of the MIT Statistics lecture notes, no. 5.
An orthogonal transformation is represented by a square matrix whose rows are basis vectors of length 1. Because these basis vectors are perpendicular to each other, multiplying the matrix by a vector x projects x onto each basis vector.
This just re-expresses x in another axis system (defined by the rows of the matrix). Because the new and the old axis systems share the same origin, an orthogonal transformation does not change the length of x. This property is used in (5.0.2).
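As a quick numerical check of the two claims above, here is a minimal NumPy sketch. The matrix `V`, the vector `x`, and the way `V` is built (QR decomposition of a random matrix) are my own illustrative choices, not anything taken from the lecture notes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build an illustrative 3x3 orthogonal matrix: the Q factor of a QR
# decomposition of a random square matrix has orthonormal rows and columns.
V, _ = np.linalg.qr(rng.standard_normal((3, 3)))

x = np.array([1.0, 2.0, 3.0])

# Multiplying by V gives the coordinates of x in the new axis system.
y = V @ x

# Each new coordinate is the projection (dot product) of x onto one row of V.
projections = np.array([V[i] @ x for i in range(3)])

print("V V^T is the identity:", np.allclose(V @ V.T, np.eye(3)))
print("y equals the row projections:", np.allclose(y, projections))
print("length before:", np.linalg.norm(x))
print("length after: ", np.linalg.norm(y))
```

The two printed lengths should agree up to floating-point rounding, since |Vx|^2 = x^T V^T V x = x^T x.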
Imagine that we have a set of points (vectors) sampled from a Gaussian distribution whose covariance matrix is a multiple of the identity, that is, diagonal with equal variances. Since such a distribution has spherical contours, an orthogonal transformation of the contours (or equivalently, of the sample points) does not change their shape. In other words, the transformed random variable (or the transformed Gaussian distribution) is still a Gaussian with the same covariance matrix, whose variates are independent of each other. (Equal variances matter here: with a diagonal covariance whose entries differ, the contours are axis-aligned ellipsoids, and rotating them generally introduces correlation.) This property is used in the last paragraph of the proof.
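The invariance can also be checked empirically. The sketch below is only an illustration under my own assumptions: standard normal samples in 3 dimensions, a random orthogonal matrix `V` obtained from a QR decomposition, and an arbitrary unequal-variance matrix `D` used for contrast.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 3, 200_000

# Illustrative orthogonal matrix (Q factor of a random square matrix).
V, _ = np.linalg.qr(rng.standard_normal((d, d)))

# Each row of X is one sample from N(0, I): independent, unit-variance variates.
X = rng.standard_normal((n, d))

# Apply the orthogonal transformation to every sample: y = V x.
Y = X @ V.T

# The sample covariance of the transformed points should be close to I,
# matching the exact covariance V I V^T = I.
cov_Y = np.cov(Y, rowvar=False)
print(np.round(cov_Y, 2))

# Contrast: a diagonal covariance with unequal variances does not generally
# stay diagonal, because V D V^T has non-zero off-diagonal entries.
D = np.diag([1.0, 4.0, 9.0])
rotated = V @ D @ V.T
print(np.round(rotated, 2))
```

The first matrix should be approximately the identity, while the second will typically show clearly non-zero off-diagonal terms, which is why the spherical (equal-variance) case is the one where independence survives the transformation.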