Learning a Ranking Function Using Probit Regression

This SIGIR 2006 poster shows that learning a ranking function from pairwise preferences over a known set of document ids can be formalized as a specialized probit regression problem.  The idea comes from this 2003 paper, which additionally explained how to solve this specialized probit regression problem using existing statistical software.  (Though, in the case of Web page ranking, I guess no existing statistical software can be employed.)
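
To make the "existing statistical software" remark concrete, here is a minimal sketch (my own illustration, not code from the poster or the paper) of how such a pairwise-preference model can be fit as an ordinary probit regression: each judged pair becomes one row whose design vector is +1 for the first document and -1 for the second, so the fitted coefficients play the role of the relevance parameters \theta_i introduced below.  The document ids, the toy preference pairs, and the choice of Python/statsmodels are all assumptions for illustration.

import numpy as np
import statsmodels.api as sm

# Hypothetical document ids and judged pairs (d_{k,1}, d_{k,2}, y_k);
# the contradicting judgements keep the maximum-likelihood estimates finite.
docs = ["d0", "d1", "d2"]
pairs = [("d1", "d0", 1),
         ("d0", "d1", 1),
         ("d0", "d2", 1),
         ("d2", "d0", 1),
         ("d2", "d1", 0)]

idx = {d: i for i, d in enumerate(docs)}
X = np.zeros((len(pairs), len(docs)))
y = np.zeros(len(pairs))
for k, (a, b, label) in enumerate(pairs):
    X[k, idx[a]] = 1.0    # +1 for d_{k,1}
    X[k, idx[b]] = -1.0   # -1 for d_{k,2}
    y[k] = label

# The relevance parameters are identified only up to an additive constant,
# so pin the first document's parameter to 0 by dropping its column.
X = X[:, 1:]

theta_hat = sm.Probit(y, X).fit(disp=False).params
print(dict(zip(docs[1:], theta_hat)))

The +1/-1 encoding is what lets a general-purpose probit routine recover one parameter per document instead of requiring a custom solver.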

For each query, the relevance of document i is denoted by \theta_i.  We are given K judged pairs of documents (d_{k,1}, d_{k,2}), k = 1, \ldots, K, each carrying a label

y_k = \begin{cases} 1 & d_{k,1} \geq d_{k,2} \\ 0 & d_{k,1} < d_{k,2} \end{cases}

where d_{k,1} \geq d_{k,2} means that d_{k,1} is judged at least as relevant as d_{k,2}.

We can estimate \theta_i's by maximizing the likelihood function

L(\Theta)=\prod_{k=1}^K P(\theta_{k,1} \geq \theta_{k,2})^{y_k} \left[1-P(\theta_{k,1}\geq \theta_{k,2})\right]^{1-y_k}

If we assume that each relevance weight is independently normally distributed with mean \theta_i and variance \frac{1}{2} (so that the difference of two weights has variance \frac{1}{2}+\frac{1}{2}=1), then the probability P(\theta_{k,1}\geq\theta_{k,2}) becomes \Phi(\theta_{k,1}-\theta_{k,2}), where \Phi is the standard normal cumulative distribution function, and the likelihood becomes:

L(\Theta)=\prod_{k=1}^K \Phi(\theta_{k,1}-\theta_{k,2})^{y_k} \left[1-\Phi(\theta_{k,1}-\theta_{k,2})\right]^{1-y_k}
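
As a concrete (and again purely illustrative) sketch of maximizing this likelihood directly, the snippet below minimizes the negative logarithm of L(\Theta) with a generic optimizer; the toy index pairs and the use of scipy are my assumptions, not the authors' implementation.

import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# pairs[k] = (i, j, y_k): indices of d_{k,1} and d_{k,2}, and the label y_k.
pairs = np.array([(1, 0, 1), (0, 1, 1), (0, 2, 1), (2, 0, 1), (2, 1, 0)])
n_docs = 3

def neg_log_likelihood(theta):
    i, j, y = pairs[:, 0], pairs[:, 1], pairs[:, 2]
    p = norm.cdf(theta[i] - theta[j])      # Phi(theta_{k,1} - theta_{k,2})
    p = np.clip(p, 1e-12, 1 - 1e-12)       # guard against log(0)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# Only differences between the theta's matter, so the optimizer returns one
# representative solution out of the translation-equivalent family.
theta_hat = minimize(neg_log_likelihood, x0=np.zeros(n_docs), method="BFGS").x
print(theta_hat)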

However, my point is that this method is hard to implement in a search engine: since it works directly with document ids and clicks on documents, the number of parameters is huge (equal to the number of query-document pairs), while the available training data is comparatively far smaller.