# Regularizations for General Linear Regression

The lasso [Tibshirani, 1996] refers to L1-regularization of linear models:

$\Omega_\lambda(\vec{w}) = \lambda \sum_{i=1}^D |w_i|$
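As a minimal sketch, this penalty is just $\lambda$ times the sum of absolute coefficient values (the function name and example weights below are illustrative, not from any particular library):

```python
import numpy as np

def lasso_penalty(w, lam):
    # L1 (lasso) penalty: lam * sum_i |w_i|
    return lam * np.sum(np.abs(w))

w = np.array([0.5, -2.0, 0.0, 1.5])
lasso_penalty(w, lam=0.1)  # 0.1 * (0.5 + 2.0 + 0.0 + 1.5) = 0.4
```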

In group lasso [Yuan and Lin, 2006], the $D$ predictors are divided into $G$ groups, where the $g$-th group has $D_g$ predictors. The regularization term becomes:

$\Omega_\lambda(\vec{w}) = \lambda \sum_{g=1}^G \sqrt{D_g} ||\vec{w}_g||_2$,

where $||\cdot||_2$ denotes the L2-norm: $||\vec{x}||_2 = \sqrt{\sum_i x_i^2}$. So when every group contains a single predictor (all $D_g = 1$), group lasso reduces to the lasso.
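A sketch of the group-lasso penalty, assuming groups are given as lists of predictor indices (this group layout is a hypothetical convention for illustration). With singleton groups, each term $\sqrt{1}\,||w_i||_2 = |w_i|$, recovering the lasso:

```python
import numpy as np

def group_lasso_penalty(w, groups, lam):
    # lam * sum_g sqrt(D_g) * ||w_g||_2, where groups is a list of index lists
    return lam * sum(np.sqrt(len(g)) * np.linalg.norm(w[g]) for g in groups)

w = np.array([0.5, -2.0, 0.0, 1.5])
# Singleton groups: the penalty reduces to the lasso penalty, 0.1 * 4.0 = 0.4
singletons = [[i] for i in range(len(w))]
group_lasso_penalty(w, singletons, lam=0.1)
```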

A further extension to group lasso is sparse group lasso:

$\Omega_{\lambda,r}(\vec{w}) = \lambda \sum_{g=1}^G \sqrt{D_g}\,||\vec{w}_g||_2 + r\,||\vec{w}||_1$
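A sketch of this combined penalty under the same hypothetical group layout as above: the group term encourages whole groups to be zero, while the added L1 term additionally sparsifies within groups. Setting $r = 0$ recovers plain group lasso:

```python
import numpy as np

def sparse_group_lasso_penalty(w, groups, lam, r):
    # lam * sum_g sqrt(D_g) * ||w_g||_2  +  r * ||w||_1
    group_term = lam * sum(np.sqrt(len(g)) * np.linalg.norm(w[g]) for g in groups)
    return group_term + r * np.sum(np.abs(w))
```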

In practice, however, we may not need such a complex regularization; lasso or group lasso is often sufficient.