Saturday, February 6, 2010

Track of The Day

♫ The Soft Pack - "Answer To Yourself" [MP3]

Track of Yesterday

♫ The Album Leaf - "Falling From the Sun" [MP3]

Missed yesterday, so this one is really for Friday. I like it when Jimmy sings.

Research

Wow, you'd think I wasn't getting any research done, but actually that's not quite the truth. I've packaged up a lot of things, tying up some loose ends. Here I'm going to describe a little of some work I've done with a distribution that I like a lot, called the Generalized Normal distribution, or Generalized Gaussian distribution.

Basically, the idea is that $p(x) \propto \exp(-\gamma|x - \mu|^\beta)$, so that with $\gamma = \frac{1}{2}\sigma^{-2}$ and $\beta = 2$, this is the Gaussian, and with $\beta = 1$, we have a Laplace, or Exponential, depending on $\mu$.
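These special cases are easy to sanity-check numerically. Here's a quick sketch using scipy's gennorm, which parameterizes the distribution by $\beta$ and a scale rather than by $\gamma$ (so the $\beta = 2$ case needs scale $\sqrt{2}$ to line up with the standard normal):

```python
import numpy as np
from scipy.stats import gennorm, norm, laplace

x = np.linspace(-3, 3, 7)

# beta = 2: the density is proportional to exp(-|x/s|^2); with s = sqrt(2)
# this is exactly the standard normal density.
assert np.allclose(gennorm.pdf(x, 2, scale=np.sqrt(2)), norm.pdf(x))

# beta = 1: the density reduces to the Laplace density.
assert np.allclose(gennorm.pdf(x, 1), laplace.pdf(x))

print("special cases check out")
```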

The mean here is $\mu$, which is very nice, but the efficiency of the mean estimator (the average) is quite different at different powers. An intuitive understanding of what the generalization entails (what $\beta$ "means") is that the tail behavior changes. As $\beta$ approaches 0, we get heavier and heavier tails, and so intuitively, it becomes "harder" to estimate the mean. In fact, with $\mu = 0$ and $\gamma = 1$, once $\beta$ is less than about $1.4$, the asymptotic variance of the median estimator is less than that of the mean estimator.
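A quick Monte Carlo sketch of that mean-vs-median point (my own simulation, with made-up sample sizes, not numbers from any paper): at $\beta = 1$ (Laplace) the median should come out ahead, and at $\beta = 2$ (Gaussian) the mean should.

```python
import numpy as np
from scipy.stats import gennorm

rng = np.random.default_rng(42)
n, trials = 200, 2000

# For each beta, draw many samples of size n and compare the variance
# of the sample mean to the variance of the sample median.
results = {}
for beta in (1.0, 2.0):
    samples = gennorm.rvs(beta, size=(trials, n), random_state=rng)
    var_mean = samples.mean(axis=1).var()
    var_median = np.median(samples, axis=1).var()
    results[beta] = (var_mean, var_median)
    print(f"beta={beta}: var(mean)={var_mean:.5f}, var(median)={var_median:.5f}")
```

For the Laplace, the median is the maximum-likelihood location estimator, so its advantage there is no surprise.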

Weakening the assumption of exponential decay in the tails gives us a knob to twiddle in applications where we might have a Gaussian prior, which is the most common prior for linear models. As is well known, with the Exponential ($L_1$ regularization) prior, which is heavily peaked, if we assume mean 0 we can get sparse models. Moreover, we can show formally that the effect of noise in the data will be limited, especially if we are interested in estimating median effects.

However, the Generalized Gaussian gives a nice way of managing the "peakiness" of the prior distribution, by allowing different $\beta$ parameters to model the trade-off between what we believe about how sparse the model is before seeing any data and how much sparsity we actually desire.
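To make the sparsity point concrete, here's a small sketch of MAP estimation for a linear model under the $\beta = 1$ (Laplace) prior, solved with proximal gradient descent (ISTA); the soft-thresholding step is what produces exactly-zero coefficients. The data and the regularization weight are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 20

# Synthetic data: only the first three coefficients are truly nonzero.
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[:3] = [2.0, -1.5, 1.0]
y = X @ w_true + 0.1 * rng.standard_normal(n)

def lasso_ista(X, y, lam, n_iter=5000):
    """MAP estimate under a mean-zero Laplace (beta = 1) prior, i.e. the
    lasso objective 0.5*||y - Xw||^2 + lam*||w||_1, via proximal gradient."""
    L = np.linalg.norm(X, 2) ** 2        # Lipschitz constant of the gradient
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        g = X.T @ (X @ w - y)            # gradient of the squared-error term
        z = w - g / L
        # Soft-thresholding: the prox operator of lam*||w||_1. This is what
        # sets small coefficients to exactly zero.
        w = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return w

w_map = lasso_ista(X, y, lam=5.0)
print("nonzero coefficients:", np.sum(w_map != 0))
```

A Gaussian ($\beta = 2$) prior on the same data would shrink all the coefficients toward zero but leave essentially all of them nonzero, which is exactly the contrast discussed above.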

This is just a short note about one of my favorite distributions, which I believe gives a nice way to go about parameterizing a prior for a linear model when what we care about is the smallest possible model, as is usually the case.


Friday, February 5, 2010

Perfect Valentine's Day Card

Searching for the perfect Valentine's day sentiment? Make it stank this year.

Pretty good, but I was thinking "I love it when you call me Big Poppa."