
Thursday, January 7, 2010

More Research

I mentioned last time that I would finish up with DCCA. It is straightforward given the intuition from last time: basically, we want something similar to LDA, which maximizes
\max_w \frac{w^T S_b w}{w^T S_w w},
where S_b is the between-class scatter matrix and S_w is the within-class scatter matrix. Note that this finds a linear transformation, and so is called a linear discriminant.
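To make that concrete, here is a minimal numpy sketch of solving the LDA objective above. This is my own illustration rather than anything from the readings, and the function name lda_direction is made up; the ratio is maximized by the top eigenvector of S_w^{-1} S_b (assuming S_w is invertible).

```python
import numpy as np

def lda_direction(X, y):
    """Top LDA direction: w maximizing (w^T S_b w) / (w^T S_w w)."""
    d = X.shape[1]
    mu = X.mean(axis=0)
    S_w = np.zeros((d, d))
    S_b = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]                       # examples in class c
        mu_c = Xc.mean(axis=0)
        S_w += (Xc - mu_c).T @ (Xc - mu_c)   # within-class scatter
        diff = (mu_c - mu)[:, None]
        S_b += len(Xc) * (diff @ diff.T)     # between-class scatter
    # The ratio is maximized by the top eigenvector of S_w^{-1} S_b.
    evals, evecs = np.linalg.eig(np.linalg.solve(S_w, S_b))
    return np.real(evecs[:, np.argmax(np.real(evals))])
```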
On the other hand, DCCA is concerned with correlations, so we attempt to find the pair of transformations that maximizes the within-class correlation while minimizing the between-class correlation. Again, let X and Y be the two sources of information, with labels indexed by c \in \{1, 2, \ldots, K\}. And so we want
\max_{a,b} \frac{a^T C_{xy} b}{(a^T S_{xx} a \cdot b^T S_{yy} b)^{1/2}},
where C_{xy} = C_w - C_b, and
C_w = \sum_c \sum_{\{i|x_i \in c\}} \sum_{\{j|y_j \in c\}} x_i y_j^T
and
C_b = \sum_{c_1} \sum_{c_2 \neq c_1} \sum_{\{i|x_i \in c_1\}} \sum_{\{j|y_j \in c_2\}} x_iy_j^T.
It is easy to see that here we are maximizing the difference between the within-class and between-class canonical correlations.
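To make the objective concrete, here is a rough numpy sketch of one way to extract the top pair: build C_w and C_b from per-class sums (each double sum above collapses to an outer product of class sums, by bilinearity), then whiten by S_{xx}^{-1/2} and S_{yy}^{-1/2} and take the top singular pair, the same trick used for ordinary CCA. This is my own back-of-the-envelope version, not the paper's optimization; S_{xx} and S_{yy} are taken here as uncentered second-moment matrices, the ridge term is purely for numerical stability, and names like dcca_pair are invented.

```python
import numpy as np

def inv_sqrt(S, reg=1e-8):
    """Inverse matrix square root of a symmetric PSD matrix (ridge-regularized)."""
    w, V = np.linalg.eigh(S + reg * np.eye(S.shape[0]))
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

def dcca_pair(X, Y, cx, cy):
    """Top discriminative canonical pair (a, b) for views X (n, dx) and Y (m, dy)
    with class labels cx (n,) and cy (m,)."""
    # sum_{i in c1} sum_{j in c2} x_i y_j^T = (sum_{i in c1} x_i)(sum_{j in c2} y_j)^T
    sums_x = {c: X[cx == c].sum(axis=0) for c in np.unique(cx)}
    sums_y = {c: Y[cy == c].sum(axis=0) for c in np.unique(cy)}
    C_w = np.zeros((X.shape[1], Y.shape[1]))
    C_b = np.zeros_like(C_w)
    for c1, u in sums_x.items():
        for c2, v in sums_y.items():
            if c1 == c2:
                C_w += np.outer(u, v)    # same-class pairs
            else:
                C_b += np.outer(u, v)    # cross-class pairs
    C_xy = C_w - C_b
    S_xx, S_yy = X.T @ X, Y.T @ Y
    # Whiten, then take the top singular pair, exactly as in ordinary CCA.
    Wx, Wy = inv_sqrt(S_xx), inv_sqrt(S_yy)
    U, s, Vt = np.linalg.svd(Wx @ C_xy @ Wy)
    return Wx @ U[:, 0], Wy @ Vt[0]
```

If you want more than one projection dimension, the subsequent pairs come from the remaining singular vectors, just as in ordinary CCA.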
I'm not going to get into the optimization, mostly because it involves lots of specially crafted matrices that are pretty much impossible to format in Blogger, but you can go to the reading I cribbed this from for the details (below).
Based on readings:
  1. Sun, T. K., et al. "Kernelized Discriminative Canonical Correlation Analysis." 2007, vol. 3.
