This question concerns models whose feature vectors are purely integer-valued: in what ways do, e.g., count vectors cause problems for models that are traditionally defined on the reals?
Collins et al. (2002) generalized Principal Component Analysis to non-Gaussian noise models. In the article they briefly state that "[t]he Poisson is better suited to integer data, and the Bernoulli to binary data", but I haven't found a deeper explanation for this anywhere.
It sounds perfectly reasonable, but I'm having trouble seeing what this has to do with PCA, since PCA uses a least-squares loss function, which in my mind minimizes the reconstruction error regardless of the domain of the input.
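To make the comparison concrete, here is a minimal sketch of the per-entry losses I have in mind. It is my own illustration, not code from the paper. It assumes the usual probabilistic reading in which least squares corresponds to a Gaussian noise model with fixed variance, and it assumes a log link (rate = exp(theta)) for the Poisson case. The function names are hypothetical.

```python
import math

def squared_error(x, theta):
    """Least-squares loss for a single entry: (x - theta)^2."""
    return (x - theta) ** 2

def gaussian_nll(x, theta, sigma=1.0):
    """Negative log-likelihood of x under a Gaussian N(theta, sigma^2)."""
    const = 0.5 * math.log(2 * math.pi * sigma ** 2)
    return const + (x - theta) ** 2 / (2 * sigma ** 2)

def poisson_nll(x, theta):
    """Negative log-likelihood of a count x under Poisson(rate=exp(theta)).

    Assumption: theta is the log of the rate (log link).
    """
    return math.exp(theta) - x * theta + math.lgamma(x + 1)

x = 3  # one observed count

# With sigma fixed, the Gaussian NLL is the squared error times a constant,
# plus a constant, so differences in loss agree up to the factor 1/2.
gauss_diff = gaussian_nll(x, 1.0) - gaussian_nll(x, 2.0)
sq_diff = squared_error(x, 1.0) - squared_error(x, 2.0)
print(gauss_diff, 0.5 * sq_diff)

# Squared error treats over- and under-shooting by 2 identically ...
print(squared_error(x, 1.0), squared_error(x, 5.0))

# ... whereas the Poisson loss penalizes a rate of 1 more than a rate of 5.
print(poisson_nll(x, math.log(1.0)), poisson_nll(x, math.log(5.0)))
```

If this reading is right, then least squares is not really domain-agnostic: it is the loss you get from one particular noise assumption, and swapping in a Poisson changes how errors are weighted for count data.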
Any insights would be greatly appreciated!