Empirical Inference Technical Report 2004

Multivariate Regression with Stiefel Constraints

We introduce a new framework for regression between multi-dimensional spaces. Standard methods for solving this problem typically reduce it to one-dimensional regression by choosing features in the input and/or output spaces. These methods, which include PLS (partial least squares), KDE (kernel dependency estimation), and PCR (principal component regression), select features based on different a priori judgments as to their relevance. Moreover, the loss function and constraints are chosen not primarily on statistical grounds, but to simplify the resulting optimization. By contrast, in our approach the feature construction and the regression estimation are performed jointly, directly minimizing a loss function that we specify, subject to a rank constraint. A major advantage of this approach is that the loss is no longer dictated by algorithmic requirements, but can be tailored to the characteristics of the task at hand; the features will then be optimal with respect to this objective. Our approach also allows a regularizer to be included in the optimization. Finally, by processing the observations sequentially, our algorithm is able to work on large-scale problems.
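The joint estimation described in the abstract can be sketched numerically. The following is a minimal illustration, not the report's algorithm: it assumes a linear model Y ≈ X B Wᵀ with a squared loss, enforces a Stiefel constraint WᵀW = I by retracting each gradient step back onto the manifold via the polar decomposition, and processes observations one at a time in the sequential spirit mentioned above. The function names (low_rank_regression, stiefel_retraction), the choice of loss, and all parameter values are hypothetical.

import numpy as np

def stiefel_retraction(W):
    # Polar retraction: nearest matrix with orthonormal columns (W^T W = I),
    # computed via the thin SVD.
    U, _, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ Vt

def low_rank_regression(X, Y, rank, lr=1e-2, epochs=50, seed=0):
    # Fit Y ~ X @ B @ W.T under the constraint W.T @ W = I, minimizing a
    # squared loss by stochastic gradient steps over single observations.
    rng = np.random.default_rng(seed)
    B = 0.01 * rng.standard_normal((X.shape[1], rank))               # input-side coefficients
    W = stiefel_retraction(rng.standard_normal((Y.shape[1], rank)))  # output-side features
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            x, y = X[i:i + 1], Y[i:i + 1]
            r = x @ B @ W.T - y                 # residual for the current observation
            B -= lr * (x.T @ r @ W)             # gradient of 0.5 * ||r||^2 w.r.t. B
            W = stiefel_retraction(W - lr * (r.T @ x @ B))  # gradient step, then retract
    return B, W

# Usage example: recover a noisy rank-2 map from R^10 to R^5.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 10))
Y = X @ rng.standard_normal((10, 2)) @ stiefel_retraction(rng.standard_normal((5, 2))).T
B, W = low_rank_regression(X, Y + 0.01 * rng.standard_normal(Y.shape), rank=2)
print(np.linalg.norm(X @ B @ W.T - Y) / np.linalg.norm(Y))  # relative error; should be small

The polar retraction keeps W exactly on the Stiefel manifold after every update, which is what distinguishes this constrained scheme from an unconstrained low-rank fit.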

Author(s): Bakir, GH. and Gretton, A. and Franz, MO. and Schölkopf, B.
Number (issue): 128
Year: 2004
Bibtex Type: Technical Report (techreport)
Institution: MPI for Biological Cybernetics, Spemannstr 38, 72076, Tuebingen
Organization: Max-Planck-Gesellschaft
School: Biologische Kybernetik

BibTeX

@techreport{2831,
  title = {Multivariate Regression with Stiefel Constraints},
  abstract = {We introduce a new framework for regression between multi-dimensional spaces.
  Standard methods for solving this problem typically reduce it to one-dimensional
  regression by choosing features in the input and/or output spaces. These methods, which
  include PLS (partial least squares), KDE (kernel dependency estimation), and PCR
  (principal component regression), select features based on different a priori judgments
  as to their relevance. Moreover, the loss function and constraints are chosen not
  primarily on statistical grounds, but to simplify the resulting optimization. By
  contrast, in our approach the feature construction and the regression estimation are
  performed jointly, directly minimizing a loss function that we specify, subject to a
  rank constraint. A major advantage of this approach is that the loss is no longer
  dictated by algorithmic requirements, but can be tailored to the characteristics of the
  task at hand; the features will then be optimal with respect to this objective. Our
  approach also allows a regularizer to be included in the optimization. Finally, by
  processing the observations sequentially, our algorithm is able to work on large-scale
  problems.},
  number = {128},
  organization = {Max-Planck-Gesellschaft},
  institution = {MPI for Biological Cybernetics, Spemannstr 38, 72076, Tuebingen},
  school = {Biologische Kybernetik},
  year = {2004},
  slug = {2831},
  author = {Bakir, GH. and Gretton, A. and Franz, MO. and Sch{\"o}lkopf, B.}
}