Empirical Inference Conference Paper 2011

On the discardability of data in Support Vector Classification problems

We analyze the problem of data set reduction for support vector classification. The work is also motivated by distributed problems in which sensors moving inside an environment collect binary measurements at different locations, and the environment needs to be divided into a collection of regions labeled in two different ways. The goal is to let each agent retain and exchange only those measurements that are most informative for the collective reconstruction of the decision boundary. For the case of separable classes, we provide exact conditions and an efficient algorithm to determine whether an element of the training set can become a support vector when new data arrive. The analysis is then extended to the non-separable case, deriving a sufficient discardability condition and a general data selection scheme for classification. Numerical experiments on the distributed problem show that the proposed procedure allows the agents to exchange a small amount of the collected data and still obtain a highly predictive decision boundary.

Author(s): Del Favero, S. and Varagnolo, D. and Dinuzzo, F. and Schenato, L. and Pillonetto, G.
Pages: 3210-3215
Year: 2011
Month: December
Publisher: IEEE
Bibtex Type: Conference Paper (inproceedings)
Address: Piscataway, NJ, USA
DOI: 10.1109/CDC.2011.6160607
Event Name: 50th IEEE Conference on Decision and Control and European Control Conference (CDC - ECC 2011)
Event Place: Orlando, FL, USA
ISBN: 978-1-61284-800-6

BibTeX

@inproceedings{DelFaveroVDSP2011,
  title = {On the discardability of data in Support Vector Classification problems},
  abstract = {We analyze the problem of data set reduction for support vector classification. The work is also motivated by distributed problems in which sensors moving inside an environment collect binary measurements at different locations, and the environment needs to be divided into a collection of regions labeled in two different ways. The goal is to let each agent retain and exchange only those measurements that are most informative for the collective reconstruction of the decision boundary. For the case of separable classes, we provide exact conditions and an efficient algorithm to determine whether an element of the training set can become a support vector when new data arrive. The analysis is then extended to the non-separable case, deriving a sufficient discardability condition and a general data selection scheme for classification. Numerical experiments on the distributed problem show that the proposed procedure allows the agents to exchange a small amount of the collected data and still obtain a highly predictive decision boundary.},
  pages = {3210--3215},
  publisher = {IEEE},
  address = {Piscataway, NJ, USA},
  month = dec,
  year = {2011},
  slug = {delfaverovdsp2011},
  author = {Del Favero, S. and Varagnolo, D. and Dinuzzo, F. and Schenato, L. and Pillonetto, G.},
  month_numeric = {12}
}