Multi-way set enumeration in weight tensors

Institute Homepage

Institute Homepage Sign In

Back

Empirical Inference Article 2011

Director

The analysis of n-ary relations receives attention in many different fields, for instance biology, web mining, and social studies. In the basic setting, there are n sets of instances, and each observation associates n instances, one from each set. A common approach to explore these n-way data is the search for n-set patterns, the n-way equivalent of itemsets. More precisely, an n-set pattern consists of specific subsets of the n instance sets such that all possible associations between the corresponding instances are observed in the data. In contrast, traditional itemset mining approaches consider only two-way data, namely items versus transactions. The n-set patterns provide a higher-level view of the data, revealing associative relationships between groups of instances. Here, we generalize this approach in two respects. First, we tolerate missing observations to a certain degree, that means we are also interested in n-sets where most (although not all) of the possible associations have been recorded in the data. Second, we take association weights into account. In fact, we propose a method to enumerate all n-sets that satisfy a minimum threshold with respect to the average association weight. Technically, we solve the enumeration task using a reverse search strategy, which allows for effective pruning of the search space. In addition, our algorithm provides a ranking of the solutions and can consider further constraints. We show experimental results on artificial and real-world datasets from different domains.

Author(s):	Georgii, E. and Tsuda, K. and Schölkopf, B.
Journal:	Machine Learning
Volume:	82
Number (issue):	2
Pages:	123-155
Year:	2011
Month:	February
Day:	0

Bibtex Type:	Article (article)

DOI:	10.1007/s10994-010-5210-y

Digital:	0
Electronic Archiving:	grant_archive
Language:	en
Organization:	Max-Planck-Gesellschaft
School:	Biologische Kybernetik

Links:	PDF

BibTex

@article{6848,
  title = {Multi-way set enumeration in weight tensors},
  journal = {Machine Learning},
  abstract = {The analysis of n-ary relations receives attention in many different fields, for instance biology, web mining, and social studies. In the basic setting, there are n sets of instances, and each observation associates n instances, one from each set. A common approach to explore these n-way data is the search for n-set patterns, the n-way equivalent of itemsets. More precisely, an n-set pattern consists of specific subsets of the n instance sets such that all possible associations between the corresponding instances are observed in the data. In contrast, traditional itemset mining approaches consider only two-way data, namely items versus transactions. The n-set patterns provide a higher-level view of the data, revealing associative relationships between groups of instances. Here, we generalize this approach in two respects. First, we tolerate missing observations to a certain degree, that means we are also interested in n-sets where most (although not all) of the possible associations have been recorded in the
  data. Second, we take association weights into account. In fact, we propose a method to enumerate all n-sets that satisfy a minimum threshold with respect to the average association weight. Technically, we solve the enumeration task using a reverse search strategy, which allows for effective pruning of the search space. In addition, our algorithm provides a ranking of the solutions and can consider further constraints. We show experimental results on artificial and real-world datasets from different domains.},
  volume = {82},
  number = {2},
  pages = {123-155},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  month = feb,
  year = {2011},
  slug = {6848},
  author = {Georgii, E. and Tsuda, K. and Sch{\"o}lkopf, B.},
  month_numeric = {2}
}