The analysis of n-ary relations receives attention in many different fields, for instance biology, web mining, and social studies. In the basic setting, there are n sets of instances, and each observation associates n instances, one from each set. A common approach to explore these n-way data is the search for n-set patterns. An n-set pattern consists of specific subsets of the n instance sets such that all possible n- ary associations between the corresponding instances are observed. This provides a higher-level view of the data, revealing associative relationships between groups of instances. Here, we generalize this approach in two respects. First, we tolerate missing observations to a certain degree, that means we are also interested in n-sets where most (although not all) of the possible combinations have been recorded in the data. Second, we take association weights into account. More precisely, we propose a method to enumerate all n- sets that satisfy a minimum threshold with respect to the average association weight. Non-observed associations obtain by default a weight of zero. Technically, we solve the enumeration task using a reverse search strategy, which allows for effective pruning of the search space. In addition, our algorithm provides a ranking of the solutions and can consider further constraints. We show experimental results on artificial and real-world data sets from different domains.
Author(s): | Georgii, E. and Tsuda, K. and Schölkopf, B. |
Journal: | Proceedings of the 2nd Workshop on Data Mining using Matrices and Tensors (DMMT 2009) |
Pages: | 32-41 |
Year: | 2009 |
Month: | June |
Day: | 0 |
Editors: | C Ding and T Li |
Publisher: | ACM Press |
Bibtex Type: | Conference Paper (inproceedings) |
Address: | New York, NY, USA |
DOI: | 10.1145/1581114.1581118 |
Event Name: | 2nd Workshop on Data Mining using Matrices and Tensors (DMMT/KDD 2009) |
Event Place: | Paris, France |
Digital: | 0 |
Electronic Archiving: | grant_archive |
ISBN: | 978-1-60558-673-1 |
Language: | en |
Organization: | Max-Planck-Gesellschaft |
School: | Biologische Kybernetik |
Links: |
BibTex
@inproceedings{5932, title = {Multi-way set enumeration in real-valued tensors}, journal = {Proceedings of the 2nd Workshop on Data Mining using Matrices and Tensors (DMMT 2009)}, abstract = {The analysis of n-ary relations receives attention in many different fields, for instance biology, web mining, and social studies. In the basic setting, there are n sets of instances, and each observation associates n instances, one from each set. A common approach to explore these n-way data is the search for n-set patterns. An n-set pattern consists of specific subsets of the n instance sets such that all possible n- ary associations between the corresponding instances are observed. This provides a higher-level view of the data, revealing associative relationships between groups of instances. Here, we generalize this approach in two respects. First, we tolerate missing observations to a certain degree, that means we are also interested in n-sets where most (although not all) of the possible combinations have been recorded in the data. Second, we take association weights into account. More precisely, we propose a method to enumerate all n- sets that satisfy a minimum threshold with respect to the average association weight. Non-observed associations obtain by default a weight of zero. Technically, we solve the enumeration task using a reverse search strategy, which allows for effective pruning of the search space. In addition, our algorithm provides a ranking of the solutions and can consider further constraints. We show experimental results on artificial and real-world data sets from different domains.}, pages = {32-41}, editors = {C Ding and T Li}, publisher = {ACM Press}, organization = {Max-Planck-Gesellschaft}, school = {Biologische Kybernetik}, address = {New York, NY, USA}, month = jun, year = {2009}, slug = {5932}, author = {Georgii, E. and Tsuda, K. and Sch{\"o}lkopf, B.}, month_numeric = {6} }