Proceedings of the Tenth SIAM International Conference on Data Mining

Srinivasan Parthasarathy, Bing Liu, Bart Goethals, Jian Pei, and Chandrika Kamath, Editors


2010 / 953 pages / CD / ISBN: 978-0-898717-03-7 / List Price $174.50 / Member Price $122.15 / Order Code PR 136

Symposium held in Columbus, OH, April 29-May 1, 2010.

1 Text Categorization Using Word Similarities Based on Higher Order Co-occurrences
Syed Fawad Hussain and Gilles Bisson

13 Exploiting Associations between Word Clusters and Document Classes for Cross-domain Text Categorization
Fuzhen Zhuang, Ping Luo, Hui Xiong, Qing He, Yuhong Xiong, and Zhongzhi Shi

25 Semi-Supervised Bio-Named Entity Recognition with Word-Codebook Learning
Pavel P. Kuksa and Yanjun Qi

37 Improving Accessibility of Transaction-centric Web Objects
Muhammad Asiful Islam, Faisal Ahmed, Yevgen Borodin, Jalal Mahmud, and I. V. Ramakrishnan

49 Reconstructing Randomized Social Networks
Niko Vuokko and Evimaria Terzi

60 Reconstruction from Randomized Graph via Low Rank Approximation
Leting Wu, Xiaowei Ying, and Xintao Wu

72 Do You Trust to Get Trust? A Study of Trust Reciprocity Behaviors and Reciprocal Trust Prediction
Viet-An Nguyen, Ee-Peng Lim, Hwee-Hoon Tan, Jing Jiang, and Aixin Sun

84 Publishing Skewed Sensitive Microdata
Yabo Xu, Ke Wang, Ada Wai-Chee Fu, and Raymond Chi-Wing Wong

94 A SAT-based Framework for Efficient Constrained Clustering
Ian Davidson, S. S. Ravi, and Leonid Shamis

106 Spectral and Semidefinite Relaxation of the CLUHSIC Algorithm
Wen-Yun Yang, James T. Kwok, and Bao-Liang Lu

118 Generation of Alternative Clusterings Using the CAMI Approach
Xuan Hong Dang and James Bailey

130 Making k-means Even Faster
Greg Hamerly

141 On Mining Statistically Significant Attribute Association Information
Pritam Chanda, Jianmei Yang, Aidong Zhang, and Murali Ramanathan

153 An Information-Theoretic Approach to Finding Informative Noisy Tiles in Binary Databases
Kleanthis-Nikolaos Kontonasios and Tijl De Bie

165 Mining Top-K Patterns from Binary Datasets in Presence of Noise
Claudio Lucchese, Salvatore Orlando, and Raffaele Perego

177 Formal Concept Sampling for Counting and Threshold-Free Local Pattern Mining
Mario Boley, Thomas Gärtner, and Henrik Grosskreutz

189 Alleviating the Sparsity Problems in Collaborative Filtering by Using an Adapted Distance and a Graph-based Method
Beau Piccart, Jan Struyf, and Hendrik Blockeel

199 Collaborative Filtering: Weighted Nonnegative Matrix Factorization Incorporating User and Item Graphs
Quanquan Gu, Jie Zhou, and Chris Ding

211 Temporal Collaborative Filtering with Bayesian Probabilistic Tensor Factorization
Liang Xiong, Xi Chen, Tzu-Kuo Huang, Jeff Schneider, and Jaime G. Carbonell

223 Residual Bayesian Co-clustering for Matrix Approximation
Hanhuai Shan and Arindam Banerjee

235 Two-view Transductive Support Vector Machines
Guangxia Li, Steven C. H. Hoi, and Kuiyu Chang

245 Fast Stochastic Frank-Wolfe Algorithms for Nonlinear SVMs
Hua Ouyang and Alexander Gray

257 Single-Pass Distributed Learning of Multi-Class SVMs Using Core-Sets
Stefano Lodi, Ricardo Ñanculef, and Claudio Sartori

269 Nonnegative Principal Component Analysis for Proteomic Tumor Profiles
Xiaoxu Han

281 Efficient Nonnegative Matrix Factorization with Random Projections
Fei Wang and Ping Li

293 Bridging Domains with Words: Opinion Analysis with Matrix Tri-factorizations
Tao Li, Vikas Sindhwani, Chris Ding, and Yi Zhang

303 Exact Passive-Aggressive Algorithm for Multiclass Classification Using Support Class
Shin Matsushima, Nobuyuki Shimizu, Kazuhiro Yoshida, Takashi Ninomiya, and Hiroshi Nakagawa

315 Robust Mining of Time Intervals with Semi-Interval Partial Order Patterns
Fabian Moerchen and Dmitriy Fradkin

327 Cascading Spatio-Temporal Pattern Discovery: A Summary of Results
Pradeep Mohan, Shashi Shekhar, James A. Shine, and James P. Rogers

339 Frequentness-Transition Queries for Distinctive Pattern Mining from Time-Segmented Databases
Shin-ichi Minato and Takeaki Uno

350 Consecutive Ones Property and Spectral Ordering
Niko Vuokko

361 Naïve Bayes Classifier for Positive Unlabeled Learning with Uncertainty
Jiazhen He, Yang Zhang, Xue Li, and Yong Wang

373 On Multidimensional Sharpening of Uncertain Data
Charu C. Aggarwal

385 Subspace Clustering for Uncertain Data
Stephan Günnemann, Hardy Kremer, and Thomas Seidl

397 On the Use of Combining Rules in Relational Probability Trees
Daan Fierens

409 The Application of Statistical Relational Learning to a Database of Criminal and Terrorist Activity
B. Delaney, A. Fast, W. Campbell, C. Weinstein, and D. Jensen

418 ContexTour: Contextual Contour Analysis on Dynamic Multi-Relational Clustering
Yu-Ru Lin, Jimeng Sun, Nan Cao, and Shixia Liu

430 Identifying Multi-Instance Outliers
Ou Wu, Jun Gao, Weiming Hu, Bing Li, and Mingliang Zhu

442 Mining Actionable Subspace Clusters in Sequential Data
Kelvin Sim, Ardian Kristanto Poernomo, and Vivekanand Gopalkrishnan

454 GraSS: Graph Structure Summarization
Kristen LeFevre and Evimaria Terzi

466 Mining Frequent Graph Sequence Patterns Induced by Vertices
Akihiro Inokuchi and Takashi Washio

478 On Clustering Graph Streams
Charu C. Aggarwal, Yuchen Zhao, and Philip S. Yu

490 Inferring Probability Distributions of Graph Size and Node Degree from Stochastic Graph Grammars
Sourav Mukherjee and Tim Oates

502 p-ISOMAP: An Efficient Parametric Update for ISOMAP for Visual Analytics
Jaegul Choo, Chandan K. Reddy, Hanseung Lee, and Haesun Park

514 Confidence-Based Feature Acquisition to Minimize Training and Test Costs
Marie desJardins, James MacGlashan, and Kiri L. Wagstaff

525 Co-Selection of Features and Instances for Unsupervised Rare Category Analysis
Jingrui He and Jaime Carbonell

537 Active Ordering of Interactive Prediction Tasks
Abhimanyu Lad and Yiming Yang

548 Radius Plots for Mining Tera-Byte Scale Graphs: Algorithms, Patterns, and Observations
U Kang, Charalampos E. Tsourakakis, Ana Paula Appel, Christos Faloutsos, and Jure Leskovec

559 Spectral Analysis of Signed Graphs for Clustering, Prediction and Visualization
Jérôme Kunegis, Stephan Schmidt, Andreas Lommatzsch, Jürgen Lerner, Ernesto W. De Luca, and Sahin Albayrak

571 Fast Single-Pair SimRank Computation
Pei Li, Hongyan Liu, Jeffrey Xu Yu, Jun He, and Xiaoyong Du

583 A Heterogeneous Label Propagation Algorithm for Disease Gene Discovery
TaeHyun Hwang and Rui Kuang

595 Direct Density Ratio Estimation with Dimensionality Reduction
Masashi Sugiyama, Satoshi Hara, Paul von Bünau, Taiji Suzuki, Takafumi Kanamori, and Motoaki Kawanabe

607 The Generalized Dimensionality Reduction Problem
Charu C. Aggarwal

619 Convex Principal Feature Selection
Mahdokht Masaeli, Yan Yan, Ying Cui, Glenn Fung, and Jennifer G. Dy

629 Generalized and Heuristic-Free Feature Construction for Improved Accuracy
Wei Fan, Erheng Zhong, Jing Peng, Olivier Verscheure, Kun Zhang, Jiangtao Ren, Rong Yan, and Qiang Yang

641 Unsupervised Discovery of Abnormal Activity Occurrences in Multi-Dimensional Time Series, with Applications in Wearable Systems
Alireza Vahdatpour and Majid Sarrafzadeh

653 An Integrated Framework for Simultaneous Classification and Regression of Time-Series Data
Zubin Abraham and Pang-Ning Tan

665 Multiresolution Motif Discovery in Time Series
Nuno Castro and Paulo Azevedo

677 Time-Series Classification in Many Intrinsic Dimensions
Miloš Radovanović, Alexandros Nanopoulos, and Mirjana Ivanović

689 MACH: Fast Randomized Tensor Decompositions
Charalampos E. Tsourakakis

701 Scalable Tensor Factorizations with Missing Data
Evrim Acar, Daniel M. Dunlavy, Tamara G. Kolda, and Morten Mørup

713 On Low-Rank Updates to the Singular Value and Tucker Decompositions
Michael J. O’Hara

720 Towards Finding Valuable Topics
Zhen Wen and Ching-Yung Lin

732 Predicting Customer Churn in Mobile Networks Through Analysis of Social Groups
Yossi Richter, Elad Yom-Tov, and Noam Slonim

742 Directed Network Community Detection: A Popularity and Productivity Link Model
Tianbao Yang, Yun Chi, Shenghuo Zhu, Yihong Gong, and Rong Jin

754 HCDF: A Hybrid Community Discovery Framework
Keith Henderson, Tina Eliassi-Rad, Spiros Papadimitriou, and Christos Faloutsos

766 A Robust Decision Tree Algorithm for Imbalanced Data Sets
Wei Liu, Sanjay Chawla, David A. Cieslak, and Nitesh V. Chawla

778 Multi-Label Classification without the Multi-Label Cost
Xiatian Zhang, Quan Yuan, Shiwan Zhao, Wei Fan, Wentao Zheng, and Zhong Wang

790 Fast and Accurate Gene Prediction by Decision Tree Classification
Rong She, Jeffrey Shih-Chieh Chu, Ke Wang, and Nansheng Chen

802 On Classification of High-Cardinality Data Streams
Charu C. Aggarwal and Philip S. Yu

814 Predictive Modeling with Heterogeneous Sources
Xiaoxiao Shi, Qi Liu, Wei Fan, Qiang Yang, and Philip S. Yu

826 A Probabilistic Framework to Learn from Multiple Annotators with Time-Varying Accuracy
Pinar Donmez, Jaime Carbonell, and Jeff Schneider

838 An Integrative Approach to Identifying Biologically Relevant Genes
Zheng Zhao, Jiangxin Wang, Shashvata Sharma, Nitin Agarwal, Huan Liu, and Yung Chang

850 A Compression Based Distance Measure for Texture
Bilson J. L. Campana and Eamonn J. Keogh

862 Fast Implementation of ℓ1 Regularized Learning Algorithms Using Gradient Descent Methods
Yunpeng Cai, Yijun Sun, Yubo Cheng, Jian Li, and Steve Goodison

872 Learning Compressible Models
Yi Zhang, Jeff Schneider, and Artur Dubrawski

882 A Permutation Approach to Validation
Malik Magdon-Ismail and Konstantin Mertsalov

894 Adaptive Informative Sampling for Active Learning
Zhenyu Lu, Xindong Wu, and Josh Bongard

906 Evaluating Query Result Significance in Databases via Randomizations
Markus Ojala, Gemma C. Garriga, Aristides Gionis, and Heikki Mannila

918 Cross-Selling Optimization for Customized Promotion
Nan Li, Yinghui Yang, and Xifeng Yan

930 A Generalized Tree Matching Algorithm Considering Nested Lists for Web Data Extraction
Nitin Jindal and Bing Liu

942 Mining Maximally Banded Matrices in Binary Data
Faris Alqadah, Raj Bhatnagar, and Anil Jegga




