-
Building actions from classification rules
Abstract Rule induction has attracted a great deal of attention in Machine Learning and Data Mining. However, generating rules is not
an end in itself, because applying the rules is not straightforward, especially when there are many of them. Ideally, the end
user would like to use these rules to decide which actions to undertake. In the literature, this notion is usually referred
to as actionability. We propose a new framework to address actionability. Our goal is to lighten the burden of analyzing a large set of classification
rules when the user is confronted with an "unsatisfactory situation" and needs help deciding on the appropriate actions
to remedy this situation. The method consists in comparing the situation to a set of classification rules. For this purpose,
we propose a new framework for learning action recommendations that deals with complex notions of feasibility and quality of
actions. Our approach has been motivated by an environmental application aiming at building a tool to help specialists in
charge of the management of a catchment to preserve stream-water quality. The results show the utility of this methodology
with regard to enhancing the actionability of a set of classification rules in a real-world application.
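The idea of comparing a situation to a set of classification rules can be illustrated with a minimal Python sketch. It is not the authors' framework: the `Rule` type, the `recommend_actions` function, the attribute names, and the additive cost model are all hypothetical assumptions. Feasibility is approximated by a per-attribute change cost (attributes without a cost are treated as unchangeable), and quality by the confidence of the rule that the changes would satisfy.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    conditions: dict   # attribute -> value the rule requires
    predicted: str     # class label predicted by the rule
    confidence: float  # assumed quality measure of the rule, in [0, 1]


def recommend_actions(situation, rules, target, change_cost):
    """Rank attribute changes that make `situation` match a rule predicting `target`.

    change_cost maps each changeable attribute to an assumed cost; an attribute
    missing from it is treated as impossible to change (infeasible action).
    Returns a list of (cost, confidence, changes), cheapest first.
    """
    proposals = []
    for rule in rules:
        if rule.predicted != target:
            continue
        # Conditions of the rule that the current situation does not satisfy.
        changes = {a: v for a, v in rule.conditions.items() if situation.get(a) != v}
        if any(a not in change_cost for a in changes):
            continue  # requires changing an attribute that cannot be changed
        cost = sum(change_cost[a] for a in changes)
        proposals.append((cost, rule.confidence, changes))
    # Cheapest actions first; ties broken by higher rule confidence.
    proposals.sort(key=lambda p: (p[0], -p[1]))
    return proposals


# Hypothetical catchment example: the slope of a plot cannot be changed.
situation = {"slope": "steep", "buffer_strip": "no", "dose": "high"}
rules = [
    Rule({"buffer_strip": "yes"}, "good", 0.70),
    Rule({"slope": "flat"}, "good", 0.90),
    Rule({"dose": "low", "buffer_strip": "yes"}, "good", 0.95),
    Rule({"dose": "high"}, "bad", 0.80),
]
for cost, confidence, changes in recommend_actions(
    situation, rules, "good", {"buffer_strip": 2.0, "dose": 1.0}
):
    print(cost, confidence, changes)
```

Sorting by cost and then by confidence is only one possible trade-off; a real system would likely need a richer notion of feasibility than a fixed cost table.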
- Content Type Journal Article
- Category Regular Paper
- Pages 1-32
- DOI 10.1007/s10115-011-0466-5
- Authors
- Ronan Trépos, INRA, Unité BIA, BP 27, 31326 Castanet-Tolosan Cedex, France
- Ansaf Salleb-Aouissi, Center for Computational Learning Systems (CCLS), Columbia University, 475 Riverside Drive, New York, NY 10115, USA
- Marie-Odile Cordier, Université de Rennes 1/IRISA, Campus Universitaire de Beaulieu, 35042 Rennes Cedex, France
- Véronique Masson, Université de Rennes 1/IRISA, Campus Universitaire de Beaulieu, 35042 Rennes Cedex, France
- Chantal Gascuel-Odoux, INRA, UMR 1069, Sol Agro et hydrosystème Spatialisation, Agrocampus, 35000 Rennes, France
-
A new immune genetic algorithm based on uniform design sampling
Abstract Difficulty in maintaining population diversity, premature convergence, and a low success rate in finding the global optimal solution
are known shortcomings of the genetic algorithm (GA). Based on the bias of samples in the uniform design sampling (UDS) point set,
the crossover operation in GA is redesigned. Using the concentrations of antibodies in artificial immune systems (AIS), the
chromosome concentration in GA is defined and a clonal selection strategy is designed. To solve the maximum clique
problem (MCP), a new immune GA (UIGA) is presented based on the clonal selection strategy and UDS. The simulation results
show that the UIGA outperforms the simple GA and the good point GA in solution quality, convergence rate, and other indices
when solving MCPs.
- Content Type Journal Article
- Category Short Paper
- Pages 1-15
- DOI 10.1007/s10115-011-0476-3
- Authors
- Ben-Da Zhou, School of Applied Mathematics, West Anhui University, Lu'an, 237012 China
- Hong-Liang Yao, School of Computer Science and Technology, Hefei University of Technology, Hefei, 230009 China
- Ming-Hua Shi, School of Applied Mathematics, West Anhui University, Lu'an, 237012 China
- Qin Yue, School of Applied Mathematics, West Anhui University, Lu'an, 237012 China
- Hao Wang, School of Computer Science and Technology, Hefei University of Technology, Hefei, 230009 China
-
Modeling collective blogging dynamics of popular incidental topics
Abstract An extended susceptible-infective (SI) epidemic model is presented in this paper to describe the collective blogging behavior
on popular incidental topics. Our model has two major extensions over the classic SI model: (1) different blog
writers get interested in a specific topic with different probabilities, whereas in the classic SI model the infection probability
between any two individuals is identical; (2) the new model takes into account the impact of external mainstream
media on blog writers, whereas in the classic SI model a disease spreads only through personal contacts between individuals.
The new model is capable of explaining the widely observed early burst and heavy tail of topic propagation velocity. The proposed
model has a closed-form solution when individual interest is uniformly distributed and the external influence is assumed
constant. We validate the proposed model using ten topics from two different data sets, Sina Blog and LiveJournal Blogspace;
the results indicate that our model fits the topic propagation velocity and predicts the propagation trend very well.
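The two extensions described in this abstract can be made concrete with a small discrete-time, mean-field simulation in Python. This is only a sketch under stated assumptions, not the paper's equations or its closed-form solution: the function name `simulate_extended_si`, the parameters `beta` and `external`, and the multiplicative form of the per-step posting probability are all hypothetical choices made for illustration.

```python
def simulate_extended_si(interest, beta=0.5, external=0.05, i0=0.0, steps=50):
    """Mean-field sketch of an SI model with per-writer interest and constant media influence.

    interest : per-writer probabilities of caring about the topic (assumed input)
    beta     : strength of peer-to-peer contact (assumed parameter)
    external : constant push from mainstream media (assumed parameter)
    i0       : fraction of writers who have already posted at time 0
    Returns (cumulative, velocity): the fraction of writers who have posted
    after each step, and the fraction of new posters in each step.
    """
    # Probability that each writer has not yet posted about the topic.
    susceptible = [1.0 - i0 for _ in interest]
    infected = i0
    cumulative, velocity = [], []
    for _ in range(steps):
        # Exposure combines peer contact with the external media term.
        pressure = beta * infected + external
        susceptible = [
            s * (1.0 - min(1.0, max(0.0, p * pressure)))
            for s, p in zip(susceptible, interest)
        ]
        new_infected = 1.0 - sum(susceptible) / len(susceptible)
        velocity.append(new_infected - infected)
        infected = new_infected
        cumulative.append(infected)
    return cumulative, velocity


# Interest spread uniformly over [0, 1], the case the abstract singles out.
uniform_interest = [k / 100 for k in range(101)]
cumulative, velocity = simulate_extended_si(uniform_interest)
print(cumulative[-1], max(velocity))
```

In this sketch the external term lets a topic start spreading even when no blogger has posted yet, and writers with low interest keep adopting slowly long after the most interested ones have posted.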
- Content Type Journal Article
- Category Regular Paper
- Pages 1-17
- DOI 10.1007/s10115-011-0470-9
- Authors
- Li Zhao, Center for Intelligent and Networked Systems and TNLIST Lab, Tsinghua University, Beijing, 100084 China
- Xiaohong Guan, Center for Intelligent and Networked Systems and TNLIST Lab, Tsinghua University, Beijing, 100084 China
- Ruixi Yuan, Center for Intelligent and Networked Systems and TNLIST Lab, Tsinghua University, Beijing, 100084 China
-
In-network outlier detection in wireless sensor networks
Abstract To address the problem of unsupervised outlier detection in wireless sensor networks, we develop an approach that (1) is flexible
with respect to the outlier definition, (2) computes the result in-network to reduce both bandwidth and energy consumption,
(3) uses only single-hop communication, thus permitting very simple node failure detection and message reliability assurance
mechanisms (e.g., carrier-sense), and (4) seamlessly accommodates dynamic updates to data. We examine performance by simulation,
using real sensor data streams. Our results demonstrate that our approach is accurate and imposes reasonable communication
and power consumption demands.
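As a rough illustration of properties (1) and (3), the Python sketch below lets a single node rank readings from itself and its single-hop neighbors using a pluggable scoring function. It is not the authors' protocol and does not model message exchange, failures, or updates: the function names `knn_distance` and `local_outliers`, the scalar readings, and the choice of distance to the k-th nearest neighbor as the default outlier definition are all assumptions.

```python
def knn_distance(point, data, k=1):
    """Score a reading by its distance to its k-th nearest neighbor in `data`.

    Assumes `point` itself occurs in `data`, so the smallest distance is 0.0.
    """
    dists = sorted(abs(point - other) for other in data)
    return dists[min(k, len(dists) - 1)]


def local_outliers(own, neighbor_readings, score=knn_distance, n=1):
    """Return the n highest-scoring readings seen by one node.

    own               : readings sampled by this node
    neighbor_readings : one list of readings per single-hop neighbor
    score             : any function (point, data) -> float, so the outlier
                        definition can be swapped without changing the rest
    """
    pool = list(own)
    for readings in neighbor_readings:
        pool.extend(readings)
    ranked = sorted(pool, key=lambda x: score(x, pool), reverse=True)
    return ranked[:n]


# Hypothetical temperature readings; 35.0 is far from everything else.
print(local_outliers([20.1, 20.3, 35.0], [[19.9, 20.0], [20.2]]))
```

Passing a different `score` function, for example the distance from the pool mean, changes the outlier definition while leaving the collection step untouched.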
- Content Type Journal Article
- Category Regular Paper
- Pages 1-32
- DOI 10.1007/s10115-011-0474-5
- Authors
- Joel W. Branch, Middleware and Application Transformation Department, IBM T. J. Watson Research Center, Hawthorne, NY 10532, USA
- Chris Giannella, The MITRE Corporation, 300 Sentinel Dr Suite 600, Annapolis Junction, MD 20701, USA
- Boleslaw Szymanski, Network Science and Technology Center and Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
- Ran Wolff, Department of Information Systems, University of Haifa, Haifa, Israel
- Hillol Kargupta, Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County, Baltimore, MD 21250, USA
-
Query-dependent cross-domain ranking in heterogeneous network
Abstract The traditional learning-to-rank problem mainly focuses on a single type of object. However, with the rapid growth of Web
2.0, ranking over multiple interrelated and heterogeneous objects has become common, e.g., in a heterogeneous academic
network. In this scenario, one may have plentiful training data for some types of objects (e.g., conferences) but very little
for the object types of interest (e.g., authors). Thus, two important questions arise: (1) Given a networked data set,
how can one borrow supervision from other types of objects to build an accurate ranking model for the objects of interest,
for which supervision is insufficient? (2) If there are links between different objects, how can we exploit their relationships
for improved ranking performance? In this work, we first propose a regularized framework called HCDRank to simultaneously
minimize two loss functions related to these two domains. Then, we extend the approach by exploiting the link information
between heterogeneous objects. We conduct a theoretical analysis of the proposed approach and derive its generalization bound
to demonstrate how the two related domains could help each other in learning ranking functions. Experimental results on three
different genres of data sets demonstrate the effectiveness of the proposed approaches.
- Content Type Journal Article
- Category Regular Paper
- Pages 1-37
- DOI 10.1007/s10115-011-0472-7
- Authors
- Bo Wang, Department of Computer Science, Nanjing University of Aeronautics and Astronautics, Nanjing, China
- Jie Tang, Department of Computer Science, Tsinghua University, 100084 Beijing, China
- Wei Fan, IBM T.J. Watson Research Center, New York, USA
- Songcan Chen, Department of Computer Science, Nanjing University of Aeronautics and Astronautics, Nanjing, China
- Chenhao Tan, Department of Computer Science, Tsinghua University, 100084 Beijing, China
- Zi Yang, Department of Computer Science, Tsinghua University, 100084 Beijing, China