• Volume 6, Issue 4, 2012 Table of Contents
    • Special Issue of Advances on E-Commerce Techniques in Big Data Era
    • Preface

      2012, 6(4):473-474.

      Abstract: We are in an era of "Big Data". The rapid accumulation of data from trading transactions, user feedback and social media poses serious challenges and brings new opportunities to e-commerce research and applications. It was in this context that the 2nd Nanjing International Summit Forum on e-Commerce was held on 29 October 2011. The forum attracted more than 300 attendees from government, education, industry and academia. Seven established domestic and overseas scholars from the database, data mining and machine learning areas gave keynote speeches at the forum. This special issue is composed of five invited contributions based on the corresponding keynote speeches. It covers various technical aspects of e-commerce, including skyline computation, microblog information extraction, clustering, classification and service negotiation.

    • Skyline: Stacking Optimal Solutions in Exact and Uncertain Worlds

      2012, 6(4):475-493.

      Abstract: In many applications involving multi-criteria optimal decision making, users often want to make a personal trade-off among all optimal solutions to select the one object that best fits their needs. As a key feature, the skyline in a multi-dimensional space provides a minimal set of candidates for such purposes by removing every object that is not preferred by any (monotonic) utility/scoring function; that is, the skyline removes all objects not preferred by any user, no matter how their preferences vary. Due to its importance, the problem of skyline computation and its variants has been extensively studied in the database literature. In this paper, we provide a comprehensive survey of skyline computation techniques. Specifically, we first introduce skyline computation algorithms on traditional (exact) data, where each object corresponds to a point in a multi-dimensional space. Then, we discuss skyline models and efficient algorithms for handling uncertain data, which is inherent in many important applications. Finally, we briefly describe a few variants of the skyline (e.g., skycube, k-skyband and reverse skyline).
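
      As an illustration of the skyline on exact data, the following is a minimal sketch and not one of the algorithms surveyed in the paper. It assumes that smaller values are better in every dimension and uses a naive quadratic scan; the function names and the sample (price, distance) points are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a dominates b.

    Assumption for this sketch: smaller is better in every dimension.
    a dominates b when a is no worse everywhere and strictly better somewhere.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def k_skyband(points: List[Point], k: int) -> List[Point]:
    """k-skyband: points dominated by fewer than k other points.

    With k = 1 this is the skyline itself.
    """
    return [p for p in points if sum(dominates(q, p) for q in points) < k]


# Hypothetical hotels described by (price, distance to the beach).
hotels: List[Point] = [(50, 8), (80, 2), (60, 5), (90, 6), (70, 5)]
sky = skyline(hotels)
band2 = k_skyband(hotels, 2)
print(sky)
print(band2)
```

      Under any monotonic scoring function over price and distance, the best hotel is always one of the skyline points, which is why the skyline serves as a minimal candidate set.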

    • Information Extraction From Microblogs: A Survey

      2012, 6(4):495-522.

      Abstract: Microblogging (e.g., Twitter, http://twitter.com) is a new form of online communication in which users talk about their daily lives, publish opinions or share information through short posts. It has become one of the most popular social networking services today, making it a potentially large information base that is attracting increasing attention from researchers in the field of knowledge discovery and data mining. In this paper, we survey existing research on information extraction from microblogging services and its applications, and then discuss some promising directions for future work. We specifically analyze three types of information: personal, social and travel information.

    • Fast Co-Clustering by Ranking and Sampling

      2012, 6(4):523-534.

      Abstract: Co-clustering treats a data matrix in a symmetric fashion, such that a partitioning of rows can induce a partitioning of columns, and vice versa. It has been shown to be advantageous over traditional clustering. However, the computational complexity of most co-clustering algorithms is high, which limits their effectiveness on large datasets. A recently proposed sampling-based matrix decomposition method can achieve linear computational complexity, but the selected rows and columns cannot effectively represent a large sparse dataset, and many unselected rows and columns cannot be mapped to the selected ones because they share no features in common; its performance is therefore impaired. To address this problem, we propose a fast co-clustering framework based on ranking and sampling, in which only representative samples are selected for co-clustering and the remaining samples can be easily labeled by their neighbors among the clustered samples. Extensive experiments on large text datasets show that our approach is able to use very few samples to achieve, in linear time, results comparable to those of state-of-the-art co-clustering algorithms with nonlinear computational complexity.
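
      The final step described above, labeling the remaining samples by their neighbors among the clustered samples, could look roughly like the sketch below. This is not the paper's implementation: the use of cosine similarity, the single-nearest-neighbor rule, the sparse dictionary representation and all names are assumptions made for illustration.

```python
import math
from typing import Dict, List

# A sparse row maps a feature (column) index to a weight, e.g. a term count.
SparseRow = Dict[int, float]


def cosine(a: SparseRow, b: SparseRow) -> float:
    """Cosine similarity between two sparse rows (0.0 if either row is empty)."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def label_by_nearest(
    sampled: List[SparseRow],
    sampled_labels: List[int],
    remaining: List[SparseRow],
) -> List[int]:
    """Give each unsampled row the cluster label of its most similar sampled row."""
    labels = []
    for row in remaining:
        best = max(range(len(sampled)), key=lambda i: cosine(row, sampled[i]))
        labels.append(sampled_labels[best])
    return labels


# Two hypothetical sampled rows that were already co-clustered into clusters 0 and 1.
sampled_rows = [{0: 1.0, 1: 1.0}, {2: 1.0, 3: 1.0}]
sampled_row_labels = [0, 1]
unsampled_rows = [{0: 2.0}, {2: 0.5, 3: 1.0}]
assigned = label_by_nearest(sampled_rows, sampled_row_labels, unsampled_rows)
print(assigned)
```

      The sketch also shows why sample selection matters: an unsampled row that shares no features with any sampled row has zero similarity to all of them, so its label is arbitrary.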

    • Classifying Incomplete Data Using Group Difference Detection with Parimputation Approach

      2012, 6(4):535-552.

      Abstract: We propose an efficient approach for classifying an insufficient dataset with missing data (incomplete data) using group difference detection. Specifically, missing data in the insufficient dataset are first completed with the parimputation strategy. The insufficient dataset is then grouped by contrasting it with a known dataset (transfer learning). Finally, to assess the quality of the induced models, empirical likelihood (EL) inference is used to estimate the confidence intervals of structural differences between the insufficient dataset and the known dataset. Classifying incomplete data in this way can benefit industry by making the use of information easier and smarter. Examples include evaluating a new medical product for pharmaceutical companies by detecting differences between the new product and an old one, and identifying fraud by detecting abnormal operations. To illustrate the benefits experimentally, we evaluate the proposed approach on UCI datasets and demonstrate that our method works much better than the bootstrap resampling method on tasks such as distinguishing spam from non-spam emails and benign breast cancer cases from malignant ones.

    • Cooperative-Competitive Healthcare Service Negotiation

      2012, 6(4):553-570.

      Abstract: Service negotiation is a complex activity, especially in complex domains such as healthcare. The provision of healthcare services typically involves the coordination of several professionals with different skills and locations. Healthcare service providers usually need to negotiate with one another, because different services have specific constraints, variables and features (scheduling, waiting lists, availability of resources, etc.) that may conflict with each other. While automating the negotiation process with software can improve the efficiency and quality of healthcare services, most existing automated negotiation approaches are positional bargaining in nature and are not suitable for complex healthcare scenarios. This paper proposes a cooperative-competitive negotiation model that enables negotiating parties to share their knowledge and work toward optimal solutions. In this model, patients and healthcare providers work together to develop a patient-centered treatment plan. We further automate the new negotiation model with software agents.