Identifying Suspicious Bidders Utilizing Hierarchical Clustering and Decision Trees

Authors

  • Benjamin J. Ford
  • Haiping Xu
  • Iren Valova
Abstract

Identifying bidders whose activity suggests possible online auction fraud is difficult because of the large number of users participating in online auctions. To reduce the number of users that must be investigated, we examine observable features of a bidder’s behavior and use a hierarchical clustering technique to divide a collection of bidders into normal and deviant groups. Based on the clustering results, we generate a decision tree that can efficiently classify new bidders as normal, suspicious, or highly suspicious. To illustrate the effectiveness of the proposed approach, we collected real datasets from online auctions and used 3-fold cross-validation to show that the error rates of the generated decision trees are reasonably low.
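The two-stage idea in the abstract (cluster first, then learn a tree from the cluster labels) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the two features (bids per auction, fraction of bids placed early), the toy data, the average-linkage criterion, and the helper names `agglomerative`, `best_split`, and `fit_stump` are all hypothetical, and the tree is reduced to a single split with two classes instead of the three classes named in the abstract.

```python
from itertools import combinations
from math import dist


def agglomerative(points, k):
    """Bottom-up clustering: merge the two closest clusters until k remain.
    Average linkage is an assumption; the abstract does not name a criterion."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] += clusters[b]  # a < b, so deleting b keeps a valid
        del clusters[b]
    return clusters


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels)) if n else 0.0


def best_split(points, labels):
    """Return (feature index, threshold) with the lowest weighted Gini impurity."""
    best = None
    for f in range(len(points[0])):
        values = sorted({p[f] for p in points})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [l for p, l in zip(points, labels) if p[f] <= t]
            right = [l for p, l in zip(points, labels) if p[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best[1], best[2]


def fit_stump(points, labels):
    """One-level decision tree: a single split with a majority label per side."""
    f, t = best_split(points, labels)

    def majority(side):
        return max(set(side), key=side.count)

    left = majority([l for p, l in zip(points, labels) if p[f] <= t])
    right = majority([l for p, l in zip(points, labels) if p[f] > t])
    return lambda p: left if p[f] <= t else right


# Hypothetical bidders: (bids per auction, fraction of bids placed early).
bidders = [(2, 0.10), (3, 0.20), (2, 0.15), (4, 0.10), (3, 0.05), (2, 0.20),
           (15, 0.80), (18, 0.90)]

# Stage 1: split the bidders into two groups; treat the smaller one as deviant.
clusters = agglomerative(bidders, 2)
deviant = set(min(clusters, key=len))
labels = ["deviant" if i in deviant else "normal" for i in range(len(bidders))]

# Stage 2: learn a classifier from the cluster labels and apply it to new bidders.
classify = fit_stump(bidders, labels)
print(classify((16, 0.7)), classify((3, 0.1)))
```

Treating the smaller cluster as the deviant one is a convenience for this toy data. In practice the features would likely need scaling before distances are computed, and a full tree learner would be used so that more than two classes can be separated.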


Similar resources

A Real-Time Self-Adaptive Classifier for Identifying Suspicious Bidders in Online Auctions

With the significant increase of available item listings in popular online auction houses nowadays, it becomes nearly impossible to manually investigate the large amount of auctions and bidders for shill bidding activities, which are a major type of auction fraud in online auctions. Automated mechanisms such as data mining techniques were proven to be necessary to process this type of increasin...


Classification and Cluster Analysis of Complex Time-of-Flight Secondary Ion Mass Spectrometry for Biological Samples

Identifying and separating subtly different biological samples is one of the most critical tasks in biological analysis. Time-of-flight secondary ion mass spectrometry (ToF-SIMS) is becoming a popular and important technique in the analysis of biological samples, because it can detect molecular information and characterize chemical composition. ToF-SIMS spectra of biological samples are enormou...


Clustering Trees with Instance Level Constraints

Constrained clustering investigates how to incorporate domain knowledge in the clustering process. The domain knowledge takes the form of constraints that must hold on the set of clusters. We consider instance level constraints, such as must-link and cannot-link. This type of constraints has been successfully used in popular clustering algorithms, such as k-means and hierarchical agglomerative ...


Inferring Hierarchical Clustering Structures by Deterministic Annealing

The unsupervised detection of hierarchical structures is a major topic in unsupervised learning and one of the key questions in data analysis and representation. We propose a novel algorithm for the problem of learning decision trees for data clustering and related problems. In contrast to many other methods based on successive tree growing and pruning, we propose an objective function for tree...




Publication date: 2010