Search results for: high dimensional clustering

Number of results: 2463052

2013
David C. Hunn Clark F. Olson

We present the results of a thorough evaluation of the subspace clustering algorithm SEPC using the OpenSubspace framework. We show that SEPC outperforms competing projected and subspace clustering algorithms on synthetic and some real-world data sets. We also show that SEPC can be used to effectively discover clusters with overlapping objects (i.e., subspace clustering).

2009
Lutz Herrmann Alfred Ultsch

Swarm Based clustering (SBC) is a promising nature-inspired technique. A swarm of stochastic agents performs the task of clustering high-dimensional data on a low-dimensional output space. Most SBC methods are derivatives of the Ant Colony Clustering (ACC) approach proposed by Lumer and Faieta. Compared to clustering on Emergent Self-Organizing Maps (ESOM) these methods usually perform poorly i...

2012
Ivan Sudos

High dimensional data is often analysed by resorting to its distribution properties in subspaces. Subspace clustering is a powerful method for elicitation of high dimensional data features. The result of subspace clustering can be an essential base for building indexing structures and further data search. However, a high number of subspaces and data instances can conceal a high number of subspace...

2012
Jinsong Leng Edith Cowan

Many real applications are required to detect outliers in high dimensional data sets. The major difficulty of mining outliers lies in the fact that outliers are often embedded in subspaces. No efficient methods are available in general for subspace-based outlier detection. Most existing subspace-based outlier detection methods identify outliers by searching for abnormal sparse density units in s...

2014
V. Atchaya C. Prakash

An efficient Actionable 3D Subspace Clustering based on Optimal Centroids from continuous-valued data represented three-dimensionally, which is suitable for real-world problems such as profitable stock discovery, biologically significant protein residues, etc. It achieves actionable patterns and incorporation of domain knowledge, which allows users to choose the preferred utility (profit/benefit) function, ...

2012
Stephan Günnemann

Clustering is an established data mining technique for grouping objects based on their mutual similarity. Since in today’s applications, however, usually many characteristics for each object are recorded, one cannot expect to find similar objects by considering all attributes together. In contrast, valuable clusters are hidden in subspace projections of the data. As a general solution to this p...

2009
Tat-Jun Chin Hanzi Wang David Suter

We present a novel and highly effective approach for multi-body motion segmentation. Drawing inspiration from robust statistical model fitting, we estimate putative subspace hypotheses from the data. However, instead of ranking them we encapsulate the hypotheses in a novel Mercer kernel which elicits the potential of two point trajectories to have emerged from the same subspace. The kernel perm...

2014
Jiti Gao Xiao Han Guangming Pan Yanrong Yang

Statistical inferences for sample correlation matrices are important in high dimensional data analysis. Motivated by this, this paper establishes a new central limit theorem (CLT) for a linear spectral statistic (LSS) of high dimensional sample correlation matrices for the case where the dimension p and the sample size n are comparable. This result is of independent interest in large dimensiona...

Journal: Statistics and Computing 2014
Yi Yu Yang Feng

In high-dimensional data analysis, penalized likelihood estimators are shown to provide superior results in both variable selection and parameter estimation. A new algorithm, APPLE, is proposed for calculating the Approximate Path for Penalized Likelihood Estimators. Both convex penalties (such as LASSO) and folded concave penalties (such as MCP) are considered. APPLE efficiently computes the s...

2016
Kostas Stefanidis Eirini Ntoutsi

In this work, we address the problem of contextual recommendations by exploiting the concept of fault-tolerant subspace clustering. Specifically, we pre-partition users that have similarly rated subsets of data items into clusters and associate with each cluster a context situation. Context is defined as any internally stored information that can be used to characterize the data per se. Then, g...
