Search results for: k means cluster
Number of results: 880962
Uniform deviation bounds limit the difference between a model’s expected loss and its loss on an empirical sample uniformly for all models in a learning problem. As such, they are a critical component of empirical risk minimization. In this paper, we provide a novel framework to obtain uniform deviation bounds for unbounded loss functions. In our main application, this allows us to ob...
One of the most popular algorithms for clustering in Euclidean space is the k-means algorithm; k-means is difficult to analyze mathematically, and few theoretical guarantees are known about it, particularly when the data is well-clustered. In this paper, we attempt to fill this gap in the literature by analyzing the behavior of k-means on well-clustered data. In particular, we study the case wh...
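As an illustration of the k-means procedure this abstract refers to, the following is a minimal sketch of the standard Lloyd iteration on a small, well-separated toy data set. It is not the paper's analysis or code; the function name `kmeans`, its parameters, and the sample `points` are assumptions chosen for the example.

```python
import random


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm on a list of equal-length numeric tuples.

    Hypothetical helper for illustration; initial centers are k distinct
    input points drawn with a fixed seed.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center
        # (squared Euclidean distance, ties go to the lower index).
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the iteration has converged
        labels = new_labels
        # Update step: move each center to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels


# Toy data (assumed): two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
centers, labels = kmeans(points, k=2)
```

On data this clearly separated, the iteration settles on one center per group regardless of which two points are drawn as the initial centers; on less separated data the result can depend on the initialization.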
This work explores the use of statistical techniques, namely stratified sampling and cluster analysis, as powerful tools for deriving traffic properties at the flow level. Our results show that the adequate selection of samples leads to significant improvements allowing further important statistical analysis. Although stratified sampling is a well-known technique, the way we classify the data p...
Although the study of clustering is centered around an intuitively compelling goal, it has been very difficult to develop a unified framework for reasoning about it at a technical level, and profoundly diverse approaches to clustering abound in the research community. Here we suggest a formal perspective on the difficulty in finding such a unification, in the form of an impossibility theorem: f...
It has been claimed that tone language speakers use fewer F0-related cues in the production of verbal expressions of emotion, because F0 is used in the production of lexical tones. This study investigated this claim by examining how F0 and various other acoustic parameters are used in the production of verbal emotion expressions in Cantonese (tone language) compared to English (non-tone...
To analyze the topics of a large number of web events, we propose an event topic analysis approach based on topic feature clustering and an extended LDA (Latent Dirichlet Allocation) model. The extended model is dimension LDA (DLDA), which integrates the topic probability of the LDA model. We represent an event as a multi-dimensional vector and use the DLDA model to select topic feature words in events. We aggregate...
This article examines several data mining approaches that perform short time series analysis. The basis of the methods is formed by clustering algorithms with or without modifications. The proposed methods implement short time series analysis when the numbers of the observations are not equal and the historical information is short. The inspected approaches are offered for solving complex tasks...
Author diarization is a new task introduced in PAN’16: identifying the portion(s) of text within a document written by multiple authors. This paper presents our proposed approach to the author diarization task. Various types of stylistic features, including lexical features, are used to uniquely identify an author. Furthermore, to find anomalous text within a single document, the ClustDist method is used. ...
The k-means algorithm is the most popular nonparametric clustering method in use, but cannot generally be applied to data sets with missing observations. The usual practice with such data sets is to either impute the values under an assumption of a missing-at-random mechanism or to ignore the incomplete records, and then to use the desired clustering method. We develop an efficient version of t...
Clustering is used to generate groupings of data from a large dataset, with the intention of representing the behavior of a system as accurately as possible. In this sense, clustering is applied in this work to extract useful information from the electricity price time series. To be precise, two clustering techniques, K-means and Expectation Maximization, have been utilized for the analysis of ...