Search results for: data stream algorithm

Number of results: 2964091

2015
Marwan Hassani Thomas Seidl

Clustering validation is a crucial part of choosing the clustering algorithm that performs best on given input data. Internal clustering validation is efficient and realistic, whereas external validation requires a ground truth that is not available in most applications. In this paper, we analyze the properties and performance of eleven internal clustering measures. In particular, as the importan...

2009
Show-Jane Yen Yue-Shi Lee Cheng-Wei Wu Chin-Lin Lin

Data mining refers to the process of revealing unknown and potentially useful information from a large database. Frequent itemset mining is one of the foundational problems in data mining: it aims to discover the sets of products that customers frequently purchase together, based on a transaction database. However, a large number of patterns may be generated from the database, and many of the...

2016
N. Sivakumar

A new model for online machine learning on high-speed data streams is proposed to reduce the severe restrictions associated with existing machine learning algorithms. Most of the existing models have three principal steps. In the first step, the system creates a model incrementally. In the second step the time taken by the examples to complete a prescribed procedure...

2013
Davide Simoncelli Maurizio Dusi Francesco Gringoli Saverio Niccolini

To cope with real-time data analysis as the amount of data exchanged over the network increases, one idea is to redesign algorithms originally implemented on the monitoring probe so that they work in a distributed manner over a stream-processing platform. In this paper we show a preliminary performance analysis of a Twitter trending algorithm when running over BlockMon, an open-source monitoring plat...

2018
Panagiotis Bouros Nikos Mamoulis

Interval joins find applications in several domains, including temporal and spatial databases, uncertain data management, and streaming data processing. In this paper, we study the evaluation of an interval count semi-join (ICSJ) operation that can be used for selecting or ranking intervals based on the number of join pairs they appear in. We extend the state-of-the-art algorithm for interval joi...
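
As a rough illustration of what an interval count semi-join computes (this is not the algorithm extended in the paper, whose details are cut off above), the sketch below counts, for each interval in one collection, how many intervals in a second collection overlap it. The function name `interval_count_semijoin`, the closed-interval convention, and the two-sorted-arrays approach are all assumptions made for this example.

```python
from bisect import bisect_left, bisect_right


def interval_count_semijoin(R, S):
    """For each closed interval (a, b) in R, count the intervals in S that overlap it.

    Assumes every interval is a (start, end) pair with start <= end.
    An interval (s, e) overlaps (a, b) exactly when s <= b and e >= a.
    """
    starts = sorted(s for s, _ in S)
    ends = sorted(e for _, e in S)
    counts = []
    for a, b in R:
        started_by_b = bisect_right(starts, b)  # intervals with start <= b
        ended_before_a = bisect_left(ends, a)   # intervals with end < a
        counts.append(started_by_b - ended_before_a)
    return counts


R = [(0, 3), (5, 10), (13, 14)]
S = [(1, 5), (4, 8), (10, 12)]
print(interval_count_semijoin(R, S))  # [1, 3, 0]
```

Every interval that ends before `a` must also have started before `b`, so subtracting the two counts leaves exactly the overlapping intervals. Ranking the intervals of `R` then reduces to sorting them by these counts.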

2011
Michael Hahsler Margaret H. Dunham

This paper describes one of the first attempts to model the temporal structure of massive data streams in real time using data stream clustering. Recently, many data stream clustering algorithms have been developed that efficiently find a partition of the data points in a data stream. However, these algorithms disregard the information represented by the temporal order of the data points in th...

2002
Moses Charikar Kevin Chen Martin Farach-Colton

We present a 1-pass algorithm for estimating the most frequent items in a data stream using very limited storage space. Our method relies on a novel data structure called a count sketch, which allows us to estimate the frequencies of all the items in the stream. Our algorithm achieves better space bounds than the previous best known algorithms for this problem for many natural distributions on ...
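
A minimal sketch of the count sketch idea mentioned in this abstract, under stated assumptions: each of several rows hashes an item to one bucket and to a sign of +1 or -1, updates add the signed count to that bucket, and the frequency estimate is the median of the signed bucket values across rows. The class name `CountSketch`, the `depth` and `width` parameters, and the use of salted BLAKE2b in place of pairwise-independent hash families are choices made for this illustration, not details taken from the paper.

```python
import hashlib
from statistics import median


class CountSketch:
    """Illustrative count sketch: depth rows of width signed counters."""

    def __init__(self, depth=5, width=256):
        self.depth = depth
        self.width = width
        self.table = [[0] * width for _ in range(depth)]

    def _hash(self, row, item):
        # Assumption: a salted BLAKE2b digest stands in for the per-row
        # bucket hash and sign hash; the row index serves as the salt.
        digest = hashlib.blake2b(
            repr(item).encode(),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        value = int.from_bytes(digest, "little")
        bucket = (value >> 1) % self.width
        sign = 1 if value & 1 else -1
        return bucket, sign

    def update(self, item, count=1):
        # One pass over the stream: each arrival touches one counter per row.
        for row in range(self.depth):
            bucket, sign = self._hash(row, item)
            self.table[row][bucket] += sign * count

    def estimate(self, item):
        # Median of the per-row signed counters limits the effect of collisions.
        values = []
        for row in range(self.depth):
            bucket, sign = self._hash(row, item)
            values.append(sign * self.table[row][bucket])
        return median(values)


sketch = CountSketch(depth=5, width=64)
for token in ["a"] * 100 + ["b"]:
    sketch.update(token)
print(sketch.estimate("a"))
```

Memory use is `depth * width` counters regardless of stream length. To report the most frequent items, one would typically keep a small heap of candidates alongside the sketch and refresh it with `estimate` as items arrive.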

Journal: CoRR 2016
Jorge Luis Rivero Pérez Yaimara Peñate Santana Pedro Harenton Martínez López

Data mining has been widely used to identify potential customers for a new product or service. This article presents a study of previous work on the application of data mining methodologies to software projects, specifically direct marketing projects. Several data sets of demographic and historical customer purchase data are available for evaluation of algorithms in this area, some...

2010
Haixun Wang Philip S. Yu Jiawei Han

Knowledge discovery from infinite data streams is an important and difficult task. We face two challenges: the overwhelming volume and the concept drift of the streaming data. In this chapter, we introduce a general framework for mining concept-drifting data streams using weighted ensemble classifiers. We train an ensemble of classification models, such as C4.5, RIPPER, naive Bayesian, et...

Journal: Data Knowl. Eng. 2009
Nishad Manerikar Themis Palpanas

The problem of detecting frequent items in streaming data is relevant to many applications across many domains. Several algorithms, diverse in nature, have been proposed in the literature to solve this problem. In this paper, we review these algorithms, and we present the results of the first extensive comparative experimental study of the most prominent algorithms in ...
