data sampling

Search results for: data sampling

Number of results: 2525723

Data Interpolation: An Efficient Sampling Alternative for Big Data Aggregation

Journal: CoRR 2012

Hadassa Daltrophe, Shlomi Dolev, Zvi Lotker

Given a large set of measurement sensor data, in order to identify a simple function that captures the essence of the data gathered by the sensors, we suggest representing the data by (spatial) functions, in particular by polynomials. Given a (sampled) set of values, we interpolate the datapoints to define a polynomial that would represent the data. The interpolation is challenging, since in pr...

Sampling TCP Data-Path Quality with TCP Data Probes

2009

Rocky K. C. Chang, Edmond W. W. Chan, Xiapu Luo

In this paper, we present preliminary results of measuring TCP data-path quality using a new measurement tool called OneProbe. Unlike the existing tools, OneProbe uses legitimate TCP data probes to profile TCP data-path quality by sampling round-trip delay, one-way loss rate, and one-way reordering rate at the same time. This paper presents a set of recent measurement studies on a set of web se...

Combining Probability and Non-Probability Sampling Methods: Model-Aided Sampling and the O*NET Data Collection Program

Journal: Survey Practice 2009

An Effective Data Sampling Procedure for Imbalanced Data Learning on Health Insurance Fraud Detection

Journal: Journal of Computing and Information Technology 2021

Fraud detection has received considerable attention from academic researchers and industry worldwide due to its increasing popularity. Insurance datasets are enormous, with skewed distributions and high dimensionality. Skewed class distribution and volume are considered significant problems when analyzing insurance datasets, as these issues increase the misclassification rates. Although sampling appro...

Preferential sampling for presence/absence data and for fusion of presence/absence data with presence‐only data

Journal: Ecological Monographs 2019

Comparison of Data Sampling Approaches for Imbalanced Bioinformatics Data

2014

David J. Dittman, Taghi M. Khoshgoftaar, Randall Wald, Amri Napolitano

Class imbalance is a frequent problem found in bioinformatics datasets. Unfortunately, the minority class is usually also the class of interest. One of the methods to improve this situation is data sampling. There are a number of different data sampling methods, each with their own strengths and weaknesses, which makes choosing one a difficult prospect. In our work we compare three data samplin...

Supervised sampling for clustering large data sets

2010

Ioannis Kosmidis

The problem of clustering large data sets has attracted a lot of current research. The approaches taken are mainly based either on the more efficient implementation or modification of existing methods and/or on the construction of clusters from a small sub-sample of the data and then the assignment of all observations to those clusters. The current paper focuses on the latter direction. An alte...

KSample: Dynamic Sampling Over Unbounded Data Streams

Journal: JIDM 2015

Tiago Rodrigo Kepe, Eduardo Cunha de Almeida, Thomas Cerqueus

Data sampling over data streams is a common practice to allow the analysis of data in real time. However, sampling over data streams becomes complex when the stream does not fit in memory, and worse yet, when the length of the stream is unknown. A well-known technique for sampling data streams is Reservoir Sampling. It requires a fixed-size reservoir that corresponds to the resulting sample s...
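
For readers unfamiliar with the Reservoir Sampling technique named in this abstract, the following is a minimal sketch of the classic fixed-size variant (often called Algorithm R). It is not the KSample method of the listed paper, whose details are not given here. The function name `reservoir_sample` and its parameters are illustrative assumptions.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    stream: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Keep a uniform random sample of up to k items from a stream.

    The stream is read once and its length need not be known in advance.
    (Hypothetical helper for illustration; classic Algorithm R.)
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the fixed-size reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; the new item replaces a reservoir
            # entry only if the slot falls inside the reservoir.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir
```

A call such as `reservoir_sample(iter(range(1000)), 10)` returns ten distinct values from the stream. If the stream holds fewer than `k` items, all of them are returned in order.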

Static Versus Dynamic Sampling for Data Mining

1996

George H. John, Pat Langley

As data warehouses grow to the point where one hundred gigabytes is considered small, the computational efficiency of data-mining algorithms on large databases becomes increasingly important. Using a sample from the database can speed up the data-mining process, but this is only acceptable if it does not reduce the quality of the mined knowledge. To this end, we introduce the “Probably Close Eno...

Deterministic algorithms for sampling count data

Journal: Data Knowl. Eng. 2008

Hüseyin Akcan, Alex Astashyn, Hervé Brönnimann

Processing and extracting meaningful knowledge from count data is an important problem in data mining. The volume of data is increasing dramatically as the data is generated by day-to-day activities such as market basket data, web clickstream data or network data. Most mining and analysis algorithms require multiple passes over the data, which requires extreme amounts of time. One solution to s...
