Search results for: apache spark

Number of results: 18089

2017
Ikram Ul Haq, Eike Schallehn, Xiao Chen

Entity Resolution is among the hottest topics in the field of Big Data. It finds duplicate records in datasets, that is, records that belong to the same real-world entity. Algorithms that perform Entity Resolution are computationally intensive and time-consuming, especially for large datasets. A lot of research has been conducted on improving Entity Resolution solutions. A number of algorithms are deve...

Journal: Modeling and Analysis of Information Systems 2016

Journal: Advances in Engineering Software 2021

Reference architectures for Big Data, machine learning and stream processing include not only recommended practices and interconnected building blocks but also considerations of scalability, availability, manageability, and security. However, the automated deployment of multi-VM platforms on various clouds leveraging such reference architectures may raise several issues. The paper focuses particularly on the widespread Apa...

Journal: International Journal of Engineering & Technology 2018

Journal: IEEE Transactions on Big Data 2022

This article presents a new fast, highly scalable distributed matrix multiplication algorithm on Apache Spark, called Stark, based on Strassen’s algorithm. Stark preserves the seven-multiplication scheme in a distributed environment and thus achieves asymptotically faster execution time. It creates a recursion tree of computation where each level of the tree corresponds to the division and combination of blocks stored in the form of Res...
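
The seven-multiplication scheme mentioned in this abstract can be illustrated with a minimal single-machine sketch of the classic Strassen recursion. This is not the Stark implementation and uses no Spark API; the helper names (`add`, `sub`, `split`, `strassen`) are hypothetical, and the sketch assumes square matrices whose size is a power of two, stored as plain Python lists.

```python
def add(A, B):
    """Element-wise sum of two equally sized matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def sub(A, B):
    """Element-wise difference of two equally sized matrices."""
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def split(M):
    """Divide a square matrix into its four quadrant blocks."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def strassen(A, B):
    """Multiply two n x n matrices (n a power of two, an assumption
    of this sketch) using seven recursive block products, not eight."""
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]

    # Division step: one level of the recursion tree.
    a11, a12, a21, a22 = split(A)
    b11, b12, b21, b22 = split(B)

    # The seven block multiplications.
    m1 = strassen(add(a11, a22), add(b11, b22))
    m2 = strassen(add(a21, a22), b11)
    m3 = strassen(a11, sub(b12, b22))
    m4 = strassen(a22, sub(b21, b11))
    m5 = strassen(add(a11, a12), b22)
    m6 = strassen(sub(a21, a11), add(b11, b12))
    m7 = strassen(sub(a12, a22), add(b21, b22))

    # Combination step: assemble the four result quadrants.
    c11 = add(sub(add(m1, m4), m5), m7)
    c12 = add(m3, m5)
    c21 = add(m2, m4)
    c22 = add(add(sub(m1, m2), m3), m6)

    top = [left + right for left, right in zip(c11, c12)]
    bottom = [left + right for left, right in zip(c21, c22)]
    return top + bottom
```

For example, `strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]])` returns `[[19, 22], [43, 50]]`. Replacing eight block products with seven at every level gives a cost of roughly O(n^2.81) instead of O(n^3), which is the asymptotic gain the abstract refers to.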

2016
Harish S. Bhat, R. W. M. A. Madushani, Shagun Rawat

In this paper, we consider the problem of Bayesian filtering and inference for time series data modeled as noisy, discrete-time observations of a stochastic differential equation (SDE) with undetermined parameters. We develop a Metropolis algorithm to sample from the high-dimensional joint posterior density of all SDE parameters and state time series. Our approach relies on an innovative densit...
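
As a rough illustration of the kind of sampler named in this abstract, here is a minimal random-walk Metropolis sketch for a one-dimensional target. It is not the authors' method: their sampler targets the joint posterior of SDE parameters and states, while this sketch only shows the generic accept/reject step. The function name `metropolis`, its parameters, and the standard-normal example target are all assumptions made for illustration.

```python
import math
import random


def metropolis(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a 1-D unnormalized log-density.

    log_target: function returning the log of the target density up to
    an additive constant (hypothetical interface for this sketch).
    """
    rng = random.Random(seed)
    x, lp = x0, log_target(x0)
    samples = []
    for _ in range(n_samples):
        # Symmetric Gaussian proposal centred on the current state.
        proposal = x + rng.gauss(0.0, step)
        lp_proposal = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(x)).
        if lp_proposal >= lp or rng.random() < math.exp(lp_proposal - lp):
            x, lp = proposal, lp_proposal
        # A rejected proposal repeats the current state in the chain.
        samples.append(x)
    return samples


# Example target: standard normal, log-density -x^2 / 2 up to a constant.
draws = metropolis(lambda x: -0.5 * x * x, x0=0.0, n_samples=20000)
```

Because the proposal is symmetric, the acceptance ratio needs only the ratio of target densities, so the normalizing constant of the posterior never has to be computed.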

2016
Alexandros Baltas, Andreas Kanavos, Athanasios K. Tsakalidis

Sentiment Analysis on Twitter data is a challenging problem due to the nature, diversity and volume of the data. In this work, we implement a system on Apache Spark, an open-source framework for programming with Big Data. The sentiment analysis tool is based on Machine Learning methodologies alongside Natural Language Processing techniques and utilizes Apache Spark’s Machine learning libra...
