Apache Spark

Search results for: apache spark

Number of results: 18089

NORA: Scalable OWL reasoner based on NoSQL databases and Apache Spark

Journal: Software - Practice and Experience, 2023

Reasoning is the process of inferring new knowledge and identifying inconsistencies within ontologies. Traditional techniques often prove inadequate when reasoning over large Knowledge Bases containing millions or billions of facts. This article introduces NORA, a persistent and scalable OWL reasoner built on top of Apache Spark, designed to address the challenges of reasoning over extensive and complex ontologies. NORA exploits the sca...

Statistical analysis of the performance of four Apache Spark ML algorithms

Journal: Journal of Computer Science and Technology, 2022

Feature selection (FS) techniques generally require repeatedly training and evaluating models to assess the importance of each feature for a particular task. However, due to the increasing size of currently available databases, distributed processing has become a necessity for many tasks. In this context, the Apache SparkML library is one of the most widely used libraries for performing classification and other tasks with larg...

SparkGA2: Production-quality memory-efficient Apache Spark based genome analysis framework

Journal: PLOS ONE, 2019

Machine Learning Approach on Apache Spark for Credit Card Fraud Detection

Journal: Ingénierie des systèmes d'information, 2020

Hybrid Machine Learning-Based Approach for Anomaly Detection using Apache Spark

Journal: International Journal of Advanced Computer Science and Applications, 2023

Over the past few decades, the volume of data has increased significantly in both scientific institutions and universities, with a large number of students enrolled and a high volume of related data. Furthermore, network traffic has increased significantly post-pandemic with the use of online learning. Therefore, processing network traffic data is a complex and challenging task that increases the possibility of intrusions and anomalies. Traditional security systems cannot deal with such high-speed ...

Efficient Group K Nearest-Neighbor Spatial Query Processing in Apache Spark

Journal: ISPRS International Journal of Geo-Information, 2021

Aiming at the problem of spatial query processing in distributed computing systems, the design and implementation of new algorithms is a current challenge. Apache Spark is a memory-based framework suitable for real-time and batch processing. Spark-based systems allow users to work on in-memory data, without worrying about the data distribution mechanism and fault-tolerance. Given two datasets of points (called Query and Trai...

Towards Engineering a Web-Scale Multimedia Service: A Case Study Using Spark

2017

Gylfi Þór Guðmundsson Laurent Amsaleg Björn Þór Jónsson Michael J. Franklin

Computing power has now become abundant with multi-core machines, grids and clouds, but it remains a challenge to harness the available power and move towards gracefully handling web-scale datasets. Several researchers have used automatically distributed computing frameworks, notably Hadoop and Spark, for processing multimedia material, but mostly using small collections on small clusters. In t...

Avoiding communication in primal and dual block coordinate descent methods

Journal: CoRR, 2016

Aditya Devarakonda Kimon Fountoulakis James Demmel Michael W. Mahoney

Primal and dual block coordinate descent methods are iterative methods for solving regularized and unregularized optimization problems. Distributed-memory parallel implementations of these methods have become popular in analyzing large machine learning datasets. However, existing implementations communicate at every iteration which, on modern data center and supercomputing architectures, often ...

Architectural Impact on Performance of In-memory Data Analytics: Apache Spark Case Study

Journal: CoRR, 2016

Ahsan Javed Awan Mats Brorsson Vladimir Vlassov Eduard Ayguadé

While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics for being a unified framework for both, batch and stream data processing. However, recent studies on micro-architectural characterization of in-memory data analytics are limited to only batch processing workloads. We ...

Query-able Kafka: An agile data analytics pipeline for mobile wireless networks

Journal: PVLDB, 2017

Eric Falk Vijay K. Gurbani Radu State

Due to their promise of delivering real-time network insights, today’s streaming analytics platforms are increasingly being used in the communications networks where the impact of the insights go beyond sentiment and trend analysis to include real-time detection of security attacks and prediction of network state (i.e., is the network transitioning towards an outage). Current streaming analytic...
