نتایج جستجو برای: apache spark
تعداد نتایج: 18089 فیلتر نتایج به سال:
The number and the size of linked open data graphs keep growing at a fast pace and confronts semantic RDF services with problems characterized as Big data. Distributed query processing is one of them and needs to be efficiently addressed with execution guaranteeing scalability, high availability and fault tolerance. RDF data management systems requiring these properties are rarely built from sc...
We benchmarked Apache Spark with a popular parallel machine learning training application, Distributed Stochastic Gradient Descent for Matrix Factorization [5] and compared the Spark implementation with alternative approaches for communicating model parameters, such as scheduled pipelining using POSIX socket or MPI, and distributed shared memory (e.g. parameter server [13]). We found that Spark...
Access plan recommendation is a query optimization approach that executes new queries using prior created execution plans (QEPs). The optimizer divides the space into clusters in mentioned method. However, traditional clustering algorithms take significant amount of time for such large datasets. MapReduce distributed computing model provides efficient solutions storing and processing vast quant...
Real-time text processing systems are required in many domains to quickly identify patterns, trends, sentiments, and insights. Nowadays, social networks, e-commerce stores, blogs, scientific experiments, and server logs are main sources generating huge text data. However, to process huge text data in real time requires building a data processing pipeline. The main challenge in building such pip...
We introduce SparkCL, an open source unified programming framework based on Java, OpenCL and the Apache Spark framework. The motivation behind this work is to bring unconventional compute cores such as FPGAs/GPUs/APUs/DSPs and future core types into mainstream programming use. The framework allows equal treatment of different computing devices under the Spark framework and introduces the abilit...
Apache Spark is a popular framework for data analytics with attractive features such as fault tolerance and interoperability with the Hadoop ecosystem. Unfortunately, many analytics operations in Spark are an order of magnitude or more slower compared to native implementations written with high performance computing tools such as MPI. There is a need to bridge the performance gap while retainin...
Apache Spark is a popular framework for writing large scale data processing applications. Our long term goal is to develop automatic tools for reasoning about Spark programs. This is challenging because Spark programs combine database-like relational algebraic operations and aggregate operations, corresponding to (nested) loops, with User Defined Functions (UDFs). In this paper, we present a no...
In this article we present a new algorithm for creating simplicial Vietoris-Rips complexes that is easily parallelizable using computation models like MapReduce and Apache Spark. The algorithm does not involve any computation in homology spaces.
نمودار تعداد نتایج جستجو در هر سال
با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید