linear speedup

Search results for: linear speedup

Number of results: 490347

Malleable task-graph scheduling with a practical speed-up model

2016

Loris Marchal, Bertrand Simon, Oliver Sinnen, Frédéric Vivien

Scientific workloads are often described by Directed Acyclic task Graphs. Indeed, DAGs represent both a theoretical model and the structure employed by dynamic runtime schedulers to handle HPC applications. A natural problem is then to compute a makespan-minimizing schedule of a given graph. In this paper, we are motivated by task graphs arising from multifrontal factorizations of sparse matric...

Fast Implementation of Morphological Filtering Using ARM NEON Extension

2017

Elena Limonova, Arseny Terekhin, Dmitry Nikolaev, Vladimir Arlazarov

In this paper we consider the speedup potential of morphological image filtering on ARM processors. Morphological operations are widely used in image analysis and recognition, and speeding them up can in some cases significantly reduce the overall execution time of recognition. More specifically, we propose a fast implementation of erosion and dilation using the ARM SIMD extension NEON. These operations with the...

An efficient quantum algorithm for generative machine learning

Journal: CoRR, 2017

Xun Gao, Zhengyu Zhang, Luming Duan

A central task in the field of quantum computing is to find applications where a quantum computer could provide exponential speedup over any classical computer [1–3]. Machine learning represents an important field with broad applications where a quantum computer may offer significant speedup [4–8]. Several quantum algorithms for discriminative machine learning [9] have been found based on efficient...

Hyper-Sparsity in the Revised Simplex Method and How to Exploit it

Journal: Comp. Opt. and Appl., 2005

J. A. Julian Hall, K. I. M. McKinnon

The revised simplex method is often the method of choice when solving large-scale sparse linear programming problems, particularly when a family of closely related problems is to be solved. Each iteration of the revised simplex method requires the solution of two linear systems and a matrix-vector product. For a significant number of practical problems the result of one or more of these operati...

Towards Low-Cost, High-Accuracy Classifiers for Linear Solver Selection

2009

Sanjukta Bhowmick, Brice Toth, Padma Raghavan

The time to solve linear systems depends to a large extent on the choice of the solution method and the properties of the coefficient matrix. Although there are several linear solution methods, in most cases it is impossible to predict a priori which linear solver would be best suited for a given linear system. Recent investigations on selecting linear solvers for a given system have explored th...

Temperature-Aware Leakage Estimation Using Piecewise Linear Power Models

Journal: IEICE Transactions, 2010

Yongpan Liu, Huazhong Yang

Due to the superlinear dependence of leakage power consumption on temperature, and spatial variations in on-chip thermal profiles, methods of leakage power estimation that are known to be accurate require detailed knowledge of thermal profiles. Leakage power depends on the integrated circuit (IC) thermal profile and circuit design style. Here, we show that piecewise linear models can be used to...

Parallel Linear Search with no Coordination for a Randomly Placed Treasure

Journal: CoRR, 2016

Amos Korman, Yoav Rodeh

In STOC’16, Fraigniaud et al. consider the problem of finding a treasure hidden in one of many boxes that are ordered by importance. That is, if a treasure is in a more important box, then one would like to find it faster. Assuming there are many searchers, the authors suggest that using an algorithm that requires no coordination between searchers can be highly beneficial. Indeed, besides savin...

Speedup, Communication Complexity, and Blocking - A La Recherche du Temps Perdu

1993

Dan C. Marinescu, John R. Rice

The paper investigates the time lost in a parallel computation due to sequential and duplicated work, communication and control, and blocking. It introduces the concept of relative speedup and proposes characterizations of parallel algorithms based upon the communication complexity and the blocking model. The paper discusses the impact of the processor's architecture upon the measured speedup. ...

An MPI-CUDA Implementation and Optimization for Parallel Sparse Equations and Least Squares (LSQR)

2012

He Huang, Liqiang Wang, En-Jui Lee, Po Chen

LSQR (Sparse Equations and Least Squares) is a widely used Krylov subspace method to solve large-scale linear systems in seismic tomography. This paper presents a parallel MPI-CUDA implementation for the LSQR solver. At the CUDA level, our contributions include: (1) utilize CUBLAS and CUSPARSE to compute major steps in LSQR; (2) optimize memory copy between host memory and device memory; (3) develop a ...

Optimized Cutting Plane Algorithm for Large-Scale Risk Minimization

Journal: Journal of Machine Learning Research, 2009

Vojtech Franc, Sören Sonnenburg

We have developed an optimized cutting plane algorithm (OCA) for solving large-scale risk minimization problems. We prove that the number of iterations OCA requires to converge to an ε-precise solution is approximately linear in the sample size. We also derive OCAS, an OCA-based linear binary Support Vector Machine (SVM) solver, and OCAM, a linear multi-class SVM solver. In an extensive empirica...
