processor blocking

Application Specific Processor Design for H.264 Decoder with a Configurable Embedded Processor

Journal: ETRI Journal 2005

An Object-Oriented Library for Shared-Memory Parallel Simulations

1996

Philip Machanick

Programming shared-memory multiprocessor systems is becoming increasingly difficult as the gap between memory speed and processor speed increases. At the same time, this class of computer—based on standard microprocessors—is becoming increasingly common as an alternative to traditional mainframes and supercomputers. Programs that are not sympathetic to caches can perform poorly on such systems....

Experimental Evaluation of Blocking and Non-Blocking Multithreaded Code Execution (Computer Science Technical Report)

1997

Murali Annavaram, Walid A. Najjar, Lucas Roh

The objective of multithreaded execution models is masking the latency of inter-processor communications and remote memory accesses in large-scale multiprocessors. Several such models combine aspects of dataflow-like execution with the von Neumann model in an attempt to provide both efficient synchronization (as in the dataflow model) and efficient exploitation of program locality (as in the von Neum...

An Integrated Temporal Partitioning and Mapping Framework for Improving Performance of a Reconfigurable Instruction Set Processor

Journal: Journal of Computer and Robotics

Farhad Mehdipour (Faculty of Information Science and Electrical Engineering, Department of Informatics, Kyushu University, Fukuoka, Japan); Hamid Noori (School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran); Morteza Saheb Zamani (Department of Computer Engineering and IT, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran); Hiroaki Honda (Institute of Systems, Information Technologies and Nanotechnologies, Fukuoka, Japan); Koji Inoue (Faculty of Information Science and Electrical Engineering, Department of Informatics, Kyushu University, Fukuoka, Japan); Kazuaki Murakami (Faculty of Information Science and Electrical Engineering, Department of Informatics, Kyushu University, Fukuoka, Japan)

Reconfigurable instruction set processors allow customization for an application domain by extending the core instruction set architecture. Extracting appropriate custom instructions is an important phase in implementing an application on a reconfigurable instruction set processor. A custom instruction (CI) is usually extracted from critical portions of applications and implemented on a reconf...

Evaluation of a Reconfigurable Architecture for DSP Applications

Thesis: Ministry of Science, Research and Technology - University of Tehran 1381 (Solar Hijri calendar)

Hadi Khani, Mohammadreza Movahedin, Hamidreza Shafiei

Over the past two decades, DSP processors have dominated the market for general-purpose DSP computation chips. Advances in digital circuit fabrication have increased the number of logic gates that can be implemented on a single chip. Processors (programmable architectures) appear unable to take full advantage of this new capacity and need to be reconsidered. Reconfigurable architecture (reconfigur...

Blocking and backward blocking involve learned inattention

Journal: Psychonomic Bulletin & Review 2000

Scalable Graph Convolutional Network Training on Distributed-Memory Systems

Journal: Proceedings of the VLDB Endowment 2022

Graph Convolutional Networks (GCNs) are extensively utilized for deep learning on graphs. The large data sizes of graphs and their vertex features make scalable training algorithms and distributed-memory systems necessary. Since the convolution operation induces irregular access patterns, designing a memory- and communication-efficient parallel algorithm for GCN training poses unique challenges. We propose highly t...

A Clustering Approach to Scientific Workflow Scheduling on the Cloud with Deadline and Cost Constraints

Journal: International Journal of Modeling, Identification, Simulation and Control 2014

Amin Rezaeian, Arash Deldari, Mahmoud Naghibzadeh, Saeid Abrishami

One of the main features of High Throughput Computing systems is the availability of high power processing resources. Cloud Computing systems can offer these features through concepts like Pay-Per-Use and Quality of Service (QoS) over the Internet. Many applications in Cloud computing are represented by workflows. Quality of Service is one of the most important challenges in the context of sche...

Efficient and Practical Non-Blocking Data Structures

2004

Håkan Sundell

This thesis deals with how to design and implement efficient, practical and reliable concurrent data structures. The design method using mutual exclusion incurs serious drawbacks, whereas the alternative non-blocking techniques avoid those problems and also admit improved parallelism. However, designing non-blocking algorithms is a very complex task, and a majority of the algorithms in the lite...

Vector-thread architecture and implementation

2007

Ronny Krashinsky

This thesis proposes vector-thread (VT) architectures as a performance-efficient solution for all-purpose computing. The VT architectural paradigm unifies the vector and multithreaded compute models. VT provides the programmer with a control processor and a vector of virtual processors (VPs). The control processor can use vector-fetch commands to broadcast instructions to all the VPs or each VP can use th...
