Transparent Artificial Intelligence (AI) systems facilitate understanding of the decision-making process and open up a range of opportunities for explaining AI models.
This book introduces readers to emerging persistent memory (PM) technologies that promise the performance of dynamic random-access memory (DRAM) with the durability of traditional storage media, such as hard disks and solid-state drives (SSDs).
This book is intended for designers experienced in traditional (clocked) circuit design who seek information about asynchronous circuit design in order to determine whether adopting asynchronous methodologies would be advantageous in their next design project.
This book constitutes the proceedings of the 28th International Conference on Parallel and Distributed Computing, Euro-Par 2022, held in Glasgow, UK, in August 2022.
This book constitutes the refereed proceedings of the 14th International Conference on Reversible Computation, RC 2022, which was held in Urbino, Italy, during July 5-6, 2022.
This book constitutes the refereed proceedings of the 37th International Conference on High Performance Computing, ISC High Performance 2022, held in Hamburg, Germany, during May 29 - June 2, 2022.
This book constitutes revised selected papers of the 8th Latin American High Performance Computing Conference, CARLA 2021, held in Guadalajara, Mexico, in October 2021.
As the number of cores on a chip continues to climb, architects will need to address both bandwidth and power consumption issues related to the interconnection network.
This book provides a thorough overview of state-of-the-art field-programmable gate array (FPGA)-based robotic computing accelerator designs and summarizes the optimization techniques they adopt.
This book targets computer scientists and engineers who are familiar with concepts in classical computer systems but are curious to learn the general architecture of quantum computing systems.
With growing interest in computer security and the protection of the code and data that execute on commodity computers, the number of hardware security features in today's processors has increased significantly in recent years.
Originally developed to support video games, graphics processor units (GPUs) are now increasingly used for general-purpose (non-graphics) applications ranging from machine learning to cryptocurrency mining.
This book provides computer engineers, academic researchers, new graduate students, and seasoned practitioners with an end-to-end overview of virtual memory.
This book describes deep learning systems: the algorithms, compilers, and processor components to efficiently train and deploy deep learning models for commercial applications.
This historical survey of parallel processing from 1980 to 2020 is a follow-up to the authors' 1981 Tutorial on Parallel Processing, which covered the state of the art in hardware, programming languages, and applications.
Most emerging applications in imaging and machine learning must perform immense amounts of computation while holding to strict limits on energy and power.
This book focuses on the core question of the architectural support that hardware must provide to run virtual machines efficiently, and on the corresponding design of the hypervisors that run them.
This synthesis lecture presents the current state of the art in applying low-latency, lossless hardware compression algorithms to cache, memory, and the memory/cache link.
Hardware acceleration in the form of customized datapath and control circuitry tuned to specific applications has gained popularity for its promise to utilize transistors more efficiently.
This book aims to achieve the following goals: (1) to provide a high-level survey of key analytics models and algorithms without going into mathematical details; (2) to analyze the usage patterns of these models; and (3) to discuss opportunities for accelerating analytics workloads using software, hardware, and system approaches.
Since the end of Dennard scaling in the early 2000s, improving the energy efficiency of computation has been the main concern of the research community and industry.
Emerging three-dimensional (3D) chip architectures, with their intrinsic capability to reduce wire length, promise attractive solutions for reducing interconnect delay in future microprocessors.
Having hit power limits that rule out even more aggressive out-of-order execution in processor cores, many architects in the past decade have turned to single-instruction-multiple-data (SIMD) execution to increase single-threaded performance.
As Moore's Law and Dennard scaling trends have slowed, the challenges of building high-performance computer architectures while maintaining acceptable power efficiency levels have heightened.
Since the 1970s, microprocessor-based digital platforms have been riding Moore's law, allowing density for the same area to double roughly every two years.
Shrinking feature size and diminishing supply voltage are making circuits sensitive to supply voltage fluctuations within the microprocessor, caused by normal workload activity changes.
Multithreaded architectures now appear across the entire range of computing devices, from the highest-performing general purpose devices to low-end embedded processors.
General-purpose graphics processing units (GPGPUs) have emerged as an important class of shared-memory parallel processing architectures, with widespread deployment in every computer class from high-end supercomputers to embedded mobile platforms.
As conventional memory technologies such as DRAM and Flash run into scaling challenges, architects and system designers are forced to look at alternative technologies for building future computer systems.
A key determinant of overall system performance and power dissipation is the cache hierarchy, since accessing off-chip memory consumes many more cycles and far more energy than accessing on-chip memory.
Dynamic binary modification tools form a software layer between a running application and the underlying operating system, providing the powerful opportunity to inspect and potentially modify every user-level guest application instruction that executes.