Hands-On GPU Programming with Python and CUDA: Explore High-Performance Parallel Computing with CUDA (2018)
- Record Type:
- Book
- Title:
- Hands-On GPU Programming with Python and CUDA: Explore High-Performance Parallel Computing with CUDA (2018)
- Main Title:
- Hands-On GPU Programming with Python and CUDA: Explore High-Performance Parallel Computing with CUDA
- Further Information:
- Note: Brian Tuomanen.
- Authors:
- Tuomanen, Brian
- Contents:
- Cover; Title Page; Copyright and Credits; Dedication; About Packt; Contributors; Table of Contents; Preface
Chapter 1: Why GPU Programming?; Technical requirements; Parallelization and Amdahl's Law; Using Amdahl's Law; The Mandelbrot set; Profiling your code; Using the cProfile module; Summary; Questions
Chapter 2: Setting Up Your GPU Programming Environment; Technical requirements; Ensuring that we have the right hardware; Checking your hardware (Linux); Checking your hardware (Windows); Installing the GPU drivers; Installing the GPU drivers (Linux); Installing the GPU drivers (Windows); Setting up a C++ programming environment; Setting up GCC, Eclipse IDE, and graphical dependencies (Linux); Setting up Visual Studio (Windows); Installing the CUDA Toolkit; Installing the CUDA Toolkit (Linux); Installing the CUDA Toolkit (Windows); Setting up our Python environment for GPU programming; Installing PyCUDA (Linux); Creating an environment launch script (Windows); Installing PyCUDA (Windows); Testing PyCUDA; Summary; Questions
Chapter 3: Getting Started with PyCUDA; Technical requirements; Querying your GPU; Querying your GPU with PyCUDA; Using PyCUDA's gpuarray class; Transferring data to and from the GPU with gpuarray; Basic pointwise arithmetic operations with gpuarray; A speed test; Using PyCUDA's ElementWiseKernel for performing pointwise computations; Mandelbrot revisited; A brief foray into functional programming; Parallel scan and reduction kernel basics; Summary; Questions
Chapter 4: Kernels, Threads, Blocks, and Grids; Technical requirements; Kernels; The PyCUDA SourceModule function; Threads, blocks, and grids; Conway's game of life; Thread synchronization and intercommunication; Using the __syncthreads() device function; Using shared memory; The parallel prefix algorithm; The naive parallel prefix algorithm; Inclusive versus exclusive prefix; A work-efficient parallel prefix algorithm; Work-efficient parallel prefix (up-sweep phase); Work-efficient parallel prefix (down-sweep phase); Work-efficient parallel prefix - implementation; Summary; Questions
Chapter 5: Streams, Events, Contexts, and Concurrency; Technical requirements; CUDA device synchronization; Using the PyCUDA stream class; Concurrent Conway's game of life using CUDA streams; Events; Events and streams; Contexts; Synchronizing the current context; Manual context creation; Host-side multiprocessing and multithreading; Multiple contexts for host-side concurrency; Summary; Questions
Chapter 6: Debugging and Profiling Your CUDA Code; Technical requirements; Using printf from within CUDA kernels; Using printf for debugging; Filling in the gaps with CUDA-C; Using the Nsight IDE for CUDA-C development and debugging; Using Nsight with Visual Studio in Windows; Using Nsight with Eclipse in Linux; Using Nsight to understand the warp lockstep property in CUDA; Using the NVIDIA nvprof profiler and Visual Profiler; Summary; Questions …
- Publisher Details:
- Birmingham : Packt Publishing Ltd
- Publication Date:
- 2018
- Extent:
- 1 online resource (300 p.)
- Subjects:
- 005.275
COMPUTER / General
CUDA (Computer architecture)
Graphics processing units
Python (Computer program language)
Electronic books
- Languages:
- English
- ISBNs:
- 1788995228
9781788995221
- Related ISBNs:
- 1788993918
9781788993913
- Access Rights:
- Legal Deposit; Only available on premises controlled by the deposit library and to one user at any one time; The Legal Deposit Libraries (Non-Print Works) Regulations (UK).
- Access Usage:
- Restricted: Printing from this resource is governed by The Legal Deposit Libraries (Non-Print Works) Regulations (UK) and UK copyright law currently in force.
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library HMNTS - ELD.DS.364357
- Ingest File:
- 02_343.xml