OpenACC Programming and Best Practices Guide
Introduction
    Writing Portable Code
    What is OpenACC?
Accelerating an Application with OpenACC
    OpenACC Directive Syntax
    Porting Cycle
    Heterogeneous Computing Best Practices
    Case Study - Jacobi Iteration
Assess Application Performance
    Baseline Profiling
    Additional Profiling
    Case Study - Analysis
Parallelize Loops
    The Kernels Construct
    The Parallel Construct
    Differences Between Parallel and Kernels
    The Loop Construct
    Routine Directive
    Atomic Operations
    Case Study - Parallelize
Optimize Data Locality
    Data Regions
    Data Clauses
    Unstructured Data Lifetimes
    Update Directive
    Best Practice: Offload Inefficient Operations to Maintain Data Locality
    Case Study - Optimize Data Locality
Optimize Loops
    Efficient Loop Ordering
    OpenACC’s 3 Levels of Parallelism
    Mapping Parallelism to the Hardware
    Collapse Clause
    Routine Parallelism
    Case Study - Optimize Loops
OpenACC Interoperability
    The Host Data Region
    Using Device Pointers
    Obtaining Device and Host Pointer Addresses
    Additional Vendor-Specific Interoperability Features
Advanced OpenACC Features
    Asynchronous Operation
    Multi-device Programming