Advanced Numerical Analysis (High-Performance Computing / Parallel Computing) course teaching resources (reference material): OpenMP Application Programming Interface Examples, Version 4.0.2

OpenMP Application Programming Interface Examples
Version 4.0.2 – March 2015

Source codes for OpenMP 4.0.2 Examples can be downloaded from GitHub.

Copyright © 1997-2015 OpenMP Architecture Review Board. Permission to copy without fee all or part of this material is granted, provided the OpenMP Architecture Review Board copyright notice and the title of this document appear. Notice is given that copying is by permission of OpenMP Architecture Review Board.

Contents

1 A Simple Parallel Loop
2 The OpenMP Memory Model
3 Conditional Compilation
4 Internal Control Variables (ICVs)
5 The parallel Construct
6 Controlling the Number of Threads on Multiple Nesting Levels
7 Interaction Between the num_threads Clause and omp_set_dynamic
8 The proc_bind Clause
    8.1 Spread Affinity Policy
    8.2 Close Affinity Policy
    8.3 Master Affinity Policy
9 Fortran Restrictions on the do Construct
10 Fortran Private Loop Iteration Variables
11 The nowait Clause
12 The collapse Clause
13 The parallel sections Construct
14 The firstprivate Clause and the sections Construct
15 The single Construct

16 The task and taskwait Constructs
17 Task Dependences
    17.1 Flow Dependence
    17.2 Anti-dependence
    17.3 Output Dependence
    17.4 Concurrent Execution with Dependences
    17.5 Matrix multiplication
18 The taskgroup Construct
19 The taskyield Construct
20 The workshare Construct
21 The master Construct
22 The critical Construct
23 Worksharing Constructs Inside a critical Construct
24 Binding of barrier Regions
25 The atomic Construct
26 Restrictions on the atomic Construct
27 The flush Construct without a List
28 Placement of flush, barrier, taskwait and taskyield Directives
29 The ordered Clause and the ordered Construct
30 Cancellation Constructs
31 The threadprivate Directive
32 Parallel Random Access Iterator Loop
33 Fortran Restrictions on shared and private Clauses with Common Blocks

34 The default(none) Clause
35 Race Conditions Caused by Implied Copies of Shared Variables in Fortran
36 The private Clause
37 Fortran Restrictions on Storage Association with the private Clause
38 C/C++ Arrays in a firstprivate Clause
39 The lastprivate Clause
40 The reduction Clause
41 The copyin Clause
42 The copyprivate Clause
43 Nested Loop Constructs
44 Restrictions on Nesting of Regions
45 The omp_set_dynamic and omp_set_num_threads Routines
46 The omp_get_num_threads Routine
47 The omp_init_lock Routine
48 Ownership of Locks
49 Simple Lock Routines
50 Nestable Lock Routines
51 SIMD Constructs
52 target Construct
    52.1 target Construct on parallel Construct
    52.2 target Construct with map Clause

    52.3 map Clause with to/from map-types
    52.4 map Clause with Array Sections
    52.5 target Construct with if Clause
53 target data Construct
    53.1 Simple target data Construct
    53.2 target data Region Enclosing Multiple target Regions
    53.3 target data Construct with Orphaned Call
    53.4 target data Construct with if Clause
54 target update Construct
    54.1 Simple target data and target update Constructs
    54.2 target update Construct with if Clause
55 declare target Construct
    55.1 declare target and end declare target for a Function
    55.2 declare target Construct for Class Type
    55.3 declare target and end declare target for Variables
    55.4 declare target and end declare target with declare simd
56 teams Constructs
    56.1 target and teams Constructs with omp_get_num_teams and omp_get_team_num Routines
    56.2 target, teams, and distribute Constructs
    56.3 target teams, and Distribute Parallel Loop Constructs
    56.4 target teams and Distribute Parallel Loop Constructs with Scheduling Clauses
    56.5 target teams and distribute simd Constructs
    56.6 target teams and Distribute Parallel Loop SIMD Constructs
57 Asynchronous Execution of a target Region Using Tasks
58 Array Sections in Device Constructs
59 Device Routines
    59.1 omp_is_initial_device Routine
    59.2 omp_get_num_devices Routine

    59.3 omp_set_default_device and omp_get_default_device Routines
60 Fortran ASSOCIATE Construct
A Document Revision History
    A.1 Changes from 4.0.1 to 4.0.2
    A.2 Changes from 4.0 to 4.0.1
    A.3 Changes from 3.1 to 4.0

Introduction

This collection of programming examples supplements the OpenMP API for Shared Memory Parallelization specifications, and is not part of the formal specifications. It assumes familiarity with the OpenMP specifications, and shares the typographical conventions used in that document.

Note – This first release of the OpenMP Examples reflects the OpenMP Version 4.0 specifications. Additional examples are being developed and will be published in future releases of this document.

The OpenMP API specification provides a model for parallel programming that is portable across shared memory architectures from different vendors. Compilers from numerous vendors support the OpenMP API.

The directives, library routines, and environment variables demonstrated in this document allow users to create and manage parallel programs while permitting portability. The directives extend the C, C++ and Fortran base languages with single program multiple data (SPMD) constructs, tasking constructs, device constructs, worksharing constructs, and synchronization constructs, and they provide support for sharing and privatizing data. The functionality to control the runtime environment is provided by library routines and environment variables. Compilers that support the OpenMP API often include a command line option to the compiler that activates and allows interpretation of all OpenMP directives.

The latest source codes for OpenMP Examples can be downloaded from the sources directory at https://github.com/OpenMP/Examples. The codes for this OpenMP 4.0.2 Examples document have the tag v4.0.2.

Complete information about the OpenMP API and a list of the compilers that support the OpenMP API can be found at the OpenMP.org web site

http://www.openmp.org
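As a rough illustration of how a directive, a library routine, and the compiler option mentioned above fit together, the following minimal sketch may help. The function name team_size is hypothetical and not part of this document's examples. The sketch assumes a compiler that defines the _OPENMP macro when OpenMP is enabled (for instance through GCC's -fopenmp option); without that option the pragmas are ignored and the code runs serially.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>   /* library routines; only included when OpenMP is enabled */
#endif

/* Hypothetical helper: returns the number of threads that executed the
 * parallel region. Without OpenMP support the pragmas are ignored and
 * the function returns 1. */
int team_size(void)
{
    int nthreads = 1;

    #pragma omp parallel          /* directive: start a team of threads */
    {
        #pragma omp single        /* exactly one thread records the size */
        {
#ifdef _OPENMP
            nthreads = omp_get_num_threads();   /* library routine */
#endif
        }
    }
    return nthreads;
}
```

When OpenMP is enabled, the value returned normally follows the OMP_NUM_THREADS environment variable if it is set, which is one way the runtime environment can be controlled without recompiling.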

Examples

The following are examples of the OpenMP API directives, constructs, and routines.

C/C++: A statement following a directive is compound only when necessary, and a non-compound statement is indented with respect to a directive preceding it.

CHAPTER 1
A Simple Parallel Loop

The following example demonstrates how to parallelize a simple loop using the parallel loop construct. The loop iteration variable is private by default, so it is not necessary to specify it explicitly in a private clause.

Example ploop.1c (C/C++)

    void simple(int n, float *a, float *b)
    {
        int i;

        #pragma omp parallel for
        for (i=1; i<n; i++) /* i is private by default */
            b[i] = (a[i] + a[i-1]) / 2.0;
    }

Example ploop.1f (Fortran)

          SUBROUTINE SIMPLE(N, A, B)

          INTEGER I, N
          REAL B(N), A(N)

    !$OMP PARALLEL DO !I is private by default
          DO I=2,N
              B(I) = (A(I) + A(I-1)) / 2.0
          ENDDO
    !$OMP END PARALLEL DO

          END SUBROUTINE SIMPLE
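One way to exercise the routine is sketched below. The helper check_simple is hypothetical and not part of the official examples; it fills a with the values 0, 1, 2, ..., calls simple, and compares the result against a serial recomputation. The loop parallelizes safely because iteration i writes only b[i] and reads only a, so no iteration depends on another.

```c
#include <stdlib.h>

/* The routine from Example ploop.1c, repeated so the sketch is self-contained. */
void simple(int n, float *a, float *b)
{
    int i;

    #pragma omp parallel for
    for (i = 1; i < n; i++)   /* i is private by default */
        b[i] = (a[i] + a[i-1]) / 2.0;
}

/* Hypothetical helper: returns the number of elements of b that differ
 * from a serial recomputation, or -1 if memory allocation fails. */
int check_simple(int n)
{
    float *a = malloc((size_t)n * sizeof *a);
    float *b = malloc((size_t)n * sizeof *b);
    int i, mismatches = 0;

    if (a == NULL || b == NULL) {
        free(a);
        free(b);
        return -1;
    }

    for (i = 0; i < n; i++)
        a[i] = (float)i;

    simple(n, a, b);

    /* Serial check using the same expression as the parallel loop. */
    for (i = 1; i < n; i++)
        if (b[i] != (a[i] + a[i-1]) / 2.0)
            mismatches++;

    free(a);
    free(b);
    return mismatches;
}
```

Note that b[0] is never written by simple, which is why the check starts at index 1.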

CHAPTER 2
The OpenMP Memory Model

In the following example, at Print 1, the value of x could be either 2 or 5, depending on the timing of the threads, and the implementation of the assignment to x. There are two reasons that the value at Print 1 might not be 5. First, Print 1 might be executed before the assignment to x is executed. Second, even if Print 1 is executed after the assignment, the value 5 is not guaranteed to be seen by thread 1 because a flush may not have been executed by thread 0 since the assignment.

The barrier after Print 1 contains implicit flushes on all threads, as well as a thread synchronization, so the programmer is guaranteed that the value 5 will be printed by both Print 2 and Print 3.

Example mem_model.1c (C/C++)

    #include <stdio.h>
    #include <omp.h>

    int main(){
        int x;

        x = 2;
        #pragma omp parallel num_threads(2) shared(x)
        {

            if (omp_get_thread_num() == 0) {
                x = 5;
            } else {
                /* Print 1: the following read of x has a race */
                printf("1: Thread# %d: x = %d\n", omp_get_thread_num(),x );
            }

            #pragma omp barrier

            if (omp_get_thread_num() == 0) {
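The listing above is cut off after the barrier. The sketch below is a hedged reconstruction of the whole pattern, not the official source: the part after the barrier is inferred from the prose (Print 2 on thread 0, Print 3 on the other thread), and the function run_mem_model, its team parameter, and the serial fallback for omp_get_thread_num are assumptions added so the behavior can be checked with or without OpenMP enabled.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Assumed fallback so the sketch also builds without OpenMP (one thread). */
static int omp_get_thread_num(void) { return 0; }
#endif

/* Hypothetical wrapper: runs the memory-model pattern and returns how many
 * threads read x == 5 after the barrier. *team receives the number of
 * threads that took part in the region. */
int run_mem_model(int *team)
{
    int x = 2;
    int saw_five = 0, nthreads = 0;

    #pragma omp parallel num_threads(2) shared(x) reduction(+:saw_five, nthreads)
    {
        int tid = omp_get_thread_num();

        if (tid == 0) {
            x = 5;
        } else {
            /* Print 1: this read of x races with the write above,
             * so it may show 2 or 5. */
            printf("1: Thread# %d: x = %d\n", tid, x);
        }

        /* Implicit flush on all threads plus synchronization. */
        #pragma omp barrier

        /* Print 2 (thread 0) and Print 3 (other thread): both see 5. */
        printf("%d: Thread# %d: x = %d\n", tid == 0 ? 2 : 3, tid, x);

        saw_five += (x == 5);
        nthreads += 1;
    }

    *team = nthreads;
    return saw_five;
}
```

Every thread that passes the barrier should count toward the return value, so the result equals the team size, while the output of Print 1 may legitimately vary from run to run.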