复旦大学:《计算机原理 Computer System》习题PPT课件_chapter5 Optimizing Program Performance

UNID 05-200 Chapter 5 Optimizing Program Performance Guobao Jiang(蒋国宝) 11210240049@fudan.edu.cn loveaborn@foxmail.com
Chapter 5 Optimizing Program Performance Guobao Jiang (蒋国宝) 11210240049@fudan.edu.cn loveaborn@foxmail.com

UN/ Problem 5. 1(P381) The following problem illustrates the way memory aliasing(存储器别名使用) can cause unexpected program behavior. Consider the following procedure to swap two values void swap (int *xp, int *yp) Xp 六 ★ Ⅺp+"yp; 六 X+V* yp=xp -ypi 大 x+y-y=X Ⅺ=x-yp y-×=y f this procedure is called with xp equal to yp what effect will it have 2021/10/29
Problem 5.1 (P381) • The following problem illustrates the way memory aliasing (存储器别名使用) can cause unexpected program behavior. Consider the following procedure to swap two values: void swap(int *xp, int *yp) { *xp = *xp + *yp; /* x+y */ *yp = *xp - *yp; /* x+y-y = x */ *xp = *xp - *yp; /* x+y-x = y */ } If this procedure is called with xp equal to yp, what effect will it have ? 2021/10/29 2

UN/ Problem 5.2(P384) Later in this chapter we will take a single function and generate many different variants that preserve the function s behavior but with different performance characteristics. For three of these variants we found that the run times(in clock cycles)can be approximated by y the following functions: Version 1 60+ 35n · Version2136+4n · Version3157+1.25n For what values of n would each version be the fastest of the three? Remember that n wil always be an integer 2021/10/29
Problem 5.2 (P384) • Later in this chapter we will take a single function and generate many different variants that preserve the function’s behavior, but with different performance characteristics. For three of these variants, we found that the run times (in clock cycles) can be approximated by the following functions: • Version 1 60 + 35n • Version 2 136 + 4n • Version 3 157 + 1.25n For what values of n would each version be the fastest of the three ? Remember that n will always be an integer. 2021/10/29 3

UN/ H Problem 5.3(P391 Consider the following functions int min(int x, int y) returnx<y? x: y] int max(int x, int return x<y? y: x void incr(int*xp, int vxp+=v;] nt square(in×) return×*×;} The following three code fragments call these functions 2021/10/29
Problem 5.3 (P391) • Consider the following functions: int min(int x, int y) {return x < y ? x:y;} int max(int x, int y){return x < y ? y:x;} void incr(int *xp, int v){ *xp += v;} int square(int x){return x*x;} • The following three code fragments call these functions: 2021/10/29 4

UN/ ty Problem 5.3+(P391) A. for(i= min(x, y); k=min(x, y): incr (&i, 1) t+= square(i) C. int low min(x,y) int high max(x, y) for (i= low; i< high; incr(&i, 1)) t+= square) 2021/10/29
Problem 5.3+ (P391) A. for(i = min(x, y); i=min(x, y);incr(&i,-1)) t += square(i); C. int low = min(x, y); int high = max(x, y); for (i = low; i < high; incr(&i, 1)) t+= square(i); 2021/10/29 5

UN ty Problem 5.3+(P391) A. for(i= min(x, y) imax(, y); incr (&i, 1) t += square (i C. int low min(x, y): int high max(x,y) for( i= low: i< high; incr &i, 1)) t+= square (: Assume x=10 and y=100. Fill in the table ode min Incr square A 1 91 90 90 B. 91 90 90 90 90 2021/10/29
1 1 90 90 91 1 90 90 1 91 90 90 Problem 5.3+ (P391) A. for(i = min(x, y); i<max(x, y); incr(&i,1)) t += square(i); C. int low = min(x, y); int high = max(x, y); for (i = low; i < high; incr(&i, 1)) t+= square(i); Assume x=10 and y=100. Fill in the table: 2021/10/29 6 Code min max incr square A. B. C

UN/ Problem 5. 4(P415 At times, Gcc does its own…i· Question L6 Write C code for a addl (%eax), %edx add4(%ea×),%edx procedure combine5px8 addl 8 ( %eax), %edx that shows how pointers loop variables, and addl 12(%eax), %edx termination conditions are ad16(%ea×);%edx being computed by this add2o(‰ea×),%edx code. show the general addl 24(%eax), % edx form with arbitrary data addl 28 (%eax), % edx addl $32,%eax and combining operation in i the style of figure 5. 19 addl 8,ecx (P392). Describe how计 cmpl %esi, %ecx differs form our j.L6 handwritten pointer code (Figure 5. 22) 2021/10/29
Problem 5.4 (P415) • At times, GCC does its own … .L6 addl (%eax), %edx addl 4(%eax),%edx addl 8(%eax),%edx addl 12(%eax),%edx addl 16(%eax),%edx addl 20(%eax),%edx addl 24(%eax),%edx addl 28(%eax),%edx addl $32,%eax addl $ 8,%ecx cmpl %esi, %ecx jl .L6 2021/10/29 7 • Question: Write C code for a procedure combine5px8 that shows how pointers, loop variables, and termination conditions are being computed by this code. Show the general form with arbitrary data and combining operation in the style of Figure 5.19 (P392). Describe how it differs form our handwritten pointer code (Figure 5.22)

UN/ Problem 5.5( P421 The following shows the code generated from a variant of combine(P416)that uses eight-way loop unrolling and four-way parallelism L152 addl (%eax), %ecx add 4(%eax), %esi add 8(%eax), %edi Questions: addl 12(%eax), %ebx addl 16(%eax),%ecx:A. What program variable has addl 20(%eax), %esi being spilled onto the stack ? addl 24(%eax), /edi B. At what location on the addl28(%ea×),%ebx addI 32 , %eax stack? addI 8, %edx C. Why is this a good choice cmp|-8(‰ebp),%edx of which value to spill? jI. L152 2021/10/29
Problem 5.5 (P421) • The following shows the code generated from a variant of combine6(P416) that uses eight-way loop unrolling and four-way parallelism. .L152 addl (%eax), %ecx addl 4(%eax), %esi addl 8(%eax), %edi addl 12(%eax), %ebx addl 16(%eax), %ecx addl 20(%eax), %esi addl 24(%eax), %edi addl 28(%eax), %ebx addl $32,%eax addl $ 8,%edx cmpl -8(%ebp), %edx jl .L152 2021/10/29 8 • Questions: A. What program variable has being spilled onto the stack? B. At what location on the stack? C. Why is this a good choice of which value to spill ?

UN/ Problem 5.6(P422) Consider the following function for computing the product of an array of n integers we have unrolled the loop by a factor of 3 int aprod(int al], int n) Int 1, X, y, Z: int r= 1, fo(=0:i<n-2;i+3) 三a[i:y=a[+1]:z=a[i+2] r=r*x*y'z;/*Product computation?/ for ci<n: i++) r x=a[i] return ri 2021/10/29
Problem 5.6 (P422) • Consider the following function for computing the product of an array of n integers. We have unrolled the loop by a factor of 3. int aprod(int a[], int n) { int i, x, y, z; int r = 1; for (i = 0; i < n-2; i += 3 ){ x = a[i]; y=a[i+1]; z=a[i+2]; r = r*x*y*z; /*Product computation*/ } for (; i < n; i++) r *= a[i]; return r; } 2021/10/29 9

UN/ H Problem 5.6+(P422) For the line labeled Product computation, we can use parentheses to create five different associations of the computation as follows: (r*×)*y)*z;A1*/ r=(*(X*y)*z;A2*/ r=r*(X*y)*2)/A3* r=r*(x*(y*z):/A4*/ ×)*(y*z):/"A5*/ Recall from Figure 5. 12 that the integer multiplication operation on this machine has a latency of 4 cycles and an issue time of 1 cycle 2021/10/29
Problem 5.6+ (P422) • For the line labeled Product computation, we can use parentheses to create five different associations of the computation, as follows: r = ((r * x) * y) * z; /* A1 */ r = (r * (x * y)) * z; /* A2 */ r = r * ((x * y) * z); /* A3 */ r = r * (x * (y * z)); /* A4 */ r = (r * x) * (y * z); /* A5 */ • Recall from Figure 5.12 that the integer multiplication operation on this machine has a latency of 4 cycles and an issue time of 1 cycle. 2021/10/29 10
按次数下载不扣除下载券;
注册用户24小时内重复下载只扣除一次;
顺序:VIP每日次数-->可用次数-->下载券;
- 复旦大学:《计算机原理 Computer System》习题PPT课件_chapter4 Processor Architecture.pptx
- 复旦大学:《计算机原理 Computer System》习题PPT课件_Chapter 3 Machine-Level(2)Representation of Programs.ppt
- 复旦大学:《计算机原理 Computer System》习题PPT课件_Chapter 3 Machine-Level Representation of Programs.pptx
- 复旦大学:《计算机原理 Computer System》习题PPT课件_Chapter 3 Machine-Level Representation of Programs.pptx
- 复旦大学:《计算机原理 Computer System》习题PPT课件_chapter2.pptx
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Linking II(• Static linking • Symbols & Symbol Table • Relocation • Executable Object Files • Loading).ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Dynamic Memory Allocation(• Implementation of a simple allocator • Explicit Free List • Segregated Free List).ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Virtual Memory(• Multilevel page tables • Different points of view • Pentium/Linux Memory System • Memory Mapping).ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Virtual Memory(• Virtual Space• Address translation • Accelerating translation• Different points of view).ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Cache Memory(• Cache mountain • Matrix multiplication).ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Cache Memory(• General concepts • 3 ways to organize cache memory • Issues with writes • Write cache friendly codes • Cache mountain).ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Memory Hierarchy(• Random-Access Memory(RAM)• Nonvolatile Memory • Disk Storage • Locality • Memory hierarchy).ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Hardware Organization.ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_13 Code Optimization(• Optimizing Blockers • Understanding Modern Processor • More Code Optimization techniques • Performance Tuning).ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_12b Code Optimization(• Machine-Independent Optimization – Code motion – Memory optimization • Suggested reading).ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Pipelined Implementation Part II.ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Pipelined Implementation Part I.ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_09、10 Sequential CPU Implementation.ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Processor Architecture.ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Heterogeneous Data Structures & Alignment; Putting it Together; Floating Point.ppt
- 复旦大学:《计算机原理 Computer System》习题PPT课件_chapter6 The Memory Hierarchy.ppt
- 复旦大学:《计算机网络与网页制作》课程教学大纲 Computer Network and Webpage Design.pdf
- 《当代教育理论与实践》论文:大学计算机基础教学实践与思考(复旦大学:肖川、张向东).pdf
- 复旦大学:《计算机网络与网页制作》课程PPT教学课件(讲稿)01 计算机网络基础.pptx
- 复旦大学:《计算机网络与网页制作》课程PPT教学课件(讲稿)02 两类基本网络(局域网、无线局域网).pptx
- 复旦大学:《计算机网络与网页制作》课程PPT教学课件(讲稿)03 因特网基础知识.pptx
- 复旦大学:《计算机网络与网页制作》课程PPT教学课件(讲稿)04 传统的Internet服务.pptx
- 复旦大学:《计算机网络与网页制作》课程PPT教学课件(讲稿)05 新型的Internet应用.pptx
- 复旦大学:《计算机网络与网页制作》课程PPT教学课件(讲稿)06 Internet安全.pptx
- 复旦大学:《计算机网络与网页制作》课程PPT教学课件(讲稿)07 Dreamweaver CS5入门.pptx
- 复旦大学:《计算机网络与网页制作》课程PPT教学课件(讲稿)08 创建一个新站点.pptx
- 复旦大学:《计算机网络与网页制作》课程PPT教学课件(讲稿)09 添加文本和图像.pptx
- 复旦大学:《计算机网络与网页制作》课程PPT教学课件(讲稿)10 用CSS设定页面样式.pptx
- 复旦大学:《计算机网络与网页制作》课程PPT教学课件(讲稿)11 用CSS作页面布局.pptx
- 复旦大学:《计算机网络与网页制作》课程PPT教学课件(讲稿)12 页面布局高级技术.pptx
- 复旦大学:《计算机网络与网页制作》课程PPT教学课件(讲稿)13 使用表格.pptx
- 复旦大学:《计算机网络与网页制作》课程PPT教学课件(讲稿)14 添加动画、视频和声音.pptx
- 复旦大学:《计算机网络与网页制作》课程PPT教学课件(讲稿)15 使用模块化技术加速网页制作.pptx
- 复旦大学:《计算机网络与网页制作》课程PPT教学课件(讲稿)16 构建网页表单.pptx
- 复旦大学:《计算机网络与网页制作》课程PPT教学课件(讲稿)17 使用Spry组件.pptx