Fudan University: "Computer System" course slides, 13 Code Optimization (• Optimizing Blockers • Understanding Modern Processor • More Code Optimization techniques • Performance Tuning)

Code Optimization

Outline
• Optimizing Blockers
  – Memory alias
  – Side effect in function call
• Understanding Modern Processor
  – Super-scalar
  – Out-of-order execution
• More Code Optimization techniques
• Performance Tuning
• Suggested reading
  – 5.1, 5.7 ~ 5.16

5.1 Capabilities and Limitations of Optimizing Compilers
Review on
  – 5.3 Program Example
  – 5.4 Eliminating Loop Inefficiencies
  – 5.5 Reducing Procedure Calls
  – 5.6 Eliminating Unneeded Memory References

Example P387

    void combine1(vec_ptr v, data_t *dest)
    {
        int i;

        *dest = IDENT;
        for (i = 0; i < vec_length(v); i++) {
            int val;
            get_vec_element(v, i, &val);
            *dest = *dest OPER val;
        }
    }

Example P388

    void combine2(vec_ptr v, int *dest)
    {
        int i;
        int length = vec_length(v);   /* hoisted out of the loop test */

        *dest = IDENT;
        for (i = 0; i < length; i++) {
            int val;
            get_vec_element(v, i, &val);
            *dest = *dest OPER val;
        }
    }
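The move from combine1 to combine2 relates to the "side effect in function call" blocker in the outline: a compiler generally cannot hoist vec_length out of the loop test unless it can prove the call has no side effects. The following is a minimal sketch of that situation, not the course's actual code. The vec_rec layout, the length_calls counter, and the count_length_calls helper are assumptions made for illustration; the slides do not show how vec_ptr is defined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal vector type (the slides do not define vec_ptr). */
typedef struct {
    int len;
    int *data;
} vec_rec, *vec_ptr;

#define IDENT 0
#define OPER +

/* Illustration only: a visible side effect inside vec_length. */
static long length_calls = 0;

int vec_length(vec_ptr v)
{
    assert(v != NULL);
    length_calls++;          /* side effect: the call count is observable */
    return v->len;
}

/* Bounds-checked read; returns 0 if index is out of range, 1 otherwise. */
int get_vec_element(vec_ptr v, int index, int *dest)
{
    if (index < 0 || index >= v->len)
        return 0;
    *dest = v->data[index];
    return 1;
}

/* combine1 style: vec_length is evaluated in every loop test. */
void combine1(vec_ptr v, int *dest)
{
    int i;
    *dest = IDENT;
    for (i = 0; i < vec_length(v); i++) {
        int val;
        get_vec_element(v, i, &val);
        *dest = *dest OPER val;
    }
}

/* combine2 style: the programmer hoists vec_length by hand. */
void combine2(vec_ptr v, int *dest)
{
    int i;
    int length = vec_length(v);
    *dest = IDENT;
    for (i = 0; i < length; i++) {
        int val;
        get_vec_element(v, i, &val);
        *dest = *dest OPER val;
    }
}

/* Runs one combine function and reports how many vec_length calls it made. */
long count_length_calls(void (*combine)(vec_ptr, int *), vec_ptr v, int *dest)
{
    length_calls = 0;
    combine(v, dest);
    return length_calls;
}
```

For a five-element vector, count_length_calls reports 6 calls for combine1 (one per element plus the final failing loop test) and 1 call for combine2, while both produce the same sum. Because the call count is observable, replacing six calls with one changes program behavior, which is why the hoisting is left to the programmer in this sketch.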

Example P392

    void combine3(vec_ptr v, int *dest)
    {
        int i;
        int length = vec_length(v);
        int *data = get_vec_start(v);   /* direct access to the element array */

        *dest = IDENT;
        for (i = 0; i < length; i++) {
            *dest = *dest OPER data[i];
        }
    }

Example P394

    void combine4(vec_ptr v, int *dest)
    {
        int i;
        int length = vec_length(v);
        int *data = get_vec_start(v);
        int x = IDENT;                  /* accumulate in a local temporary */

        for (i = 0; i < length; i++)
            x = x OPER data[i];
        *dest = x;
    }

Machine Independent Opt. Results
• Optimizations
  – Reduce function calls and memory references within loop

Machine Independent Opt. Results

Method            Function   Page   Integer +   Integer *   FP +     FP *
Abstract -g       Combine1   P385   42.06       41.86       41.44    160.00
Abstract -O2      Combine1   P385   31.25       33.25       31.25    143.00
Move vec_length   Combine2   P388   22.61       21.25       21.15    135.00
data access       Combine3   P392    6.00        9.00        8.00    117.00
Accum. in temp    Combine4   P394    2.00        4.00        3.00      5.00

• Performance Anomaly
  – Computing the FP product of all elements is exceptionally slow
  – Very large speedup when accumulating in a temporary
  – Memory uses a 64-bit format; registers use 80 bits
  – Benchmark data caused overflow of 64 bits, but not 80
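The last two bullets can be illustrated numerically. The sketch below is an assumption-laden illustration, not the benchmark used in the slides: the function names and the input values are hypothetical, and it reproduces only the difference in results, not the timing. It also assumes long double is wider than double, as with the 80-bit x87 format; on platforms where the two types are the same size, both functions overflow.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* combine3 style: the running product lives in memory as a 64-bit double.
   volatile forces a real store and reload on every iteration. */
void product_in_memory(const double *data, int n, volatile double *dest)
{
    int i;
    assert(n >= 0);
    *dest = 1.0;
    for (i = 0; i < n; i++)
        *dest = *dest * data[i];   /* rounded to double after each multiply */
}

/* combine4 style: the running product lives in a wider temporary.
   Assumption: long double has a larger exponent range than double. */
double product_in_temp(const double *data, int n)
{
    int i;
    long double x = 1.0L;
    assert(n >= 0);
    for (i = 0; i < n; i++)
        x = x * data[i];
    return (double)x;              /* narrowed to double only once, at the end */
}
```

With the hypothetical input {1e200, 1e200, 1e-200, 1e-200}, the intermediate value 1e400 exceeds the range of a 64-bit double, so product_in_memory ends at infinity. Where long double has the wider range, product_in_temp holds 1e400 without overflow and returns a value close to 1.0.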

Optimization Blockers P394

    void combine4(vec_ptr v, int *dest)
    {
        int i;
        int length = vec_length(v);
        int *data = get_vec_start(v);
        int sum = 0;

        for (i = 0; i < length; i++)
            sum += data[i];
        *dest = sum;
    }
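Memory aliasing, the first blocker in the outline, is the reason a compiler cannot safely turn the combine3 form into this combine4 form on its own: if dest points into the vector, the two versions compute different results. The sketch below shows this on raw arrays. The function names sum_through_dest and sum_in_temp are hypothetical stand-ins for the combine3 and combine4 loop bodies, simplified to avoid the vec_ptr type.

```c
#include <assert.h>
#include <stddef.h>

/* combine3 style: reads and writes *dest on every iteration. */
void sum_through_dest(int *data, int length, int *dest)
{
    int i;
    assert(data != NULL && dest != NULL);
    *dest = 0;
    for (i = 0; i < length; i++)
        *dest = *dest + data[i];   /* may modify data[] if dest aliases it */
}

/* combine4 style: accumulates in a local, writes *dest once at the end. */
void sum_in_temp(int *data, int length, int *dest)
{
    int i;
    int sum = 0;
    assert(data != NULL && dest != NULL);
    for (i = 0; i < length; i++)
        sum += data[i];
    *dest = sum;
}
```

Take the array {2, 3, 4} with dest pointing at its last element. sum_through_dest first zeroes that element, then accumulates 0 → 2 → 5 → 10, since the final step adds the element to itself. sum_in_temp reads the original 4 and stores 9. Since the compiler usually cannot rule out such a call, it must keep the memory references of the combine3 form.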