复旦大学:《计算机原理 Computer System》课程PPT课件_Cache Memory(• Cache mountain • Matrix multiplication)

Cache Memory
1 Cache Memory

Outline Cache mountain Matrix multiplication Suggested Reading: 6.6, 6.7
2 Outline • Cache mountain • Matrix multiplication • Suggested Reading: 6.6, 6.7

6.6 Putting it Together: The Impact of Caches on Program Performance 6.6.1 The Memory Mountain
3 6.6 Putting it Together: The Impact of Caches on Program Performance 6.6.1 The Memory Mountain

The memory Mountain P512 Read throughput (read bandwidth) The rate that a program reads data from the memory system Memory mountain A two-dimensional function of read bandwidth versus temporal and spatial locality Characterizes the capabilities of the memory system for each computer
4 The Memory Mountain P512 • Read throughput (read bandwidth) – The rate that a program reads data from the memory system • Memory mountain – A two-dimensional function of read bandwidth versus temporal and spatial locality – Characterizes the capabilities of the memory system for each computer

Memory mountain main routine Figure 6.41 P513 / mountain c- Generate the memory mountain. #define minbytes(1 <<10)/working set size ranges from 1 KB*/ #define maxbYtes (1 < 23)/...up to 8 MB*/ #define maxstride 16 / strides range from 1 to 16*/ #define maXelems maXbytes/sizeof(int) int data MAXELEMSI: The array well be traversing
5 Memory mountain main routine Figure 6.41 P513 /* mountain.c - Generate the memory mountain. */ #define MINBYTES (1 << 10) /* Working set size ranges from 1 KB */ #define MAXBYTES (1 << 23) /* ... up to 8 MB */ #define MAXSTRIDE 16 /* Strides range from 1 to 16 */ #define MAXELEMS MAXBYTES/sizeof(int) int data[MAXELEMS]; /* The array we'll be traversing */

Memory mountain main routine int maino int size, / Working set size(in bytes)*/ int stride: / Stride (in array elements)*/ double mhz; / Clock frequency * init data(data, MAXELEMS); /Initialize each element in data to 1 */ Mhz=mhz) / Estimate the clock frequency x
6 Memory mountain main routine int main() { int size; /* Working set size (in bytes) */ int stride; /* Stride (in array elements) */ double Mhz; /* Clock frequency */ init_data(data, MAXELEMS); /* Initialize each element in data to 1 */ Mhz = mhz(0); /* Estimate the clock frequency */

Memory mountain main routine for(size= MAXBYTES; size > MINBYTES; size >>=1)t for(stride= 1; stride<= MAXSTRIDE; stride++) printf( o.Ift",run(size, stride, Mhz)) printf("n; exit(0);
7 Memory mountain main routine for (size = MAXBYTES; size >= MINBYTES; size >>= 1) { for (stride = 1; stride <= MAXSTRIDE; stride++) printf("%.1f\t", run(size, stride, Mhz)); printf("\n"); } exit(0); }

Memory mountain test function Figure 6.40 P512 / The test function * void test (int elems, int stride)i inti result=0 volatile int sink: for(i=0; i< elems; i+= stride) result += datai; sink= result: /*So compiler doesn't optimize away the loop
8 Memory mountain test function Figure 6.40 P512 /* The test function */ void test (int elems, int stride) { int i, result = 0; volatile int sink; for (i = 0; i < elems; i += stride) result += data[i]; sink = result; /* So compiler doesn't optimize away the loop */ }

Memory mountain test function /*Run test(elems, stride) and return read throughput MB/s)*/ double run (int size, int stride, double mhz) double cycles; int elems= size/ sizeof(int) test(elems, stride); / warm up the cache * cycles= fcyc2(test, elems, stride, 0); /*call test(elems, stride)*/ return(size/stride)/(cycles /Mhz); /*convert cycles to MB/s*/
9 Memory mountain test function /* Run test (elems, stride) and return read throughput (MB/s) */ double run (int size, int stride, double Mhz) { double cycles; int elems = size / sizeof(int); test (elems, stride); /* warm up the cache */ cycles = fcyc2(test, elems, stride, 0); /* call test (elems,stride) */ return (size / stride) / (cycles / Mhz); /* convert cycles to MB/s */ }

The memory Mountain Data Size MAXByTES(8M)bytes or MAXELEMS(2M)words Partially accessed Working set: from 8MB to 1KB Stride: from 1 to 16
10 The Memory Mountain • Data – Size • MAXBYTES(8M) bytes or MAXELEMS(2M) words – Partially accessed • Working set: from 8MB to 1KB • Stride: from 1 to 16
按次数下载不扣除下载券;
注册用户24小时内重复下载只扣除一次;
顺序:VIP每日次数-->可用次数-->下载券;
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Cache Memory(• General concepts • 3 ways to organize cache memory • Issues with writes • Write cache friendly codes • Cache mountain).ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Memory Hierarchy(• Random-Access Memory(RAM)• Nonvolatile Memory • Disk Storage • Locality • Memory hierarchy).ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Hardware Organization.ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_13 Code Optimization(• Optimizing Blockers • Understanding Modern Processor • More Code Optimization techniques • Performance Tuning).ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_12b Code Optimization(• Machine-Independent Optimization – Code motion – Memory optimization • Suggested reading).ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Pipelined Implementation Part II.ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Pipelined Implementation Part I.ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_09、10 Sequential CPU Implementation.ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Processor Architecture.ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Heterogeneous Data Structures & Alignment; Putting it Together; Floating Point.ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Procedure Call and Array.ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Machine-Level Representation of Programs Ⅱ.ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Machine-Level Representation of Programs I.ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Integer Operations; Floating Points.ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Integer Representations.ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Introduction to Computer Systems; Information is Bits+Context; Information Storage.ppt
- 复旦大学:《计算机原理 Computer System》课程资源_2006年期中考试题目.doc
- 复旦大学:《计算机原理 Computer System》课程资源_2006年期中考试答案.doc
- 复旦大学:《计算机原理 Computer System》课程资源_教学大纲.pdf
- 复旦大学:《计算机图形学》课后习题答案_7.docx
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Virtual Memory(• Virtual Space• Address translation • Accelerating translation• Different points of view).ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Virtual Memory(• Multilevel page tables • Different points of view • Pentium/Linux Memory System • Memory Mapping).ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Dynamic Memory Allocation(• Implementation of a simple allocator • Explicit Free List • Segregated Free List).ppt
- 复旦大学:《计算机原理 Computer System》课程PPT课件_Linking II(• Static linking • Symbols & Symbol Table • Relocation • Executable Object Files • Loading).ppt
- 复旦大学:《计算机原理 Computer System》习题PPT课件_chapter2.pptx
- 复旦大学:《计算机原理 Computer System》习题PPT课件_Chapter 3 Machine-Level Representation of Programs.pptx
- 复旦大学:《计算机原理 Computer System》习题PPT课件_Chapter 3 Machine-Level Representation of Programs.pptx
- 复旦大学:《计算机原理 Computer System》习题PPT课件_Chapter 3 Machine-Level(2)Representation of Programs.ppt
- 复旦大学:《计算机原理 Computer System》习题PPT课件_chapter4 Processor Architecture.pptx
- 复旦大学:《计算机原理 Computer System》习题PPT课件_chapter5 Optimizing Program Performance.pptx
- 复旦大学:《计算机原理 Computer System》习题PPT课件_chapter6 The Memory Hierarchy.ppt
- 复旦大学:《计算机网络与网页制作》课程教学大纲 Computer Network and Webpage Design.pdf
- 《当代教育理论与实践》论文:大学计算机基础教学实践与思考(复旦大学:肖川、张向东).pdf
- 复旦大学:《计算机网络与网页制作》课程PPT教学课件(讲稿)01 计算机网络基础.pptx
- 复旦大学:《计算机网络与网页制作》课程PPT教学课件(讲稿)02 两类基本网络(局域网、无线局域网).pptx
- 复旦大学:《计算机网络与网页制作》课程PPT教学课件(讲稿)03 因特网基础知识.pptx
- 复旦大学:《计算机网络与网页制作》课程PPT教学课件(讲稿)04 传统的Internet服务.pptx
- 复旦大学:《计算机网络与网页制作》课程PPT教学课件(讲稿)05 新型的Internet应用.pptx
- 复旦大学:《计算机网络与网页制作》课程PPT教学课件(讲稿)06 Internet安全.pptx
- 复旦大学:《计算机网络与网页制作》课程PPT教学课件(讲稿)07 Dreamweaver CS5入门.pptx