《现代计算机体系结构》课程教学课件(留学生版)Lecture 5 Out of Order Execution

ComputerArchitectureOut of Order ExecutionComputerArchitecture
Computer Architecture Computer Architecture Out of Order Execu3on

Tomasulo AlgorithmHistory:In today's perspective:1966: scoreboarding inthe implementationCDC6600,implementingbelongsto reservation-limited dynamicbased dynamic schedulingschedulingThree years later:The merit is the use of tagTomasuloinIBM360/91to identify instructions inintroducing registerthe fly: register renamingrenaming and reservationstationreservation station, tagbroadcastNowappearingintodaysDec Alpha, SGI MIPS, SUNUltraSparc,Intel Pentium,It is an“algorithm"firstIBM PowerPC, and othersComputerArchitecture
Computer Architecture Tomasulo Algorithm History: • 1966: scoreboarding in CDC6600, implemen?ng limited dynamic scheduling • Three years later: Tomasulo in IBM 360/91, introducing register renaming and reserva?on sta?on • Now appearing in todays Dec Alpha, SGI MIPS, SUN UltraSparc, Intel Pen?um, IBM PowerPC, and others In today’s perspec?ve: • the implementa?on belongs to reserva?onbased dynamic scheduling • The merit is the use of tag to iden?fy instruc?ons in the fly: register renaming, reserva?on sta?on, tag broadcast It is an “algorithm” first

TomasuloOrganizationFrom instruction unitInstructionFPregistersqueueLoad-storeoperationsOperandFloating-pointAddressunitbusesoperationsStorebuffersLoadbuffersVOperation bus32121ReservationstationsAddressData?FPmultipliersFP addersMemoryunltCommondata bus (CDB)ComputerArchitecture
Computer Architecture Tomasulo Organiza?on

ReservationStationComponentsOp-Operation to perform in the unit (e.g., + or -)Vi,Vk-ValueofSourceoperands-StorebuffershasVfield,resulttobestoredQi,Qk-Reservationstationsproducingsourceregisters(value to be written)- Note: No ready flags as in Scoreboard; Qj,Qk=0 => ready- Store buffers onlyhave Qifor RS producingresultBusy-lndicates reservation station or FU is busyComputerArchitecture
Computer Architecture Reserva?on Sta?on Components Op—Opera?on to perform in the unit (e.g., + or –) Vj, Vk—Value of Source operands – Store buffers has V field, result to be stored Qj, Qk—Reserva?on sta?ons producing source registers (value to be wri_en) – Note: No ready flags as in Scoreboard; Qj,Qk=0 => ready – Store buffers only have Qi for RS producing result Busy—Indicates reserva?on sta?on or FU is busy

RenamingTableIndicateswhich RSwill write each register,if oneexists. Blank when no pending instructions that willwrite that register.-Called“Registerresult status"in thetext book.ComputerArchitecture
Computer Architecture Renaming Table • Indicates which RS will write each register, if one exists. Blank when no pending instruc?ons that will write that register. – Called “Register result status” in the text book

ThreeStagesof Tomasulo Algorithm1. Issue-get instruction from FP Op QueueIf reservationstationfree (no structural hazard)issues instr & sends operands (renames registers)2. Execution-operate on operands (Ex)Whenbothoperandsreadythenexecute;if not ready,watch Common Data Bus for result3. Write result-finish execution (WB)Write on Common Data Bus to all awaiting units;markreservationstationavailableIssue:builddependencefornewinstWriteback:Wakeupdependent instructionsComputerArchitecture
Computer Architecture Three Stages of Tomasulo Algorithm 1. Issue—get instruc?on from FP Op Queue If reserva?on sta?on free (no structural hazard), issues instr & sends operands (renames registers). 2. Execution—operate on operands (EX) When both operands ready then execute; if not ready, watch Common Data Bus for result 3. Write result—finish execu?on (WB) Write on Common Data Bus to all awai?ng units; mark reserva?on sta?on available Issue: build dependence for new inst Writeback: Wakeup dependent instruc?ons

ILP: Tomasulo Algorithm - An ExampleExplanationsofeachstages,AbigexampleComputerArchitecture
Computer Architecture ILP: Tomasulo Algorithm – An Example Explana?ons of each stages, A big example

ThreeStagesofTomasuloAlgorithm1.Issue-get instructionfromFPOpQueueIf reservation stationfree (no structural hazard)issues instr & sends operands (renames registers).2. Execution-operate on operands (Ex)Whenbothoperands readythenexecute;if not ready, watch Common Data Bus for result3. Write result-finish execution (WB)Write on Common Data Bus to all awaiting units;markreservationstationavailableIssue:builddependencefornewinstWriteback:WakeupdependentinstructionsComputerArchitecture
Computer Architecture Three Stages of Tomasulo Algorithm 1. Issue—get instruc?on from FP Op Queue If reserva?on sta?on free (no structural hazard), issues instr & sends operands (renames registers). 2. Execution—operate on operands (EX) When both operands ready then execute; if not ready, watch Common Data Bus for result 3. Write result—finish execu?on (WB) Write on Common Data Bus to all awai?ng units; mark reserva?on sta?on available Issue: build dependence for new inst Writeback: Wakeup dependent instruc?ons

IssueStageandRenamingTableRenames its two sourceregisters (source renaming)AssignsittoafreeRsUpdates Renaming table (dest renaming). Also decodes the inst and read register values in parallelComputerArchitecture
Computer Architecture Issue Stage and Renaming Table • Renames its two source registers (source renaming) • Assigns it to a free RS • Updates Renaming table (dest renaming) • Also decodes the inst and read register values in parallel

ExecuteStage· Only"ready"instructions can join the competition: There is a select logic to select instructions for FUexecution- Some policy may be used, e.g. age basedNon-ready instructions can be“waken up" duringwriteback of its parent instComputerArchitecture
Computer Architecture Execute Stage • Only “ready” instruc?ons can join the compe??on • There is a select logic to select instruc?ons for FU execu?on – Some policy may be used, e.g. age based • Non-ready instruc?ons can be “waken up” during writeback of its parent inst
按次数下载不扣除下载券;
注册用户24小时内重复下载只扣除一次;
顺序:VIP每日次数-->可用次数-->下载券;
- 武汉理工大学:《模式识别》课程教学资源(PPT课件)第4章 基于统计决策的概率分类法.ppt
- 武汉理工大学:《模式识别》课程教学资源(PPT课件)第1章 绪论、第2章 聚类分析.ppt
- 武汉理工大学:《模式识别》课程教学资源(PPT课件)第3章 判别函数及几何分类法.ppt
- 武汉理工大学:《模式识别》课程教学资源(PPT课件)第7章 模糊模式识别法.ppt
- 武汉理工大学:《模式识别》课程教学资源(PPT课件)第6章 句法模式识别.ppt
- 武汉理工大学:《模式识别》课程教学资源(PPT课件)第5章 特征选择与特征提取.ppt
- 武汉理工大学:《模式识别》课程教学资源(PPT课件)第8章 神经网络模式识别法.ppt
- 武汉理工大学:《模式识别》课程教学资源(实验指导,共五个实验).pdf
- 武汉理工大学:《模式识别》课程授课教案(讲义)第8章 神经网络在模式识别中的应用.pdf
- 武汉理工大学:《模式识别》课程授课教案(讲义)第7章 模糊模式识别.pdf
- 武汉理工大学:《模式识别》课程授课教案(讲义)第6章 特征提取与选择.pdf
- 武汉理工大学:《模式识别》课程授课教案(讲义)第5章 聚类分析.pdf
- 武汉理工大学:《模式识别》课程授课教案(讲义)第4章 非参数判别分类方法.pdf
- 武汉理工大学:《模式识别》课程授课教案(讲义)第3章 概率密度函数的参数估计.pdf
- 武汉理工大学:《模式识别》课程授课教案(讲义)第2章 贝叶斯决策理论.pdf
- 武汉理工大学:《模式识别》课程授课教案(讲义)第1章 绪论.pdf
- 武汉理工大学:《模式识别》课程教学大纲 Pattern Recognition(研究生).pdf
- 《高性能计算机网络》课程教学课件(讲义)第十章 大数据之Web典型应用 第56讲 推荐系统简介.pdf
- 《高性能计算机网络》课程教学课件(讲义)第十章 大数据之Web典型应用 第55讲 Lucene信息检索平台.pdf
- 《高性能计算机网络》课程教学课件(讲义)第十章 大数据之Web典型应用 第54讲 Web信息检索简介.pdf
- 《现代计算机体系结构》课程教学课件(留学生版)Lecture 6 Memory Hierarchy and Cache.pdf
- 《现代计算机体系结构》课程教学课件(留学生版)Lecture 4 Spectualtive Execution.pdf
- 《现代计算机体系结构》课程教学课件(留学生版)Lecture 7 Multiprocessors.pdf
- 《现代计算机体系结构》课程教学课件(留学生版)Lecture 2 Instruction Set Architecture(Microarchitecture Implementation).pdf
- 《现代计算机体系结构》课程教学课件(留学生版)Lecture 3 Pipelining.pdf
- 《现代计算机体系结构》课程教学课件(留学生版)Lecture 0 Introduction and Performance Evaluation.pdf
- 《现代计算机体系结构》课程教学课件(留学生版)Lecture 1 Instruction Set Architecture(Introduction).pdf
- 《现代计算机体系结构》课程教学课件(英文讲稿)Lecture 12 Shared Memory Multiprocessor.pdf
- 《现代计算机体系结构》课程教学课件(英文讲稿)Lecture 15 GPGPU Architecture and Programming Paradigm.pdf
- 《现代计算机体系结构》课程教学课件(英文讲稿)Lecture 14 Towards Renewable Energy Powered Sustainable and Green Cloud Datacenters.pdf
- 《现代计算机体系结构》课程教学课件(英文讲稿)Lecture 11 Multi-core and Multi-threading.pdf
- 《现代计算机体系结构》课程教学课件(英文讲稿)Lecture 10 Out of Order and Speculative Execution.pdf
- 《现代计算机体系结构》课程教学课件(英文讲稿)Lecture 13 An Introduction to Cloud Data Centers.pdf
- 《现代计算机体系结构》课程教学课件(英文讲稿)Lecture 09 Case Study- Jave Branch Prediction Optimization.pdf
- 《现代计算机体系结构》课程教学课件(英文讲稿)Lecture 07 Instruction Decode.pdf
- 《现代计算机体系结构》课程教学课件(英文讲稿)Lecture 08 Instruction Fetch and Branch Predictioin.pdf
- 《现代计算机体系结构》课程教学课件(英文讲稿)Lecture 06 Scoreboarding and Tomasulo.pdf
- 《现代计算机体系结构》课程教学课件(英文讲稿)Lecture 04 Memory Data Prefetching.pdf
- 《现代计算机体系结构》课程教学课件(英文讲稿)Lecture 05 Core Pipelining.pdf
- 《现代计算机体系结构》课程教学课件(英文讲稿)Lecture 02 Memory Hierarchy and Caches.pdf