
Modern Computer Architecture: course slides (English lecture notes), Lecture 10: Out of Order and Speculative Execution


Advanced Computer Architecture Design and Its Applications in Data Centers and Cloud Computing
Lecture 10: Speculation and Traps in Out-of-Order Cores

What is wrong with Tomasulo's algorithm?
• Branch instructions
  – Need branch prediction to guess what to fetch next
  – Need speculative execution to "clean up" wrong guesses
• Exceptions and traps ("software" interrupts)
  – Need to handle uncommon execution cases
    • Jump to a software handler
  – Should follow the insn. on which they were triggered
  – Often referred to as precise interrupts
Root issue: we don't know the relative order of instructions in the RS

Speculation and Precise Interrupts
• When a branch is mis-speculated by the predictor
  – Must reset state (e.g., regs) to the time of the branch
• Sequential semantics for interrupts
  – All insns. before the interrupt should be complete
  – All insns. after the interrupt should look as if never started (abort)
• What makes this difficult?
  – Younger insns. finish before the branch → must undo writebacks
  – Older insns. not done when a young branch resolves → must wait
    • Older insn. takes a page fault or divide by zero → forget the branch
Same problem → same solution

Precise State
• Speculative execution requires
  – (Ability to) abort & restart at every branch
  – Abort & restart at every load (covered in later lecture)
• Synchronous (exception and trap) events require
  – Abort & restart at every load, store, divide, ...
• Asynchronous (hardware) interrupts require
  – Abort & restart at every ??
• Real world: bite the bullet
  – Implement abort & restart at every insn.
  – Called precise state

Precise State Implementation Options
• Imprecise state: ignore the problem!
  – Makes page faults (any restartable exceptions) difficult
  – Makes speculative execution practically impossible
• Force in-order completion (W): stall pipe if necessary
  – Slow (takes away the benefit of out-of-order)
• Keep track of precise state in hardware
  – Reset current state from precise state when needed
Everything is better in hardware

Out-of-Order Topics
• "Scoreboarding"
  – First OoO, no register renaming
• "Tomasulo's algorithm"
  – OoO with register renaming
• Handling precise state and speculation
  – P6-style execution (Intel Pentium Pro)
  – R10k-style execution (MIPS R10k)
• Handling memory dependencies

The Problem with Precise State
[Figure: pipeline diagram showing I$, BP, insn buffer, regfile, L1-D]
• Problem: writeback combines two functions
  – Forward values to younger insns.: out-of-order is OK
  – Write values to registers: needs to be in order
• Similar solution as for OoO decode
  – Split writeback into two stages
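To make the problem concrete, here is a minimal sketch, not taken from the slides, of what happens when a single writeback stage updates the register file in completion order. The three-instruction program, its latencies, and the function names `run_single_stage_writeback` and `precise_state` are all hypothetical, and the sketch assumes every instruction starts executing in the same cycle.

```python
# Hypothetical program; values and latencies are illustrative only.
# Each entry: (name, dest register, result, latency in cycles, faults?)
PROGRAM = [
    ("div", "r1", None, 4, True),   # oldest insn.: divide by zero
    ("add", "r2", 7, 1, False),     # younger, independent, finishes early
    ("sub", "r3", 9, 1, False),     # younger, independent, finishes early
]

def run_single_stage_writeback(program):
    """Write each result to the register file as soon as it finishes.

    Returns the register file at the moment the fault is detected and the
    program-order index of the faulting insn. (None if nothing faults).
    """
    regfile = {"r1": 0, "r2": 0, "r3": 0}
    # Completion order: shortest latency first, ties broken by age.
    by_finish = sorted(enumerate(program), key=lambda p: (p[1][3], p[0]))
    for idx, (_name, dest, value, _latency, faults) in by_finish:
        if faults:
            return regfile, idx
        regfile[dest] = value
    return regfile, None

def precise_state(program, fault_idx):
    """Register file that sequential semantics requires at the fault:
    only insns. older than the faulting one have taken effect."""
    regfile = {"r1": 0, "r2": 0, "r3": 0}
    for _name, dest, value, _latency, _faults in program[:fault_idx]:
        regfile[dest] = value
    return regfile

actual, fault_idx = run_single_stage_writeback(PROGRAM)
expected = precise_state(PROGRAM, fault_idx)
print(actual)    # younger writes to r2 and r3 are already visible
print(expected)  # no register should have changed
```

In this sketch the two younger instructions have already modified `r2` and `r3` by the time the older divide faults, which is exactly the state that would have to be undone.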

Re-Order Buffer (ROB)
[Figure: pipeline diagram showing I$, BP, Re-Order Buffer (ROB), regfile, L1-D]
• Insn. buffer → Re-Order Buffer (ROB)
  – Buffer completed results en route to register file
  – Can be merged with RS (RUU) or separate (common today)
• Split writeback (W) into two stages
  – Why is there no latch between W1 and W2?

Complete and Retire
[Figure: pipeline diagram showing I$, BP, Re-Order Buffer (ROB), regfile, L1-D, with stages C and R]
• Complete (C): insns. write results into ROB
  – Out-of-order: don't block younger insns.
• Retire (R): a.k.a. commit, graduate
  – ROB writes results to register file
  – In-order: stall back-propagates to younger insns.
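The split between out-of-order complete and in-order retire can be sketched in a few lines. This is a toy model under simplifying assumptions, not the hardware design from the lecture: the class name `ROB`, its methods, and the dictionary-based entries are hypothetical, and a fault is handled simply by discarding every younger entry when the faulting instruction reaches the head.

```python
from collections import deque

class ROB:
    """Toy re-order buffer: a FIFO of entries kept in program order."""

    def __init__(self):
        self.entries = deque()   # left end = head (oldest), right end = tail
        self.regfile = {}        # architectural (precise) register state

    def dispatch(self, dest):
        """Allocate an entry at the tail, in program order."""
        entry = {"dest": dest, "value": None, "done": False, "fault": False}
        self.entries.append(entry)
        return entry

    def complete(self, entry, value=None, fault=False):
        """C stage: record a result in the ROB; may happen in any order."""
        entry["value"] = value
        entry["fault"] = fault
        entry["done"] = True

    def retire(self):
        """R stage: drain finished entries from the head, in order.

        Stops at the first unfinished insn., so a slow old insn. stalls
        everything younger. Returns True if a faulting insn. reached the
        head, in which case all younger entries are discarded.
        """
        while self.entries and self.entries[0]["done"]:
            head = self.entries.popleft()
            if head["fault"]:
                self.entries.clear()   # abort every younger insn.
                return True
            self.regfile[head["dest"]] = head["value"]
        return False

rob = ROB()
old = rob.dispatch("r1")
young = rob.dispatch("r2")
rob.complete(young, 7)          # younger insn. completes first
rob.retire()                    # nothing retires: head is not done yet
rob.complete(old, fault=True)   # older insn. faults
trapped = rob.retire()          # fault reaches the head, younger squashed
```

Because the younger result only ever lived in the ROB, the register file is untouched when the fault is taken, which is the property the slides call precise state.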

P6 Data Structures
• P6: Start with Tomasulo's algorithm... add ROB
• ROB (separate from RS)
  – head, tail: pointers maintain sequential order
  – R: insn. output register, V: insn. output value
• Tags are different
  – Tomasulo: RS# → P6: ROB#
• Map Table is different
  – T+: tag + "ready-in-ROB" bit
  – T==0 → Value is ready in register file
  – T!=0 → Value is not ready
  – T!=0+ → Value is ready in the ROB
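The three Map Table cases above can be read as a lookup rule for source operands. The sketch below is one possible encoding and is not taken from the slides: the function name `read_operand`, the `(tag, ready_in_rob)` tuple, and the dictionary layout of the ROB are assumptions made for illustration. ROB entries are numbered from 1 so that tag 0 can mean "no pending producer", matching the T==0 case.

```python
def read_operand(reg, map_table, regfile, rob):
    """Resolve a source register using the T+ encoding.

    Returns ("value", v) when the operand can be read now, or
    ("tag", rob_no) when the consumer must wait for that ROB entry.
    """
    # A register with no Map Table entry has no in-flight producer.
    tag, ready_in_rob = map_table.get(reg, (0, False))
    if tag == 0:
        return ("value", regfile[reg])    # T == 0: value is in the regfile
    if ready_in_rob:
        return ("value", rob[tag]["V"])   # T != 0+: value is in the ROB
    return ("tag", tag)                   # T != 0: value is not ready

# Hypothetical machine state for illustration.
regfile = {"r1": 10, "r2": 20, "r3": 30}
rob = {
    1: {"R": "r2", "V": None},   # producer of r2 has not completed
    2: {"R": "r3", "V": 99},     # producer of r3 completed, not yet retired
}
map_table = {"r2": (1, False), "r3": (2, True)}

print(read_operand("r1", map_table, regfile, rob))
print(read_operand("r2", map_table, regfile, rob))
print(read_operand("r3", map_table, regfile, rob))
```

Note that for `r3` the register file still holds the old value 30; the lookup has to go to the ROB because the new value has completed but not yet retired.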
