"Modern Computer Architecture" course slides (English lecture notes): Lecture 11, Multi-core and Multi-threading

Document information
Resource category: Document library
Document format: PDF
Pages: 36
File size: 2.84MB

Advanced Computer Architecture Design and Its Applications in Data Centers and Cloud Computing
Lecture 11: Multi-{Socket,Core,Thread}

Getting More Performance
• Keep pushing IPC and/or frequency
  - Design complexity (time to market)
  - Cooling (cost)
  - Power delivery (cost)
  - ...
• Possible, but too costly

Bridging the Gap
[Figure: chart with axis ticks at 1, 10, 100 and label "Watts / IPC", comparing four design points: Single-Issue Pipelined; Superscalar Out-of-Order (Today); Superscalar Out-of-Order (Hypothetical-Aggressive); Limits]
• Power has been growing exponentially as well
• Diminishing returns w.r.t. larger instruction window, higher issue-width

Higher Complexity not Worth Effort
[Figure: Performance versus "Effort" curve through three design points: Scalar In-Order; Moderate-Pipe Superscalar/OOO; Very-Deep-Pipe Aggressive Superscalar/OOO]
• Made sense to go Superscalar/OOO: good ROI
• Very little gain for substantial effort

User Visible/Invisible
• All performance gains up to this point were "free"
  - No user intervention required (beyond buying new chip)
    • Recompilation/rewriting could provide even more benefit
  - Higher frequency & higher IPC
  - Same ISA, different micro-architecture
• Multi-processing pushes parallelism above ISA
  - Coarse grained parallelism
    • Provide multiple processing elements
  - User (or developer) responsible for finding parallelism
    • User decides how to use resources

Sources of (Coarse) Parallelism
• Different applications
  - MP3 player in background while you work in Office
  - Other background tasks: OS/kernel, virus check, etc.
  - Piped applications
    • gunzip -c foo.gz | grep bar | perl some-script.pl
• Threads within the same application
  - Java (scheduling, GC, etc.)
  - Explicitly coded multi-threading
    • pthreads, MPI, etc.
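The piped-application case on this slide can be imitated inside a single program with explicitly coded threads. The following is a minimal sketch, not taken from the lecture: each pipeline stage runs in its own thread and passes items to the next stage through a queue, much as a shell pipe connects processes. The names run_pipeline, grep_bar, and shout are hypothetical and exist only for this illustration.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def _stage(fn, inbox, outbox):
    """Run one stage: apply fn to each item and drop items where fn returns None."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)  # forward the end-of-stream marker downstream
            return
        result = fn(item)
        if result is not None:
            outbox.put(result)


def run_pipeline(lines, stages):
    """Feed lines through the stage functions, one thread per stage."""
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=_stage, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for line in lines:
        queues[0].put(line)
    queues[0].put(_DONE)

    out = []
    while True:
        item = queues[-1].get()
        if item is _DONE:
            break
        out.append(item)
    for t in threads:
        t.join()
    return out


def grep_bar(line):
    """Hypothetical filter stage, loosely analogous to `grep bar`."""
    return line if "bar" in line else None


def shout(line):
    """Hypothetical transform stage standing in for a post-processing script."""
    return line.upper()
```

A call such as run_pipeline(["foo", "bar baz"], [grep_bar, shout]) passes only the matching line to the second stage. The stages are independent units of work that an OS could place on different CPUs; whether they actually execute in parallel depends on the language runtime and the scheduler.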

SMP Machines
• SMP = Symmetric Multi-Processing
  - Symmetric = All CPUs have "equal" access to memory
• OS sees multiple CPUs
  - Runs one process (or thread) on each CPU
[Figure: four processors labeled CPU0, CPU1, CPU2, CPU3]
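From a program's point of view, "the OS sees multiple CPUs" shows up as a CPU count that software can query. The sketch below is an illustration under assumptions, not part of the lecture: it asks the OS for its logical CPU count and submits one job per reported CPU to a thread pool. The helper names visible_cpus and run_one_job_per_cpu are hypothetical.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def visible_cpus():
    """Number of logical CPUs the OS reports (falls back to 1 if unknown)."""
    return os.cpu_count() or 1


def run_one_job_per_cpu(job):
    """Submit one job per reported CPU to a pool with at most that many threads.

    The OS scheduler, not this code, decides which CPU runs each thread.
    """
    n = visible_cpus()
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(job, range(n)))
```

Sizing the pool to the reported CPU count mirrors the slide's "one process (or thread) on each CPU", though the actual placement of threads onto CPUs is left to the operating system.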

MP Workload Benefits
[Figure: runtime bars for Task A and Task B on four configurations: one 3-wide OOO CPU; one 4-wide OOO CPU; two 3-wide OOO CPUs; two 2-wide OOO CPUs. The runtime saved relative to the single 3-wide CPU is marked "Benefit"]
• Assumes you have multiple tasks/programs to run

... If Only One Task Available
[Figure: runtime bars for Task A alone on four configurations: one 3-wide OOO CPU; one 4-wide OOO CPU (marked "Benefit"); two 3-wide OOO CPUs (marked "No benefit over 1 CPU"); two 2-wide OOO CPUs (marked "Performance degradation!"). The unused CPU is labeled "Idle"]
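The two runtime comparisons (several tasks versus a single task) can be reproduced with a toy model. This is a sketch under loudly stated assumptions: the relative core speeds in CORE_SPEED are invented for illustration and are not measurements from the lecture, and the function makespan is a hypothetical helper that uses simple greedy scheduling with no task migration.

```python
# Assumed relative throughput per core type: illustrative values only, chosen
# to show diminishing returns from wider issue. They are not measured data.
CORE_SPEED = {"2-wide": 1.0, "3-wide": 1.3, "4-wide": 1.5}


def makespan(task_work, cores):
    """Total runtime when each task runs start-to-finish on a single core.

    Greedy schedule: take tasks from largest to smallest and give each one to
    the core that would finish it earliest.
    """
    finish = [0.0] * len(cores)
    for work in sorted(task_work, reverse=True):
        best = min(
            range(len(cores)),
            key=lambda i: finish[i] + work / CORE_SPEED[cores[i]],
        )
        finish[best] += work / CORE_SPEED[cores[best]]
    return max(finish)
```

Under these assumed speeds, two equal tasks finish sooner on two 3-wide cores than on one 4-wide core, while a single task gains nothing from a second 3-wide core and runs slower on a pair of 2-wide cores than on one 3-wide core, matching the annotations in the figures.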

Benefit of MP Depends on Workload
• Limited number of parallel tasks to run on PC
  - Adding more CPUs than tasks provides zero benefit
• For parallel code, Amdahl's law curbs speedup
[Figure: runtime bars for 1 CPU, 2 CPUs, 3 CPUs, and 4 CPUs, with the "parallelizable" portion labeled]
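Amdahl's law can be stated as speedup = 1 / ((1 - p) + p / n), where p is the parallelizable fraction of the runtime and n is the number of CPUs. The sketch below evaluates it; the function name amdahl_speedup and the sample fraction p = 0.5 are illustrative assumptions, not values from the slide.

```python
def amdahl_speedup(parallel_fraction, n_cpus):
    """Speedup over one CPU when only parallel_fraction of the work scales."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_cpus < 1:
        raise ValueError("n_cpus must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cpus)


if __name__ == "__main__":
    # Assumed example: half of the runtime is parallelizable.
    for n in (1, 2, 3, 4):
        print(n, "CPU(s):", round(amdahl_speedup(0.5, n), 3))
```

With p = 0.5 the speedup can never reach 2 regardless of the CPU count, because the serial half of the runtime is untouched by extra processors.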
