中国高校课件下载中心 》 教学资源 》 大学文库

《数据库系统概念 Database System Concepts》原书教学资源(第七版,PPT课件讲稿,英文版)Chapter 22 Parallel and Distributed Query Processing

文档信息
资源类别:文库
文档格式:PPTX
文档页数:76
文件大小:2.24MB
团购合买:点击进入团购
内容简介
▪ Overview ▪ Parallel Sort ▪ Parallel Join ▪ Other Operations ▪ Parallel Evaluation of Query Plans ▪ Query Processing on Shared Memory ▪ Query Optimization ▪ Distributed Query Processing
刷新页面文档预览

Chapter 22:Parallel And Distributed Query Processing ■Overview ■Parallel Sort ■Parallel Join ■Other Operations Parallel Evaluation of Query Plans Query Processing on Shared Memory Query Optimization Distributed Query Processing Database System Concepts-7th Edition 22.2 @Silberschatz,Korth and Sudarshan

Database System Concepts - 7 22.2 ©Silberschatz, Korth and Sudarshan th Edition Chapter 22: Parallel And Distributed Query Processing ▪ Overview ▪ Parallel Sort ▪ Parallel Join ▪ Other Operations ▪ Parallel Evaluation of Query Plans ▪ Query Processing on Shared Memory ▪ Query Optimization ▪ Distributed Query Processing

Parallel Query Processing Different queries/transactions can be run in parallel with each other. Interquery parallelism Concurrency control takes care of conflicts in case of updates More on parallel transaction processing in Chapter 23 Focus in this chapter is on read-only queries Individual relational operations (e.g.,sort,join,aggregation)can be executed in parallel data can be partitioned and each processor can work independently on its own partition. Queries are expressed in high level language (SQL,translated to relational algebra) makes parallelization easier. Database System Concepts-7th Edition 22.3 ©Silberscha乜,Korth and Sudarshan

Database System Concepts - 7 22.3 ©Silberschatz, Korth and Sudarshan th Edition Parallel Query Processing ▪ Different queries/transactions can be run in parallel with each other. • Interquery parallelism • Concurrency control takes care of conflicts in case of updates • More on parallel transaction processing in Chapter 23 • Focus in this chapter is on read-only queries ▪ Individual relational operations (e.g., sort, join, aggregation) can be executed in parallel • data can be partitioned and each processor can work independently on its own partition. ▪ Queries are expressed in high level language (SQL, translated to relational algebra) • makes parallelization easier

Intraquery Parallelism Intraquery parallelism:execution of a single query in parallel on multiple processors/disks;important for speeding up long-running queries. Two complementary forms of intraquery parallelism: Intraoperation Parallelism-parallelize the execution of each individual operation in the query Supports high degree of parallelism Interoperation Parallelism-execute the different operations in a query expression in parallel. Limited degree of parallelism Database System Concepts-7th Edition 22.4 ©Silberscha乜,Korth and Sudarshan

Database System Concepts - 7 22.4 ©Silberschatz, Korth and Sudarshan th Edition Intraquery Parallelism ▪ Intraquery parallelism: execution of a single query in parallel on multiple processors/disks; important for speeding up long-running queries. ▪ Two complementary forms of intraquery parallelism: • Intraoperation Parallelism– parallelize the execution of each individual operation in the query ▪ Supports high degree of parallelism • Interoperation Parallelism– execute the different operations in a query expression in parallel. ▪ Limited degree of parallelism

Parallel Processing of Relational Operations Our discussion of parallel algorithms assumes: ·read-only queries shared-nothing architecture ·n nodes,N1,,Nn Each assumed to have disks and processors. Initial focus on parallelization to a shared-nothing node Parallel processing within a shared memory/shared disk node discussed later Shared-nothing architectures can be efficiently simulated on shared- memory and shared-disk systems. Algorithms for shared-nothing systems can thus be run on shared-memory and shared-disk systems. However,some optimizations may be possible. Database System Concepts-7th Edition 22.5 ©Silberscha乜,Korth and Sudarshan

Database System Concepts - 7 22.5 ©Silberschatz, Korth and Sudarshan th Edition Parallel Processing of Relational Operations ▪ Our discussion of parallel algorithms assumes: • read-only queries • shared-nothing architecture • n nodes, N1 , ..., Nn ▪ Each assumed to have disks and processors. • Initial focus on parallelization to a shared-nothing node ▪ Parallel processing within a shared memory/shared disk node discussed later • Shared-nothing architectures can be efficiently simulated on shared￾memory and shared-disk systems. ▪ Algorithms for shared-nothing systems can thus be run on shared-memory and shared-disk systems. ▪ However, some optimizations may be possible

Intraoperation Parallelism Database System Concepts -7th Edition 22.6 @Silberschatz,Korth and Sudarshan

Database System Concepts - 7 22.6 ©Silberschatz, Korth and Sudarshan th Edition Intraoperation Parallelism

Range Partitioning Redistribute using partitioning vector: 100,175,300,500,800 0-100) [100-175)[175-300)[300-500)[500-800)[800-1000) Database System Concepts-7th Edition 22.7 @Silberschatz,Korth and Sudarshan

Database System Concepts - 7 22.7 ©Silberschatz, Korth and Sudarshan th Edition Range Partitioning

Range-Partitioning Parallel Sort 1)Redistribute using partitioning vector: 100.175,300,500.800 0-100) [100-175) [175-300)[300-500)[500-800)[800-1000) 2)(External)sort locally at each node 3)Merge if output required at one node Database System Concepts-7th Edition 22.8 @Silberschatz,Korth and Sudarshan

Database System Concepts - 7 22.8 ©Silberschatz, Korth and Sudarshan th Edition Range-Partitioning Parallel Sort

Parallel Sort Local Sort Local Sort Merge r Local Sort Local Sort Merge Local Sort Local Sort Merge : Local Sort Local Sort Merge Im 1.Range Partition 2.Local Sort 1.Local Sort 2.Range Partition and Merge (a)Range Partitioning Sort (b)Parallel External Sort-Merge Database System Concepts-7th Edition 22.9 @Silberschatz,Korth and Sudarshan

Database System Concepts - 7 22.9 ©Silberschatz, Korth and Sudarshan th Edition Parallel Sort

Parallel Sort Range-Partitioning Sort ■ Choose nodes N,...,Nmm,where m n-1 to do sorting. Create range-partition vector with m-1 entries,on the sorting attributes Redistribute the relation using range partitioning Each node N sorts its partition of the relation locally. Example of data parallelism:each node executes same operation in parallel with other nodes,without any interaction with the others. Final merge operation is trivial:range-partitioning ensures that, if i<j,all key values in node Nare all less than all key values in N Database System Concepts-7th Edition 22.10 ©Silberscha乜,Korth and Sudarshan

Database System Concepts - 7 22.10 ©Silberschatz, Korth and Sudarshan th Edition Parallel Sort Range-Partitioning Sort ▪ Choose nodes N1 , ..., Nm, where m  n -1 to do sorting. ▪ Create range-partition vector with m-1 entries, on the sorting attributes ▪ Redistribute the relation using range partitioning ▪ Each node Ni sorts its partition of the relation locally. • Example of data parallelism: each node executes same operation in parallel with other nodes, without any interaction with the others. ▪ Final merge operation is trivial: range-partitioning ensures that, if i < j, all key values in node Ni are all less than all key values in Nj

Parallel Sort (Cont.) Parallel External Sort-Merge Assume the relation has already been partitioned among nodes N1,...N (in whatever manner). Each node N locally sorts the data (using local disk as required) The sorted runs on each node are then merged in parallel: The sorted partitions at each node Ni are range-partitioned across the processors N1,...,N. Each node Ni performs a merge on the streams as they are received,to get a single sorted run. The sorted runs on nodes N1,...,N are concatenated to get the final result. Algorithm as described vulnerable to execution skew all nodes send to node 1,then all nodes send data to node 2, Can be modified so each node sends data to all other nodes in parallel (block at a time) Database System Concepts-7th Edition 22.11 ©Silberscha乜,Korth and Sudarshan

Database System Concepts - 7 22.11 ©Silberschatz, Korth and Sudarshan th Edition Parallel Sort (Cont.) Parallel External Sort-Merge ▪ Assume the relation has already been partitioned among nodes N1 , ..., Nn (in whatever manner). ▪ Each node Ni locally sorts the data (using local disk as required) ▪ The sorted runs on each node are then merged in parallel: • The sorted partitions at each node Ni are range-partitioned across the processors N1 , ..., Nn . • Each node Ni performs a merge on the streams as they are received, to get a single sorted run. • The sorted runs on nodes N1 , ..., Nn are concatenated to get the final result. ▪ Algorithm as described vulnerable to execution skew • all nodes send to node 1, then all nodes send data to node 2, … • Can be modified so each node sends data to all other nodes in parallel (block at a time)

刷新页面下载完整文档
VIP每日下载上限内不扣除下载券和下载次数;
按次数下载不扣除下载券;
注册用户24小时内重复下载只扣除一次;
顺序:VIP每日次数-->可用次数-->下载券;
相关文档