中国高校课件下载中心 》 教学资源 》 大学文库

《系统软件与软件安全》课程教学课件(PPT讲稿,英文)Lecture-4-LSbM-tree

文档信息
资源类别:文库
文档格式:PPTX
文档页数:42
文件大小:1.45MB
团购合买:点击进入团购
内容简介
《系统软件与软件安全》课程教学课件(PPT讲稿,英文)Lecture-4-LSbM-tree
刷新页面文档预览

LSbM-tree:一个读写兼优的大数据存储结构

LSbM-tree:一个读写兼优的大数据存储结构 1

2MajorDataFormatsinStorageSystems: Sequentially archived data- Indexed data, e.g. sorted data by a defined key, ...- Read/write largely by B+-tree and LSM-tree.Relational tables- Structured data formats for relational databases, e.g. MySQL- Read/write operations by relational algebra/calculus·Key-value store-Apair ofkey/value fora data item, e.g.redis, memcached-Read/write: request -> index -> fetching data·Graph-databases? Free-style text files-A file may be retrieved by KV-store, indexed directory,

Major Data Formats in Storage Systems • Sequentially archived data – Indexed data, e.g. sorted data by a defined key, . – Read/write largely by B+-tree and LSM-tree • Relational tables – Structured data formats for relational databases, e.g. MySQL – Read/write operations by relational algebra/calculus • Key-value store – A pair of key/value for a data item, e.g. redis, memcached – Read/write: request -> index –> fetching data • Graph-databases • Free-style text files – A file may be retrieved by KV-store, indexed directory, . 2

3New Challenges to AccessPerformance in Big Data.Sequentially archived data-Canwemassivelyprocessbothreadsandwritesconcurrently?-butLSM-treefavorswritesandB+-treefavorsreads.Relationaltables- Tables must partitioned/placed among many nodes,e.g.Apache Hive- How to minimize data transfers among nodes and from local disks?·Key-valuestore-Keyindexingbecomesabottleneckas#concurrentrequestsincrease-Howtoacceleratedataaccessesforin-memorykey-valuestore?

New Challenges to Access Performance in Big Data • Sequentially archived data – Can we massively process both reads and writes concurrently? – but LSM-tree favors writes and B+-tree favors reads • Relational tables – Tables must partitioned/placed among many nodes, e.g. Apache Hive – How to minimize data transfers among nodes and from local disks? • Key-value store – Key indexing becomes a bottleneck as # concurrent requests increase – How to accelerate data accesses for in-memory key-value store? 3

Fast Accesses to Sequentially/ArchivedDatain both memory and disks

Fast Accesses to Sequentially Archived Data in both memory and disks 4

What is LSM-tree?It is a Log-structured merge-tree (1996):Multiple levels of sorted data, (e.g. each by a B+ tree)Each level increases exponentially, forming a “pyramid'The smallest level is in memory and the rest on diskCoDRAMHDDC1C2

What is LSM-tree? It is a Log-structured merge-tree (1996): • Multiple levels of sorted data, (e.g. each by a B+ tree) • Each level increases exponentially, forming a “pyramid” • The smallest level is in memory and the rest on disk C0 C1 C2 C0 C1 C2 HDD DRAM 5

LsM-treeiswidely usedinproduction systemsAPACHGoogleBigTableHBASEDTANLBSTMUEUROcassandraRocksDBMariaDBlevelDBb

LSM-tree is widely used in production systems 6 RocksDB

Why Log-structured merge-tree(LSM-tree)?B+treeInsert2.5- In-placeupdateseekRandoml/Os-Lowwritethroughput2.5dddad,LSM-treewrites- Log-structuredupdate-Merge/compactionforsortingDRAM-Sequentiall/OsHDDmerges-Highwritethroughput

Why Log-structured merge-tree (LSM-tree)? • B+ tree – In-place update – Random I/Os – Low write throughput • LSM-tree – Log-structure for sequential I/O – Merge for sorting – Batched I/O operations – High write throughput 7 2.5 d2 ’ seek Insert 2.5 • LSM-tree – Log-structured update – Merge/compaction for sorting – Sequential I/Os – High write throughput 7 writes merges HDD DRAM

8SQLite Experiences of LSM-tree by Richard HippLog StructuredMerge(LSM)BadGood-Slowerreads.FasterwritesBackgroundmergeReduced writeprocessamplificationMorespaceondisk-Linearwrites.Greatercomplexity.LessSSDwear

SQLite Experiences of LSM-tree by Richard Hipp 8

AbuffercacheproblemreportedbyCassandrain2014CASSANDRA-6916-readoaC*2.1 beta1 vs beta2110,000effective caching100,00090,00080.00070,00060.00050,00warmingup40.00030,00020,000disenabled caching10,000250t5o20010060Total operation time (seconds)*https://www.datastax.com/dev/blog/compaction-improvements-in-cassandra-219

A buffer cache problem reported by Cassandra in 2014 9 *https://www.datastax.com/dev/blog/compaction-improvements-in-cassandra-21 warming up effective caching disenabled caching

BasicFunctionsofBufferCacheBuffer cache is in DRAM or other fast devicesData entries in buffer cache refer to disk blocks (page)Cache the frequently read disk blocks for reuseBuffer cache.disk2a10

Basic Functions of Buffer Cache • Buffer cache is in DRAM or other fast devices • Data entries in buffer cache refer to disk blocks (page) • Cache the frequently read disk blocks for reuse 10 b c a disk Buffer cache a c b

刷新页面下载完整文档
VIP每日下载上限内不扣除下载券和下载次数;
按次数下载不扣除下载券;
注册用户24小时内重复下载只扣除一次;
顺序:VIP每日次数-->可用次数-->下载券;
相关文档