MSCIT 5210/MSCBD 5002:Knowledge Discovery and Data Mining:Chapter 4:Data Warehousing, On-line Analytical Processing and Data Cube

MSCIT 5210/MSCBD 5002: Knowledge Discovery and Data Mining Acknowledgement: Slides modified by dr Lei chen based on the slides provided by jiawei Han micheline Kamber and Jian pei @2012 Han, Kamber pei. all rights reserved
1 1 MSCIT 5210/MSCBD 5002: Knowledge Discovery and Data Mining Acknowledgement: Slides modified by Dr. Lei Chen based on the slides provided by Jiawei Han, Micheline Kamber, and Jian Pei © 2012 Han, Kamber & Pei. All rights reserved

Chapter 4: Data Warehousing, On-line Analytical Processing and Data Cube Data Warehouse Basic Concepts a Data Warehouse Modeling: Data Cube and OLAP Data Cube Computation: Preliminary Concepts Data Cube Computation Methods Summary
2 Chapter 4: Data Warehousing, On-line Analytical Processing and Data Cube ◼ Data Warehouse: Basic Concepts ◼ Data Warehouse Modeling: Data Cube and OLAP ◼ Data Cube Computation: Preliminary Concepts ◼ Data Cube Computation Methods ◼ Summary

Aspects of SQL Most common Query language -used in all commercial systems Discussion is based on the SQL92 Standard Commercial products have different features of SQL, but the basic structure is the same Data Manipulation Language Data Definition Language Constraint Specification Embedded SQL Transaction Management Security Management COMP3311 Fall 2011 CSE, HKUST Slide 3
COMP3311 Fall 2011 CSE, HKUST Slide 3 Aspects of SQL ▪ Most common Query Language – used in all commercial systems • Discussion is based on the SQL92 Standard. Commercial products have different features of SQL, but the basic structure is the same ▪ Data Manipulation Language ▪ Data Definition Language ▪ Constraint Specification ▪ Embedded SQL ▪ Transaction Management ▪ Security Management

Basic structure SQL is based on set and relational operations with certain modifications and enhancements a typical SQL query has the form select a1,A2,…,An from r1,R,…R where p A represent attributes R represent relations P is a predicate This query is equivalent to the relational algebra expression: A1.2.An(op(R1×R2×…×Rm) The result of an sQl query is a relation(but may contain duplicates). SQL statements can be nested COMP3311 Fall 2011 CSE, HKUST Slide 4
COMP3311 Fall 2011 CSE, HKUST Slide 4 Basic Structure • SQL is based on set and relational operations with certain modifications and enhancements • A typical SQL query has the form: select A1 , A2 , …, An from R1 , R2 , …, Rm where P - Ai represent attributes - Ri represent relations - P is a predicate. • This query is equivalent to the relational algebra expression: A1, A2, …, An(P (R1 R2 … Rm)) • The result of an SQL query is a relation (but may contain duplicates). SQL statements can be nested

Projection The select clause corresponds to the projection operation of the relational algebra. It is used to list the attributes desired in the result of a query Find the names of all branches in the loan relation select branch-name from loan Equivalent to: I lbranch-name (loan) An asterisk in the select clause denotes all attributes select from loan Note: for our examples we use the tables: Branch(branch-name, branch-city, assets) Customer(customer-name, customer-street, customer-city) Loan(loan- number, amount, branch-name) Account(account-number, balance branch-name Borrower(customer-name, loan-number) Depositor(customer-name account-number COMP3311 Fall 2011 CSE, HKUST Slide 5
COMP3311 Fall 2011 CSE, HKUST Slide 5 Projection • The select clause corresponds to the projection operation of the relational algebra. It is used to list the attributes desired in the result of a query. • Find the names of all branches in the loan relation select branch-name from loan Equivalent to: branch-name(loan) • An asterisk in the select clause denotes “all attributes” select * from loan • Note: for our examples we use the tables: – Branch (branch-name, branch-city, assets) – Customer (customer-name, customer-street, customer-city) – Loan (loan-number, amount, branch-name) – Account (account-number, balance, branch-name) – Borrower (customer-name, loan-number) – Depositor (customer-name, account-number)

Duplicate Removal SQL allows duplicates in relations as well as in query results. Use select distinct to force the elimination of duplicates Find the names of all branches in the loan relation and remove duplicates select distinct branch-name force the dbms to remove duplicates from loan The keyword all specifies that duplicates are not removed select all branch-name force the dbms not from loan to remove duplicates COMP3311 Fall 2011 CSE, HKUST Slide 6
COMP3311 Fall 2011 CSE, HKUST Slide 6 Duplicate Removal • SQL allows duplicates in relations as well as in query results. Use select distinct to force the elimination of duplicates. Find the names of all branches in the loan relation, and remove duplicates select distinct branch-name from loan • The keyword all specifies that duplicates are not removed. select all branch-name from loan force the DBMS to remove duplicates force the DBMS not to remove duplicates

Arithmetic Operations on Retrieved Results The select clause can contain arithmetic expressions involving the operators tr- and x, and operating on constants or attributes of tuples The query: select branch-name, loan-number amount 100 from loan would return a relation which is the same as the loan relations except that the attribute amount is multiplied by 100 COMP3311 Fall 2011 CSE, HKUST Slide 7
COMP3311 Fall 2011 CSE, HKUST Slide 7 Arithmetic Operations on Retrieved Results • The select clause can contain arithmetic expressions involving the operators,+,−, and , and operating on constants or attributes of tuples. • The query: select branch-name, loan-number, amount * 100 from loan would return a relation which is the same as the loan relations, except that the attribute amount is multiplied by 100

The where clause The where clause specifies conditions that tuples in the relations in the from clause must satisfy Find all loan numbers for loans made at the perryridge branch with loan amounts greater than $1200 select loan-number from /oan where branch-name="perryridge"and amount>1200 SQL allows logical connectives and or, and not. arithmetic expressions can be used in the comparison operators Note: attributes used in a query(both select and where parts) must be defined in the relations in the from clause COMP3311 Fall 2011 CSE, HKUST Slide 8
COMP3311 Fall 2011 CSE, HKUST Slide 8 The where Clause • The where clause specifies conditions that tuples in the relations in the from clause must satisfy. • Find all loan numbers for loans made at the Perryridge branch with loan amounts greater than $1200. select loan-number from loan where branch-name=“Perryridge” and amount >1200 • SQL allows logical connectives and, or, and not. Arithmetic expressions can be used in the comparison operators. • Note: attributes used in a query (both select and where parts) must be defined in the relations in the from clause

The where Clause( cont SQL includes the between operator for convenience Find the loan number of those loans with loan amounts between 90,000and$100.000 that is,≥$90,000and≤$100,000) select loan-number from loan Where amount between 90000 and 100000 COMP3311 Fall 2011 CSE, HKUST Slide 9
COMP3311 Fall 2011 CSE, HKUST Slide 9 The where Clause (Cont.) • SQL includes the between operator for convenience. • Find the loan number of those loans with loan amounts between $90,000 and $100,000 (that is, $90,000 and $100,000) select loan-number from loan where amount between 90000 and 100000

The from Clause The from clause corresponds to the cartesian product operation of the relational algebra Find the cartesian product borrower x loan select x from borrower loan It is rarely used without a where clause Find the name and loan number of all customers having a loan at the perryridge branch select distinct customer-name borrower loan - number from borrower, loan where borrower,loan-number=loan, Joan-number and branch-name ="Perryridge COMP3311 Fall 2011 CSE, HKUST Slide 10
COMP3311 Fall 2011 CSE, HKUST Slide 10 The from Clause • The from clause corresponds to the Cartesian product operation of the relational algebra. • Find the Cartesian product borrower loan select * from borrower, loan It is rarely used without a where clause. • Find the name and loan number of all customers having a loan at the Perryridge branch. select distinct customer-name, borrower.loan-number from borrower, loan where borrower.loan-number = loan.loan-number and branch-name = “Perryridge
按次数下载不扣除下载券;
注册用户24小时内重复下载只扣除一次;
顺序:VIP每日次数-->可用次数-->下载券;
- 香港中文大学:Achieving Secure and Cooperative Wireless Networks with Trust Modeling and Game Theory.ppt
- 北京大学:《项目成本管理》课程教学资源(PPT课件讲稿)项目范围计划(主讲:周立新).ppt
- 《网上开店实务》课程教学资源(PPT讲稿)学习情境3 网店装修.ppt
- 中国科学技术大学:Linux内核源代码导读(PPT讲稿,陈香兰).ppt
- 《The C++ Programming Language》课程教学资源(PPT课件讲稿)Lecture 04 Object-Based Programming.ppt
- 北京航空航天大学:SimplyDroid - Efficient Event Sequence Simplification for Android Application.pptx
- 南京大学:《计算机图形学》课程教学资源(PPT课件讲稿)第7讲 图元填充与裁剪算法.pptx
- 香港浸会大学:Introduction to Linux and PC Cluster.ppt
- 《C语言程序设计》课程教学资源(PPT课件讲稿)第8章 结构体、共用体与枚举类型.ppt
- 《计算机系统安全》课程教学资源(PPT课件讲稿)第二章 黑客常用的系统攻击方法.ppt
- 上海交通大学:《挖掘海量数据集 Mining Massive Datasets》课程教学资源(PPT讲稿)Lecture 06 搜索引擎 Search Engines.ppt
- 《Introduction to Java Programming》课程PPT教学课件(Sixth Edition)Chapter 16 Applets and Multimedia.ppt
- 《计算机组装与维护》课程教学资源(PPT课件讲稿)第9章 BIOS设置(设置BIOS).ppt
- 香港城市大学:基序检测的随机化算法(PPT讲稿)Randomized Algorithm for Motif Detection.ppt
- 《数据结构》课程教学资源(PPT课件讲稿)第七章 图及其应用.ppt
- 3D Reconstruction from Images:Image-based Street-side City Modeling.ppt
- 大连理工大学:《计算机网络》课程教学资源(PPT课件讲稿)Chapter 2 应用层 application layer.ppt
- 四川大学:《操作系统 Operating System》课程教学资源(PPT课件讲稿)Chapter 3 Process Description and Control 3.4 Process Control 3.5 Execution of the Operating System 3.6 Unix SVR4 Process Management 3.7 Linux Process management system calls.ppt
- 《数据结构》课程教学资源(PPT课件讲稿)第七章 图 Graph.ppt
- 《数据结构》课程教学资源:实践教学大纲.doc
- 《程序设计基础》课程PPT教学课件(C++)第3讲 C++程序控制结构.ppt
- 四川大学:《数据库技术》课程教学资源(PPT课件讲稿)数据库设计.ppt
- 云计算 Cloud Computing(PPT讲稿)MapReduce进阶.ppt
- 《C语言程序设计》课程电子教案(PPT课件讲稿)第7章 用函数实现模块化程序设计.pptx
- 中国科学技术大学:云计算及安全(PPT讲稿)Cloud Computing & Cloud Security.pptx
- 中国科学技术大学:《信号与图像处理基础 Signal and Image Processing》课程教学资源(PPT课件讲稿)数字图像处理基础 Basics of Digital Image Processing.pptx
- 中国科学技术大学:《信号与图像处理基础 Signal and Image Processing》课程教学资源(PPT课件讲稿)图像压缩编码 Image Compression.pptx
- 《计算机组成原理》课程教学资源(PPT课件讲稿)第3章 计算机的算术运算.pptx
- Analysis of Algorithms(PPT讲稿)Data Structures and Data Management.ppt
- 《C语言程序设计》课程电子教案(PPT课件讲稿)第七章 数组.ppt
- 《计算机网络与因特网 Computer Networks and Internets》课程教学资源(PPT课件讲稿)第二讲 互联网应用软件.ppt
- 《计算机操作系统》课程教学资源(PPT课件讲稿)第四章 存储器管理.ppt
- 《单片机应用技术》课程PPT教学课件(C语言版)第10章 单片机测控接口.ppt
- 中国科技大学计算机系:《黑客反向工程》课程教学资源(PPT课件讲稿)黑客反向工程导论(陈凯明).ppt
- 香港科技大学:Record Linkage for Big Data.pptx
- 沈阳理工大学:《计算机网络》课程教学资源(PPT课件讲稿)第2章 IP技术.ppt
- 《编译技术》课程教学资源(PPT课件讲稿)第六章 运行时存储空间的组织和管理.ppt
- 《面向对象程序设计》课程教学大纲(适用专业:信息与计算科学).pdf
- 《Java Web应用开发技术与案例教程》教学资源(PPT讲稿)第7章 Java Web常用开发模式与案例.ppt
- 程序设计工具(PPT课件讲稿)Software Program Tool.ppt