Chongqing University: Data Warehouse and Data Mining, course PPT slides (English version), Chapter 5 Mining Frequent Patterns, Association and Correlations: Basic Concepts and Methods

Chapter 5: Mining Frequent Patterns, Association and Correlations: Basic Concepts and Methods
◼ Basic Concepts
◼ Frequent Itemset Mining Methods
◼ Which Patterns Are Interesting? Pattern Evaluation Methods
◼ Summary

What Is Frequent Pattern Analysis?
◼ Frequent pattern: a pattern (a set of items, subsequences, substructures, etc.) that occurs frequently in a data set
◼ First proposed by Agrawal, Imielinski, and Swami [AIS93] in the context of frequent itemsets and association rule mining
◼ Motivation: finding inherent regularities in data
    ◼ What products are often purchased together? Beer and diapers?!
    ◼ What are the subsequent purchases after buying a PC?
    ◼ What kinds of DNA are sensitive to this new drug?
    ◼ Can we automatically classify web documents?
◼ Applications
    ◼ Basket data analysis, cross-marketing, catalog design, sales campaign analysis, Web log (click stream) analysis, and DNA sequence analysis

Association Rule Mining
◼ Given a set of transactions, find rules that predict the occurrence of an item based on the occurrences of other items in the transaction

Market-basket transactions:
TID   Items
1     Bread, Milk
2     Bread, Diaper, Beer, Eggs
3     Milk, Diaper, Beer, Coke
4     Bread, Milk, Diaper, Beer
5     Bread, Milk, Diaper, Coke

Examples of association rules:
    {Diaper} → {Beer}
    {Milk, Bread} → {Eggs, Coke}
    {Beer, Bread} → {Milk}

Implication means co-occurrence, not causality!

Why Is Frequent Pattern Mining Important?
◼ Frequent pattern: an intrinsic and important property of datasets
◼ Foundation for many essential data mining tasks
    ◼ Association, correlation, and causality analysis
    ◼ Sequential, structural (e.g., sub-graph) patterns
    ◼ Pattern analysis in spatiotemporal, multimedia, time-series, and stream data
    ◼ Classification: discriminative, frequent pattern analysis
    ◼ Cluster analysis: frequent pattern-based clustering
    ◼ Data warehousing: iceberg cube and cube-gradient
    ◼ Semantic data compression: fascicles
◼ Broad applications

Basic Concepts: Frequent Patterns
◼ Itemset: a set of one or more items
◼ k-itemset: X = {x1, …, xk}
◼ (Absolute) support, or support count, of X: the frequency (number of occurrences) of itemset X
◼ (Relative) support, s: the fraction of transactions that contain X (i.e., the probability that a transaction contains X)
◼ An itemset X is frequent if X's support is no less than a minsup threshold

Tid   Items bought
10    Beer, Nuts, Diaper
20    Beer, Coffee, Diaper
30    Beer, Diaper, Eggs
40    Nuts, Eggs, Milk
50    Nuts, Coffee, Diaper, Eggs, Milk

[Figure: Venn diagram of customers who buy beer, customers who buy diapers, and customers who buy both]
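The definitions above can be sketched in a few lines of Python. This is a minimal illustration rather than anything taken from the slides: the function names `support_count`, `relative_support`, and `is_frequent` are hypothetical, and only the transaction table comes from the slide.

```python
from typing import FrozenSet, Iterable, List

# Transactions from the slide's table (one frozenset per Tid).
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"Beer", "Nuts", "Diaper"}),                    # Tid 10
    frozenset({"Beer", "Coffee", "Diaper"}),                  # Tid 20
    frozenset({"Beer", "Diaper", "Eggs"}),                    # Tid 30
    frozenset({"Nuts", "Eggs", "Milk"}),                      # Tid 40
    frozenset({"Nuts", "Coffee", "Diaper", "Eggs", "Milk"}),  # Tid 50
]


def support_count(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> int:
    """Absolute support: number of transactions that contain every item of the itemset."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t)


def relative_support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Relative support: fraction of transactions that contain the itemset."""
    return support_count(itemset, transactions) / len(transactions)


def is_frequent(itemset: Iterable[str], transactions: List[FrozenSet[str]], minsup: float) -> bool:
    """An itemset is frequent if its relative support is no less than minsup."""
    return relative_support(itemset, transactions) >= minsup


if __name__ == "__main__":
    print(support_count({"Beer", "Diaper"}, TRANSACTIONS))      # 3
    print(relative_support({"Beer", "Diaper"}, TRANSACTIONS))   # 0.6
    print(is_frequent({"Nuts", "Diaper"}, TRANSACTIONS, 0.5))   # False
```

Storing each transaction as a `frozenset` makes the containment test a single subset comparison (`items <= t`).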

Basic Concepts: Association Rules
◼ Find all the rules X → Y with minimum support and confidence
    ◼ support, s: probability that a transaction contains X ∪ Y
    ◼ confidence, c: conditional probability that a transaction having X also contains Y

Tid   Items bought
10    Beer, Nuts, Diaper
20    Beer, Coffee, Diaper
30    Beer, Diaper, Eggs
40    Nuts, Eggs, Milk
50    Nuts, Coffee, Diaper, Eggs, Milk

[Figure: Venn diagram of customers who buy beer, customers who buy diapers, and customers who buy both]

Let minsup = 50%, minconf = 50%
Frequent patterns: Beer:3, Nuts:3, Diaper:4, Eggs:3, {Beer, Diaper}:3
◼ Association rules (many more!):
    ◼ Beer → Diaper (60%, 100%)
    ◼ Diaper → Beer (60%, 75%)
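As a rough check of the two rules listed above, the following sketch computes (support, confidence) for a rule X → Y directly from the transaction table. The helper name `rule_metrics` is an assumption made for illustration; it is not part of the slides.

```python
from typing import FrozenSet, Iterable, List, Tuple

# Transactions from the slide's table (Tid 10 to 50).
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"Beer", "Nuts", "Diaper"}),
    frozenset({"Beer", "Coffee", "Diaper"}),
    frozenset({"Beer", "Diaper", "Eggs"}),
    frozenset({"Nuts", "Eggs", "Milk"}),
    frozenset({"Nuts", "Coffee", "Diaper", "Eggs", "Milk"}),
]


def rule_metrics(
    antecedent: Iterable[str],
    consequent: Iterable[str],
    transactions: List[FrozenSet[str]],
) -> Tuple[float, float]:
    """Return (support, confidence) of the rule antecedent -> consequent."""
    x = frozenset(antecedent)
    y = frozenset(consequent)
    n_x = sum(1 for t in transactions if x <= t)         # transactions containing X
    n_xy = sum(1 for t in transactions if (x | y) <= t)  # transactions containing X and Y
    support = n_xy / len(transactions)
    # Confidence is undefined when X never occurs; 0.0 is returned here by convention.
    confidence = n_xy / n_x if n_x else 0.0
    return support, confidence


if __name__ == "__main__":
    print(rule_metrics({"Beer"}, {"Diaper"}, TRANSACTIONS))   # (0.6, 1.0)
    print(rule_metrics({"Diaper"}, {"Beer"}, TRANSACTIONS))   # (0.6, 0.75)
```

Both rules share the same support because they are built from the same itemset {Beer, Diaper}; only the denominator of the confidence changes.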

Basic Concepts: Frequent Patterns and Association Rules
◼ Itemset X = {x1, …, xk}
◼ Find all the rules X → Y with minimum support and confidence
    ◼ support, s: probability that a transaction contains X ∪ Y
    ◼ confidence, c: conditional probability that a transaction having X also contains Y

$$\mathrm{support}(X \rightarrow Y) = P(X \cup Y)$$
$$\mathrm{confidence}(X \rightarrow Y) = P(Y \mid X)$$
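The conditional probability in the confidence formula can be expanded into support counts, which is how it is usually evaluated in practice. In the step below, $P(X \cup Y)$ is read as the probability that a transaction contains every item of both X and Y (not "X or Y"), and `support_count` is a label introduced here for the absolute support; the numeric instance uses the beer and diaper table from the neighbouring slides.

```latex
\begin{align*}
\mathrm{confidence}(X \rightarrow Y)
  &= P(Y \mid X)
   = \frac{P(X \cup Y)}{P(X)}
   = \frac{\mathrm{support\_count}(X \cup Y)}{\mathrm{support\_count}(X)} \\[4pt]
\mathrm{confidence}(\mathrm{Beer} \rightarrow \mathrm{Diaper})
  &= \frac{3}{3} = 100\% \\
\mathrm{confidence}(\mathrm{Diaper} \rightarrow \mathrm{Beer})
  &= \frac{3}{4} = 75\%
\end{align*}
```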

Basic Concepts: Frequent Patterns and Association Rules

Transaction-id   Items bought
10               A, B, D
20               A, C, D
30               A, D, E
40               B, E, F
50               B, C, D, E, F

Let sup_min = 50%, conf_min = 50%
Minimum support count is 3 (50% of 5 transactions, rounded up)
Frequent patterns: {A:3, B:3, D:4, E:3, AD:3}
Association rules (both are strong, since each meets sup_min and conf_min):
    A → D (60%, 100%), i.e. (3/5, 1)
    D → A (60%, 75%), i.e. (3/5, 3/4)

[Figure: Venn diagram of customers who buy beer, customers who buy diapers, and customers who buy both]
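The list of frequent patterns above can be reproduced by exhaustively testing every candidate itemset against the minimum support count. The sketch below assumes that the count threshold is the relative threshold times the number of transactions, rounded up; `frequent_itemsets` is a hypothetical helper name, and exhaustive enumeration is used only because the example is tiny.

```python
from itertools import combinations
from math import ceil
from typing import Dict, Set, Tuple

# Transaction table from the slide (Transaction-id -> items bought).
TRANSACTIONS: Dict[int, Set[str]] = {
    10: {"A", "B", "D"},
    20: {"A", "C", "D"},
    30: {"A", "D", "E"},
    40: {"B", "E", "F"},
    50: {"B", "C", "D", "E", "F"},
}


def frequent_itemsets(
    transactions: Dict[int, Set[str]], minsup: float
) -> Dict[Tuple[str, ...], int]:
    """Return every itemset whose support count reaches ceil(minsup * N)."""
    rows = [frozenset(t) for t in transactions.values()]
    min_count = ceil(minsup * len(rows))      # 50% of 5 transactions -> 3
    items = sorted(set().union(*rows))
    result: Dict[Tuple[str, ...], int] = {}
    for k in range(1, len(items) + 1):
        for candidate in combinations(items, k):
            count = sum(1 for t in rows if set(candidate) <= t)
            if count >= min_count:
                result[candidate] = count
    return result


if __name__ == "__main__":
    for itemset, count in frequent_itemsets(TRANSACTIONS, 0.5).items():
        print(itemset, count)
```

With six distinct items this checks only 63 candidates, but the number of candidates doubles with every additional item, so the approach does not scale.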

Association Rule Mining Task
◼ Given a set of transactions T, the goal of association rule mining is to find all rules having
    ◼ support ≥ minsup threshold
    ◼ confidence ≥ minconf threshold
◼ Brute-force approach:
    ◼ List all possible association rules
    ◼ Compute the support and confidence for each rule
    ◼ Prune rules that fail the minsup and minconf thresholds
⇒ Computationally prohibitive!
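To give a sense of why listing all possible rules is prohibitive, the sketch below enumerates every candidate rule X → Y (X and Y non-empty and disjoint) over the six items of the market-basket table used elsewhere in these slides. The closed form 3^d − 2^(d+1) + 1 is a standard counting argument and is not stated on the slide; the function names are hypothetical.

```python
from itertools import combinations
from typing import Iterable, Iterator, Tuple

Rule = Tuple[Tuple[str, ...], Tuple[str, ...]]


def enumerate_candidate_rules(items: Iterable[str]) -> Iterator[Rule]:
    """Yield every rule X -> Y with X, Y non-empty, disjoint subsets of items."""
    universe = sorted(items)
    for k in range(2, len(universe) + 1):
        for itemset in combinations(universe, k):
            # Every non-empty proper subset of the itemset can serve as antecedent.
            for r in range(1, k):
                for antecedent in combinations(itemset, r):
                    consequent = tuple(i for i in itemset if i not in antecedent)
                    yield antecedent, consequent


def closed_form_rule_count(d: int) -> int:
    """Each item goes to X, Y, or neither (3^d), minus the cases with X or Y empty."""
    return 3 ** d - 2 ** (d + 1) + 1


if __name__ == "__main__":
    items = ["Bread", "Milk", "Diaper", "Beer", "Eggs", "Coke"]
    n_rules = sum(1 for _ in enumerate_candidate_rules(items))
    print(n_rules, closed_form_rule_count(len(items)))   # 602 602
    print(closed_form_rule_count(100))                   # already astronomically large
```

Each of these candidates would also need a pass over the transactions to compute its support and confidence, which is what makes the brute-force approach impractical for realistic item counts.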

Mining Association Rules

TID   Items
1     Bread, Milk
2     Bread, Diaper, Beer, Eggs
3     Milk, Diaper, Beer, Coke
4     Bread, Milk, Diaper, Beer
5     Bread, Milk, Diaper, Coke

Examples of rules:
    {Milk, Diaper} → {Beer}    (s=0.4, c=0.67)
    {Milk, Beer} → {Diaper}    (s=0.4, c=1.0)
    {Diaper, Beer} → {Milk}    (s=0.4, c=0.67)
    {Beer} → {Milk, Diaper}    (s=0.4, c=0.67)
    {Diaper} → {Milk, Beer}    (s=0.4, c=0.5)
    {Milk} → {Diaper, Beer}    (s=0.4, c=0.5)

Observations:
• All the above rules are binary partitions of the same itemset: {Milk, Diaper, Beer}
• Rules originating from the same itemset have identical support but can have different confidence
• Thus, we may decouple the support and confidence requirements
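The observations above can be checked with a short sketch that generates every binary partition of {Milk, Diaper, Beer} and computes support and confidence for each resulting rule. The helper names `count` and `rules_from_itemset` are assumptions made for illustration only.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List, Tuple

# Market-basket transactions from the slide (TID 1 to 5).
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"Bread", "Milk"}),
    frozenset({"Bread", "Diaper", "Beer", "Eggs"}),
    frozenset({"Milk", "Diaper", "Beer", "Coke"}),
    frozenset({"Bread", "Milk", "Diaper", "Beer"}),
    frozenset({"Bread", "Milk", "Diaper", "Coke"}),
]

Rule = Tuple[Tuple[str, ...], Tuple[str, ...], float, float]


def count(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> int:
    """Number of transactions that contain every item of the itemset."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t)


def rules_from_itemset(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> List[Rule]:
    """Split one itemset into all (antecedent, consequent) pairs with their s and c."""
    items = sorted(itemset)
    n_all = count(items, transactions)      # shared numerator for every rule
    support = n_all / len(transactions)     # hence identical support for every rule
    rules: List[Rule] = []
    for r in range(1, len(items)):
        for antecedent in combinations(items, r):
            consequent = tuple(i for i in items if i not in antecedent)
            confidence = n_all / count(antecedent, transactions)
            rules.append((antecedent, consequent, support, confidence))
    return rules


if __name__ == "__main__":
    for x, y, s, c in rules_from_itemset({"Milk", "Diaper", "Beer"}, TRANSACTIONS):
        print(f"{set(x)} -> {set(y)}  s={s:.1f}  c={c:.2f}")
```

Because the support depends only on the full itemset, it can be tested once per itemset before any rule is generated; the confidence is the only quantity that has to be evaluated per partition.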