University of Electronic Science and Technology of China (UESTC): Data Analysis and Data Mining, course teaching resources (lecture slides), Lecture 01 Overview Data Analysis and Data Mining (Xiaoyu Li)

Lecture 1 Overview Data Analysis and Data Mining
Dr. Xiaoyu Li (李晓瑜)
Email: xiaoyuuestc@uestc.edu.cn
http://blog.sciencenet.cn/u/uestc2014xiaoyu
2019-Spring
SunData Group, http://www.sundatagroup.org/
School of Information and Software Engineering, UESTC
Copyright © 2019 by Xiaoyu Li.

Content (3H)
1.1 What's big data?
1.2 Overview of data analysis
1.3 Overview of data mining
1.4 Gathering requirements for different professional applications

Reference
Text Book
Data Mining: Concepts and Techniques (数据挖掘：概念与技术), Jiawei Han, Micheline Kamber and Jian Pei, China Machine Press (2012)
Reference Book
1) Tamhane, Ajit C., and Dorothy D. Dunlop. Statistics and Data Analysis: From Elementary to Intermediate. Prentice Hall, 1999.
2) Statistical Learning Methods (统计学习方法), Hang Li (李航)
Coursera
1) Machine Learning (Andrew Ng)
2) Data Mining (Stanford)
3) Statistical Thinking and Data Analysis (MIT)

Target
1 Know the characteristics of big data.
2 Understand how to gather data analysis requirements.
3 Know the differences and connections between data analysis and data mining.

Big Data
The big data era is coming.

1.1 What's big data?

(1) Background
[Figure: Global information storage capacity, in optimally compressed bytes.
1986: analog 2.6 exabytes, digital 0.02 exabytes (1% digital). 1993: 3% digital. 2000: 25% digital. 2002: "beginning of the digital age" (50% digital). 2007: analog 19 exabytes, digital 280 exabytes (94% digital).
Analog storage, 2007: paper, film, audiotape and vinyl: 6%; analog videotapes (VHS, etc.): 94%.
Digital storage, 2007: PC hard disks: 44.5%; DVD/Blu-ray: 22.8%; digital tape: 11.8%; computer servers and mainframes: 8.9%; CDs and minidisks: 6.8%; portable hard disks: 2.4%; portable media, flash drives: 2%; others: 1% (incl. chip cards, memory cards, floppy disks, mobile phones, PDAs, cameras/camcorders, video games).]
Source: Hilbert, M., López, P. (2011). The World's Technological Capacity to Store, Communicate, and Compute Information. Science, 332(6025), 60-65. http://www.martinhilbert.net/WorldinfoCapacity.html

(2) Development
[Figure: sources of big data. Panel labels: Media/Entertainment, Healthcare, DNA, Gene Sequence, fMRI/DTI, Industry, Sensor, Manufacture, E-commerce (Walmart: 2.5 PB/hour), Stock Data, Messenger, Watch.]
*Note: some pictures are derived from the internet.

(3) Data Stream
[Figure: sources of data streams. Panel labels: Internet, Industry, Surveillance, Sensor, Network Intrusion, Smart Phone, Mobile, Spam Filtering.]
*Note: some pictures are derived from the internet.

(4) Useful Applications
[Figure: company logos, including China Southern Power Grid (中国南方电网), State Grid (国家电网), Sinopec, CNPC (中国石油) and CNOOC (中海石油).]