
University of Electronic Science and Technology of China (UESTC): Data Analysis and Data Mining, course teaching resources (lecture slides), Lecture 01: Overview of Data Analysis and Data Mining (Xiaoyu Li)

Document information
Resource category: library document
Format: PDF
Pages: 57
File size: 4.92 MB
Summary
1.1 What's big data?
1.2 Overview of data analysis
1.3 Overview of data mining
1.4 Gathering requirements for different professional applications

Lecture 1: Overview of Data Analysis and Data Mining
Dr. Xiaoyu Li (李晓瑜)
Email: xiaoyuuestc@uestc.edu.cn
http://blog.sciencenet.cn/u/uestc2014xiaoyu
Spring 2019
SunData Group, http://www.sundatagroup.org/
School of Information and Software Engineering, UESTC
Copyright © 2019 by Xiaoyu Li.

Content (3H)
1.1 What's big data?
1.2 Overview of data analysis
1.3 Overview of data mining
1.4 Gathering requirements for different professional applications

Reference
Textbook
- Jiawei Han, Micheline Kamber, and Jian Pei. Data Mining: Concepts and Techniques (数据挖掘:概念与技术). China Machine Press, 2012.
Reference books
1) Ajit C. Tamhane and Dorothy D. Dunlop. Statistics and Data Analysis: From Elementary to Intermediate. Prentice Hall, 1999.
2) Hang Li (李航). Statistical Learning Methods (统计学习方法).
Coursera
1) Machine Learning (Andrew Ng)
2) Data Mining (Stanford)
3) Statistical Thinking and Data Analysis (MIT)

Target
1. Know the characteristics of big data.
2. Understand how to gather data analysis requirements.
3. Know the differences and connections between data analysis and data mining.

Big Data
The big data era is coming.

1.1 What's big data?

(1) Background
[Figure: Global Information Storage Capacity, in optimally compressed bytes]
- 1986: 2.6 exabytes analog, 0.02 exabytes digital (1% digital)
- 1993: 3% digital
- 2000: 25% digital
- 2002: "beginning of the digital age" (50% digital)
- 2007: 19 exabytes analog, 280 exabytes digital (94% digital)
- Analog storage, 2007: analog videotapes (VHS, etc.) 94%; paper, film, audiotape and vinyl 6%
- Digital storage, 2007: PC hard disks 44.5% (123 billion gigabytes); DVD/Blu-ray 22.8%; digital tape 11.8%; computer servers and mainframes 8.9%; CDs and minidisks 6.8%; portable hard disks 2.4%; portable media and flash drives 2%; others 1% (incl. chip cards, memory cards, floppy disks, mobile phones, PDAs, cameras/camcorders, video games)
Source: Hilbert, M., López, P. (2011). The World's Technological Capacity to Store, Communicate, and Compute Information. Science, 332(6025), 60-65. http://www.martinhilbert.net/WorldinfoCapacity.html
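The "% digital" values in the storage-capacity figure can be checked against the exabyte totals it reports. The sketch below is only an illustrative calculation, not part of the original slides: it assumes that the 2.6 (analog) and 0.02 (digital) exabyte figures belong to 1986 and the 19 (analog) and 280 (digital) exabyte figures to 2007, which is how the flattened figure text reads. The names `STORAGE_EB` and `digital_share` are made up for this example.

```python
# Storage totals in exabytes, as read from the figure (Hilbert & López, 2011).
# Assumption: the year-to-value pairing below follows the flattened slide text.
STORAGE_EB = {
    1986: {"analog": 2.6, "digital": 0.02},
    2007: {"analog": 19.0, "digital": 280.0},
}


def digital_share(analog_eb: float, digital_eb: float) -> float:
    """Return the digital fraction of total (analog + digital) capacity."""
    return digital_eb / (analog_eb + digital_eb)


for year, eb in STORAGE_EB.items():
    share = digital_share(eb["analog"], eb["digital"])
    print(f"{year}: {share:.1%} digital")
# prints:
# 1986: 0.8% digital
# 2007: 93.6% digital
```

Rounded to whole percentages, these give the 1% and 94% shown in the figure for 1986 and 2007.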

(2) Development
[Figure: sources of big data. Labels: Media/Entertainment, Healthcare, DNA, Gene Sequence, fMRI/DTI, Messenger, Watch, Sensor, Manufacture, Industry, E-commerce, Walmart: 2.5 PB/hour, Stock Data]
*Note: some pictures are taken from the internet.

(3) Data Stream
[Figure: sources of data streams. Labels: Internet, Industry, Surveillance, Sensor, Network Intrusion, Smart Phone, Mobile, Spam Filtering]
*Note: some pictures are taken from the internet.

(4) Useful Applications
[Figure: company logos: China Southern Power Grid (中国南方电网), State Grid (国家电网), Sinopec, PetroChina (中国石油), CNOOC (中海石油)]
