University of Electronic Science and Technology of China (UESTC): Data Analysis and Data Mining, course teaching resources (lecture slides), Lecture 02 Raw Data Analysis and Pre-processing (2.1-2.4)

Lecture 2 Raw Data Analysis and Pre-processing
Dr. 李晓瑜 (Xiaoyu Li)
Email: xiaoyuuestc@uestc.edu.cn
http://blog.sciencenet.cn/u/uestc2014xiaoyu
2019-Spring
SunData Group, http://www.sundatagroup.org/
School of Information and Software Engineering, UESTC
Copyright © 2019 by Xiaoyu Li.

Content (6H)
- 2.1 Overview of data types
- 2.2 Review of Data pre-processing tools and platforms
- 2.3 Clean, storage and management of raw data
- 2.4 Collections of data analysis and data mining

Target
- Understand the workflow for cleaning and pre-processing raw data.
- Know some useful data processing tools and platforms.

Data Science Process
[Figure: data science process diagram with stages labeled Raw Data Collected, Data Is Processed, Clean Dataset, Exploratory Data Analysis, Models and Algorithms, Data Product, Communicate / Visualize / Report, Make Decisions]

2.1 Overview of data types
What is data? What are data types?
- Date: 1980-1-1
- Time: 20:08:12
- Age: 65 years
- Colors: Red, Black, Blue, Green, White, …
- Name: Xiaoyu Li, …
- Symbols: %, &, #, *, …
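The example values above can be written down in a programming language to see which type each one naturally takes. The following is a minimal sketch in Python; the variable names are illustrative assumptions, and other representations (for example, storing the date as a string) are equally possible.

```python
from datetime import date, time

# Values come from the slide; variable names are hypothetical.
birth_date = date(1980, 1, 1)                          # Date: 1980-1-1
clock_time = time(20, 8, 12)                           # Time: 20:08:12
age_years = 65                                         # Age: 65 years
colors = ["Red", "Black", "Blue", "Green", "White"]    # Colors
name = "Xiaoyu Li"                                     # Name
symbols = ["%", "&", "#", "*"]                         # Symbols

examples = {
    "Date": birth_date,
    "Time": clock_time,
    "Age": age_years,
    "Colors": colors,
    "Name": name,
    "Symbols": symbols,
}

# Print each value next to the name of its Python type.
for label, value in examples.items():
    print(f"{label}: {value!r} -> {type(value).__name__}")
```

Each kind of value ends up with a different type (date, time, int, list, str), which is the point the slide's two questions lead to.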

(1) Basic Terms
- Data: a set of values of qualitative or quantitative variables; restated, pieces of data are individual pieces of information.
- Dataset: a collection of data that lists values for each of the variables.
- Data object: a location in memory having a value and possibly referenced by an identifier.
- Related terms: points, vectors, patterns, samples, observations, …

(2) General definition of data types
In computer science and computer programming, a data type or simply type is a classification identifying one of various types of data, such as real, integer or Boolean, that determines:
- the possible values for that type;
- the operations that can be done on values of that type;
- the meaning of the data;
- the way values of that type can be stored.
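The four aspects in this definition (possible values, operations, meaning, storage) can each be observed in a few lines of Python. This is only an illustrative sketch: the variable names are assumptions, and the `struct` module is used here simply because it exposes fixed-size binary layouts.

```python
import struct

# Possible values: a Boolean has exactly two.
bool_values = {bool(x) for x in (0, 1, 2, -1)}

# Operations: "+" adds integers but concatenates strings.
int_sum = 2 + 3
str_sum = "2" + "3"

# Mixing the two types is rejected rather than silently converted.
try:
    2 + "3"
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Meaning: the same four bytes read as a float or as an integer.
raw = struct.pack("<f", 1.0)
as_float = struct.unpack("<f", raw)[0]
as_int = struct.unpack("<i", raw)[0]

# Storage: a 32-bit integer takes 4 bytes, a double-precision float 8.
size_int32 = struct.calcsize("<i")
size_double = struct.calcsize("<d")

print(bool_values, int_sum, str_sum, mixed_ok)
print(as_float, as_int, size_int32, size_double)
```

The bytes in `raw` never change; only the declared type changes how they are interpreted, which is why the type is part of the data's meaning.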

(3) Common data types
- Integers
- Booleans
- Characters
- Floating-point numbers
- Alphanumeric strings

Statistics                           Programming
real-valued (interval scale)         floating-point
real-valued (ratio scale)            floating-point
count data (usually non-negative)    integer
binary data                          Boolean
categorical data                     enumerated type
random vector                        list or array
random matrix                        two-dimensional array
random tree                          tree
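As a rough illustration of the correspondence between statistical and programming types, each row can be paired with a Python type. All values and names below (the `Color` enumeration, `temperature_c`, and so on) are made up for the example, and plain lists stand in for arrays; a numerical library would usually be preferred for real vectors and matrices.

```python
from enum import Enum


class Color(Enum):
    """Hypothetical enumerated type for a categorical variable."""
    RED = 1
    BLACK = 2
    BLUE = 3


temperature_c = 21.5                  # real-valued (interval scale) -> float
height_m = 1.75                       # real-valued (ratio scale)    -> float
visit_count = 3                       # count data                   -> int
is_member = True                      # binary data                  -> bool
favorite = Color.BLUE                 # categorical data             -> enumerated type
vector = [0.2, 0.5, 0.3]              # random vector                -> list
matrix = [[1.0, 0.0], [0.0, 1.0]]     # random matrix                -> two-dimensional list
tree = {"root": {"left": {}, "right": {}}}  # random tree            -> nested dict as a tree

for value in (temperature_c, height_m, visit_count, is_member,
              favorite, vector, matrix, tree):
    print(type(value).__name__, value)
```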

(4) Classes of data types
- Primitive data types
- Composite types
- Other types
- Abstract data types
- Utility types

1) Primitive data types
- Character (character, char)
- Integer (integer, int, short, long, byte) with a variety of precisions
- Floating-point number (float, double, real, double precision)
- Fixed-point number (fixed) with a variety of precisions and a programmer-selected scale
- Boolean, logical values true and false
- Reference (also called a pointer or handle), a small value referring to another object's address in memory, possibly a much larger one

More sophisticated types which can be built-in include:
- Tuple in ML, Python
- List in Lisp
- Complex number in Fortran, C (C99), Lisp, Python, Perl 6, D
- Rational number in Lisp, Perl 6
- Associative array in various guises, in Lisp, Perl, Python, Lua, D
- First-class function, closure, continuation in languages that support functional programming such as Lisp, ML, Perl 6, D and C# 3.0
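Several of the types listed above can be tried directly in Python. The sketch below is illustrative only: the names and values are assumptions, and the rational number uses `fractions.Fraction` from the standard library, since Python (unlike Lisp or Perl 6) has no built-in rational literal.

```python
from fractions import Fraction

# Tuple: a fixed-size, immutable group of values.
point = (3, 4)

# Complex number: built into the language.
z = complex(3, 4)
magnitude = abs(z)

# Rational number: exact arithmetic without rounding error.
ratio = Fraction(1, 3) + Fraction(1, 6)

# Associative array: Python's dict maps keys to values.
color_counts = {"Red": 2, "Blue": 1}
color_counts["Green"] = 5

# Reference: two names referring to the same list object in memory.
original = [1, 2]
alias = original
alias.append(3)
same_object = alias is original


# First-class function and closure: the inner function keeps access
# to `count` after make_counter has returned.
def make_counter():
    count = 0

    def increment():
        nonlocal count
        count += 1
        return count

    return increment


counter = make_counter()
first = counter()
second = counter()

print(point, magnitude, ratio, color_counts, original, same_object, first, second)
```

Appending through `alias` also changes `original`, because both names hold a reference to one object rather than separate copies.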