University of Electronic Science and Technology of China (UESTC): Statistical Learning Theory and Applications, course teaching resources (lecture slides, English version), Lecture 07 Non-Linear Classification Model - Ensemble Methods

Statistical Learning Theory and Applications
Lecture 7: Non-Linear Classification Model - Ensemble Methods
Instructor: Quan Wen
SCSE@UESTC
Fall, 2021

Outline
1. Basic principle
2. Multiple classifier combination
3. Bagging
4. Boosting

1. Basic principle
- In any application, we can use several learning algorithms.
- The No Free Lunch Theorem: no single learning algorithm always induces the most accurate learner in every domain.
- Try many and choose the one with the best cross-validation results.
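The "try many and choose by cross-validation" step can be sketched as follows. This is a minimal illustration, not the lecture's procedure: the two learners (`fit_majority`, `fit_nearest_centroid`), the helper names, and the toy one-dimensional data are all assumptions made for the example.

```python
from collections import Counter


def k_fold_splits(n, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds over range(n)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


def fit_majority(xs, ys):
    """Hypothetical baseline: always predict the most frequent training label."""
    label = Counter(ys).most_common(1)[0][0]
    return lambda x: label


def fit_nearest_centroid(xs, ys):
    """Hypothetical learner: predict the class whose mean is closest to x."""
    centroids = {}
    for label in set(ys):
        points = [x for x, y in zip(xs, ys) if y == label]
        centroids[label] = sum(points) / len(points)
    return lambda x: min(centroids, key=lambda c: abs(x - centroids[c]))


def cv_accuracy(fit, xs, ys, k=5):
    """Fraction of samples predicted correctly when held out in k-fold CV."""
    correct = 0
    for train_idx, test_idx in k_fold_splits(len(xs), k):
        predict = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        correct += sum(predict(xs[i]) == ys[i] for i in test_idx)
    return correct / len(xs)


# Toy data (assumed): class 0 lies near 0, class 1 lies near 2.
xs = [0.1, 2.1, 0.3, 1.9, 0.2, 2.2, 0.0, 1.8, 0.4, 2.0]
ys = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

candidates = {"majority": fit_majority, "nearest_centroid": fit_nearest_centroid}
scores = {name: cv_accuracy(fit, xs, ys) for name, fit in candidates.items()}
best = max(scores, key=scores.get)
print(scores, "->", best)
```

Selecting on held-out accuracy rather than training accuracy is what makes the comparison between learners fair; the ensemble methods in this lecture replace "pick one winner" with "combine several".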

Rationale - 1
- On the other hand ...
  - Each learning model comes with a set of assumptions and thus a bias.
  - Learning is an ill-posed problem (finite data): each model converges to a different solution and fails under different circumstances.
  - Why not combine multiple learners intelligently, which may lead to improved results?
- Why does it work?
  - Suppose there are 25 base classifiers.
  - Each classifier has error rate ε = 0.35.
  - If the base classifiers are identical, and thus dependent, the ensemble misclassifies the same samples that the base classifiers predict incorrectly.

Rationale - 2
- Assume the classifiers are independent, i.e., their errors are uncorrelated. Then the ensemble makes a wrong prediction only if more than half of the base classifiers predict incorrectly.
- Probability that the ensemble classifier makes a wrong prediction:

$$\sum_{i=13}^{25} \binom{25}{i} \underbrace{\varepsilon^{i}}_{\text{wrong probability}} \underbrace{(1-\varepsilon)^{25-i}}_{\text{correct probability}} \approx 0.06$$

Note: the number of wrong base classifiers follows a binomial distribution with n = 25 and ε = 0.35; the sum runs over i ≥ 13.
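The binomial sum above can be checked numerically. The sketch below assumes exactly the slide's setting (independent base classifiers, majority vote); the function name `ensemble_error` is chosen here for illustration.

```python
from math import comb


def ensemble_error(n: int, eps: float) -> float:
    """Probability that more than half of n independent base classifiers,
    each with error rate eps, are wrong on a given sample."""
    first_majority = n // 2 + 1  # smallest count that is more than half of n
    return sum(
        comb(n, i) * eps**i * (1 - eps) ** (n - i)
        for i in range(first_majority, n + 1)
    )


# Slide setting: 25 base classifiers, each with error rate 0.35.
print(ensemble_error(25, 0.35))
```

For n = 25 the sum starts at i = 13, matching the lower limit in the formula, and the result is close to the 0.06 quoted on the slide, well below the individual error rate of 0.35.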

Works if ...
- The base classifiers are independent.
- The base classifiers do better than a classifier that guesses at random (error < 0.5).
- In practice, it is hard to make base classifiers perfectly independent. Nevertheless, improvements have been observed in ensemble methods when the classifiers are slightly correlated.
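The effect of correlation between base classifiers can be explored with a small simulation. The correlation model here is an assumption made for illustration and is not from the lecture: with probability `rho` a classifier copies a shared error event, otherwise it errs independently, so each classifier keeps the marginal error rate `eps`. Setting `rho = 0` gives independent classifiers and `rho = 1` gives identical ones.

```python
import random


def simulate_ensemble_error(n=25, eps=0.35, rho=0.0, trials=20000, seed=0):
    """Monte Carlo estimate of the majority-vote error rate.

    Assumed model: per sample, a shared error event occurs with probability
    eps; each classifier copies it with probability rho, and otherwise makes
    its own independent error with probability eps.
    """
    rng = random.Random(seed)
    wrong = 0
    for _ in range(trials):
        shared_error = rng.random() < eps
        errors = 0
        for _ in range(n):
            if rng.random() < rho:
                errors += shared_error  # correlated: reuse the shared outcome
            else:
                errors += rng.random() < eps  # independent error
        if errors > n / 2:  # more than half of the classifiers are wrong
            wrong += 1
    return wrong / trials


for rho in (0.0, 0.2, 1.0):
    print(rho, simulate_ensemble_error(rho=rho))
```

Under this assumed model the estimated ensemble error grows with `rho`: it stays well below the base error rate when the classifiers are only slightly correlated and approaches the base error rate of 0.35 when they are identical.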

Rationale
One important note: when we generate multiple base learners, we want them to be reasonably accurate but do not require them to be very accurate individually, so they are not, and need not be, optimized separately for best accuracy. The base learners are chosen not for their accuracy but for their simplicity.

2. Multiple classifier combination
- Average results from different models
- Why?
  - Better classification performance than individual classifiers
  - More resilience to noise
- Why not?
  - Time consuming
  - Overfitting
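Averaging the results of different models can be sketched in two common ways: averaging predicted probabilities, or taking a majority vote over predicted labels. The helper names and the probability values below are hypothetical, standing in for the outputs of three already-trained binary classifiers.

```python
from collections import Counter


def average_probabilities(model_probs):
    """Average per-sample positive-class probabilities across models."""
    n_models = len(model_probs)
    return [sum(sample) / n_models for sample in zip(*model_probs)]


def majority_vote(model_labels):
    """Return, for each sample, the label predicted by most models."""
    return [Counter(sample).most_common(1)[0][0] for sample in zip(*model_labels)]


# Assumed outputs: one row per model, one column per sample.
probs = [
    [0.9, 0.2, 0.6],
    [0.7, 0.4, 0.4],
    [0.2, 0.3, 0.8],
]

averaged = average_probabilities(probs)
soft_labels = [int(p > 0.5) for p in averaged]
hard_labels = majority_vote([[int(p > 0.5) for p in row] for row in probs])
print(averaged, soft_labels, hard_labels)
```

With an odd number of binary classifiers the vote cannot tie; for other settings a tie-breaking rule would have to be chosen. Both schemes require running every model on every sample, which is the "time consuming" cost listed above.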