University of Electronic Science and Technology of China (UESTC): Statistical Learning Theory and Applications, course teaching resources (lecture slides, English version), Lecture 07: Non-Linear Classification Model - Ensemble Methods

Contents

1 Basic principle
2 Multiple classifier combination
3 Bagging
4 Boosting

Statistical Learning Theory and Applications
Lecture 7: Non-Linear Classification Model - Ensemble Methods
Instructor: Quan Wen, SCSE@UESTC
Fall 2021

1. Basic principle

- In any application, several learning algorithms can be used.
- The No Free Lunch Theorem: no single learning algorithm always induces the most accurate learner in every domain.
- A common approach: try many algorithms and choose the one with the best cross-validation results.
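The "try many and pick the best by cross-validation" approach can be sketched in plain Python. This is only an illustration: the two toy learners (`fit_majority`, `fit_threshold`), the synthetic 1-D data, and the 5-fold split are assumptions made for the example and do not come from the lecture.

```python
import random
from statistics import mean

def fit_majority(xs, ys):
    """Hypothetical baseline: always predict the most frequent training label."""
    label = max(set(ys), key=ys.count)
    return lambda x: label

def fit_threshold(xs, ys):
    """Hypothetical learner: threshold halfway between the two class means."""
    m0 = mean(x for x, y in zip(xs, ys) if y == 0)
    m1 = mean(x for x, y in zip(xs, ys) if y == 1)
    t = (m0 + m1) / 2
    return (lambda x: int(x > t)) if m1 > m0 else (lambda x: int(x <= t))

def cv_accuracy(fit, xs, ys, k=5):
    """Mean accuracy over k folds; fold f holds out indices f, f+k, f+2k, ..."""
    n = len(xs)
    scores = []
    for f in range(k):
        test_idx = set(range(f, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        model = fit(train_x, train_y)
        scores.append(mean(int(model(xs[i]) == ys[i]) for i in test_idx))
    return mean(scores)

# Synthetic binary data (assumption): class 0 centred at 0, class 1 at 2.
rng = random.Random(0)
ys = [i % 2 for i in range(200)]
xs = [rng.gauss(2.0 * y, 1.0) for y in ys]

candidates = {"majority": fit_majority, "threshold": fit_threshold}
results = {name: cv_accuracy(fit, xs, ys) for name, fit in candidates.items()}
best_name = max(results, key=results.get)
print(results, "-> selected:", best_name)
```

Selecting a single winner this way discards the other candidates, which is the starting point for the question of combining learners instead.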

Rationale - 1

On the other hand:
- Each learning model comes with a set of assumptions, and thus a bias.
- Learning is an ill-posed problem (finite data): each model converges to a different solution and fails under different circumstances.
- Why not combine multiple learners intelligently, which may lead to improved results?

Why does it work?
- Suppose there are 25 base classifiers.
- Each classifier has error rate ε = 0.35.
- If the base classifiers are identical, and thus fully dependent, the ensemble misclassifies exactly the samples that the base classifiers predict incorrectly, so its error rate is also 0.35.
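The identical-classifier case can be checked with a small simulation. This is a sketch under stated assumptions: errors are drawn at random with probability 0.35 on 10,000 hypothetical samples, and the helper name `majority_wrong` is introduced only for this example.

```python
import random

def majority_wrong(wrong_flags):
    """True if more than half of the base classifiers are wrong on a sample."""
    return sum(wrong_flags) > len(wrong_flags) / 2

rng = random.Random(0)
n_samples, n_classifiers, eps = 10_000, 25, 0.35

# One base classifier: True marks a sample it misclassifies.
base_wrong = [rng.random() < eps for _ in range(n_samples)]

# 25 identical copies err on exactly the same samples,
# so the vote on each sample is unanimous.
ensemble_wrong = [majority_wrong([w] * n_classifiers) for w in base_wrong]

base_error = sum(base_wrong) / n_samples
ensemble_error = sum(ensemble_wrong) / n_samples
print(f"base error: {base_error:.4f}, ensemble error: {ensemble_error:.4f}")
```

Because every vote is unanimous, the two error rates coincide: voting over identical classifiers gains nothing.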

Rationale - 2

- Assume the classifiers are independent, i.e., their errors are uncorrelated. Then the ensemble makes a wrong prediction only if more than half of the base classifiers predict incorrectly.
- Probability that the ensemble classifier makes a wrong prediction:

$$\sum_{i=13}^{25} \binom{25}{i} \underbrace{\varepsilon^{i}}_{\text{wrong probability}} \underbrace{(1-\varepsilon)^{25-i}}_{\text{correct probability}} \approx 0.06$$

Note: this is the upper tail (i ≥ 13) of a binomial distribution with n = 25 and ε = 0.35.
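The binomial tail can be evaluated numerically as a sanity check. The function name `ensemble_error` is an assumption for this sketch; the calculation itself only uses the values stated on the slide (n = 25, ε = 0.35, at least 13 wrong votes).

```python
from math import comb

def ensemble_error(n: int, eps: float) -> float:
    """Probability that a majority of n independent base classifiers,
    each with error rate eps, are wrong at the same time."""
    first_majority = n // 2 + 1  # 13 wrong votes out of 25
    return sum(
        comb(n, i) * eps**i * (1 - eps) ** (n - i)
        for i in range(first_majority, n + 1)
    )

p = ensemble_error(25, 0.35)
print(round(p, 2))  # 0.06
```

Under the independence assumption, the majority vote is therefore far more accurate than any single base classifier with error 0.35.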

Works if ...

- The base classifiers should be independent.
- The base classifiers should do better than a classifier that performs random guessing (error < 0.5).
- In practice, it is hard to make base classifiers perfectly independent. Nevertheless, improvements have been observed in ensemble methods even when the base classifiers are slightly correlated.
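The error < 0.5 condition can be illustrated by sweeping the base error rate through the same binomial model. This sketch assumes 25 independent base classifiers on a two-class problem; the function name `majority_error` and the chosen error rates are illustrative only.

```python
from math import comb

def majority_error(n: int, eps: float) -> float:
    """P(more than half of n independent classifiers err), each with error eps."""
    first_majority = n // 2 + 1
    return sum(
        comb(n, i) * eps**i * (1 - eps) ** (n - i)
        for i in range(first_majority, n + 1)
    )

# Compare base error and ensemble error on both sides of 0.5.
sweep = {eps: majority_error(25, eps) for eps in (0.20, 0.35, 0.50, 0.65)}
for eps, err in sweep.items():
    print(f"base error {eps:.2f} -> ensemble error {err:.4f}")
```

Below 0.5 the vote reduces the error, at exactly 0.5 it leaves it unchanged, and above 0.5 it makes the ensemble worse than its members.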

Rationale

One important note: when we generate multiple base learners, we want them to be reasonably accurate but do not require them to be very accurate individually, so they are not, and need not be, optimized separately for best accuracy. The base learners are chosen not for their accuracy but for their simplicity.

2. Multiple classifier combination

Average results from different models.

Why?
- Better classification performance than individual classifiers
- More resilience to noise

Why not?
- Time consuming
- Overfitting
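Two simple ways of combining model outputs can be sketched as follows: averaging per-class scores and taking a majority vote over predicted labels. The helper names and the score values are made up for illustration; the slides do not prescribe a specific combination rule at this point.

```python
from collections import Counter
from statistics import mean

def average_scores(scores_per_model):
    """Average the per-class scores that several models give to one sample."""
    n_classes = len(scores_per_model[0])
    return [mean(m[c] for m in scores_per_model) for c in range(n_classes)]

def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]

# Hypothetical class scores from three models for a single sample.
scores = [
    [0.6, 0.4],  # model A favours class 0
    [0.3, 0.7],  # model B favours class 1
    [0.2, 0.8],  # model C favours class 1
]

avg = average_scores(scores)
predicted_by_average = max(range(len(avg)), key=avg.__getitem__)

# Each model's own label, then a vote over those labels.
labels = [max(range(len(s)), key=s.__getitem__) for s in scores]
predicted_by_vote = majority_vote(labels)

print(avg, predicted_by_average, predicted_by_vote)
```

Averaging keeps the confidence information in each score, while voting only uses the hard labels, so the two rules can disagree when one model is very confident.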
