Mathematical Statistics course teaching resources (reference materials): Large sample properties of MLE 02

Section 8.2. Asymptotic normality

We assume that $\mathbf{X}_n = (X_1,\ldots,X_n)$, where the $X_i$'s are i.i.d. with common density $p(x;\theta_0) \in \mathcal{P} = \{p(x;\theta) : \theta \in \Theta\}$. We assume that $\theta_0$ is identified in the sense that if $\theta \neq \theta_0$ and $\theta \in \Theta$, then $p(x;\theta) \neq p(x;\theta_0)$ with respect to the dominating measure $\mu$.

In order to prove asymptotic normality, we will need certain regularity conditions. Some of these were encountered in the proof of consistency, but we will need some additional assumptions.

Regularity Conditions

i. $\theta_0$ lies in the interior of $\Theta$, which is assumed to be a compact subset of $\mathbb{R}^k$.
ii. $\log p(x;\theta)$ is continuous at each $\theta \in \Theta$ for all $x \in \mathcal{X}$ (a.e. will suffice).
iii. $|\log p(x;\theta)| \le d(x)$ for all $\theta \in \Theta$, with $E_{\theta_0}[d(X)] < \infty$.
iv. $p(x;\theta) > 0$ for all $\theta$ in a neighborhood, $N$, of $\theta_0$.
v. $\left\| \partial p(x;\theta)/\partial\theta \right\| \le e(x)$ for all $\theta \in N$, with $\int e(x)\,d\mu(x) < \infty$.

vi. Defining the score vector
$$\psi(x;\theta) = \bigl( \partial \log p(x;\theta)/\partial\theta_1, \ldots, \partial \log p(x;\theta)/\partial\theta_k \bigr)^{\top},$$
we assume that $I(\theta_0) = E_{\theta_0}\bigl[\psi(X;\theta_0)\psi(X;\theta_0)^{\top}\bigr]$ exists and is non-singular.
vii. $\left\| \partial^2 \log p(x;\theta)/\partial\theta\,\partial\theta^{\top} \right\| \le f(x)$ for all $\theta \in N$, with $E_{\theta_0}[f(X)] < \infty$.
viii. $\left\| \partial^2 p(x;\theta)/\partial\theta\,\partial\theta^{\top} \right\| \le g(x)$ for all $\theta \in N$, with $\int g(x)\,d\mu(x) < \infty$.

Theorem 8.6: If these 8 regularity conditions hold, then
$$\sqrt{n}\,\bigl(\hat\theta(\mathbf{X}_n) - \theta_0\bigr) \xrightarrow{D(\theta_0)} N\bigl(0, I^{-1}(\theta_0)\bigr).$$
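To see Theorem 8.6 at work numerically, here is a small Monte Carlo sketch (not from the original notes; the exponential model is an illustrative choice). For $p(x;\theta) = \theta e^{-\theta x}$, the MLE is $\hat\theta = 1/\bar{x}$ and $I(\theta) = 1/\theta^2$, so $\sqrt{n}(\hat\theta - \theta_0)$ should be approximately $N(0, \theta_0^2)$:

```python
import numpy as np

# Illustrative check of Theorem 8.6 for the exponential model
# p(x; theta) = theta * exp(-theta * x), where the MLE is 1/xbar
# and the Fisher information is I(theta) = 1/theta^2.
rng = np.random.default_rng(0)
theta0, n, reps = 2.0, 2000, 2000

draws = rng.exponential(scale=1.0 / theta0, size=(reps, n))
mle = 1.0 / draws.mean(axis=1)            # MLE in each replication
z = np.sqrt(n) * (mle - theta0)           # sqrt(n) * (theta_hat - theta_0)

# Mean should be near 0 and variance near I^{-1}(theta0) = theta0^2 = 4.
print(z.mean(), z.var())
```

The empirical variance of $z$ settles near $\theta_0^2 = 4$, matching $I^{-1}(\theta_0)$ from the theorem.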

Proof: Note that conditions i.–iii. guarantee that the MLE is consistent. Since $\theta_0$ is assumed to lie in the interior of $\Theta$, we know that with sufficiently large probability the MLE will lie in $N$ and cannot be on the boundary. This implies that the maximum is also a local maximum, so that $\partial Q(\hat\theta(\mathbf{X}_n); \mathbf{X}_n)/\partial\theta = 0$, or
$$\frac{1}{n}\sum_{i=1}^n \psi\bigl(X_i; \hat\theta(\mathbf{X}_n)\bigr) = 0.$$
That is, the MLE is a solution of the score equations.

By the mean value theorem, applied to each element of the score vector, we have that
$$0 = \frac{1}{\sqrt{n}}\sum_{i=1}^n \psi\bigl(X_i;\hat\theta(\mathbf{X}_n)\bigr) = \frac{1}{\sqrt{n}}\sum_{i=1}^n \psi(X_i;\theta_0) + \bigl\{-J_n^*(\mathbf{X}_n)\bigr\}\,\sqrt{n}\,\bigl(\hat\theta(\mathbf{X}_n) - \theta_0\bigr).$$
Note that $J_n^*(\mathbf{X}_n)$ is a $k \times k$ random matrix whose $j$th row is the $j$th row of $J_n$ evaluated at $\theta_{jn}^*(\mathbf{X}_n)$, an intermediate value between $\hat\theta(\mathbf{X}_n)$ and $\theta_0$. The value $\theta_{jn}^*(\mathbf{X}_n)$ may be different from row to row, but it will be consistent for $\theta_0$.
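As a concrete instance of "the MLE solves the score equations" (an illustrative example, not from the original notes): for the exponential density $p(x;\theta) = \theta e^{-\theta x}$, the score is $\psi(x;\theta) = 1/\theta - x$, and the score equation $\frac{1}{n}\sum_i \psi(X_i;\theta) = 0$ is solved by $\hat\theta = 1/\bar{x}$:

```python
import numpy as np

# Illustrative example: exponential density p(x; theta) = theta * exp(-theta * x)
# has score psi(x; theta) = 1/theta - x, so the sample score equation
# (1/n) * sum_i (1/theta - X_i) = 0 is solved by theta_hat = 1/xbar.
rng = np.random.default_rng(1)
sample = rng.exponential(scale=0.5, size=1000)   # true theta_0 = 2

theta_hat = 1.0 / sample.mean()
mean_score = (1.0 / theta_hat - sample).mean()   # empirical mean score at the MLE

print(theta_hat)      # close to the true value 2
print(mean_score)     # zero up to floating-point rounding
```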

We will establish two facts:

F1: $\displaystyle \frac{1}{\sqrt{n}}\sum_{i=1}^n \psi(X_i;\theta_0) \xrightarrow{D(\theta_0)} N\bigl(0, I(\theta_0)\bigr)$

F2: $\displaystyle J_n^*(\mathbf{X}_n) \xrightarrow{P(\theta_0)} I(\theta_0)$

By assumption vi., we know that $I(\theta_0)$ is non-singular. The inversion of a non-singular matrix is a continuous function of the matrix. Since $J_n^*(\mathbf{X}_n) \xrightarrow{P} I(\theta_0)$, we know that $\{J_n^*(\mathbf{X}_n)\}^{-1} \xrightarrow{P} I(\theta_0)^{-1}$. This also means that, with sufficiently large probability as $n$ gets large, $J_n^*(\mathbf{X}_n)$ is invertible. Therefore, we know that
$$\sqrt{n}\,\bigl(\hat\theta(\mathbf{X}_n) - \theta_0\bigr) = \bigl\{J_n^*(\mathbf{X}_n)\bigr\}^{-1} \frac{1}{\sqrt{n}}\sum_{i=1}^n \psi(X_i;\theta_0).$$

We then use Slutsky's theorem to conclude that
$$\sqrt{n}\,\bigl(\hat\theta(\mathbf{X}_n) - \theta_0\bigr) \xrightarrow{D} N\bigl(0, I(\theta_0)^{-1}\bigr).$$

Establishing F1

The random vectors $\psi(X_1;\theta_0), \ldots, \psi(X_n;\theta_0)$ are i.i.d. We need to show that they have mean zero. Then $I(\theta_0)$ will be the covariance matrix of $\psi(X;\theta_0)$, and an application of the multivariate central limit theorem for i.i.d. random vectors gives the desired result.

We will show something stronger, namely that $E_\theta[\psi(X;\theta)] = 0$ for all $\theta \in N$. Condition v. guarantees that we can interchange integration and differentiation. Consider the case where $k = 1$. We know that $1 = \int p(x;\theta)\,d\mu(x)$ for all $\theta \in N$. This implies that $0 = \frac{d}{d\theta}\int p(x;\theta)\,d\mu(x)$. Let's show that $\frac{d}{d\theta}\int p(x;\theta)\,d\mu(x) = \int \frac{d}{d\theta}p(x;\theta)\,d\mu(x)$. Choose a sequence $\theta_n \in N$ such that $\theta_n \to \theta$. Then, by the definition of a derivative, we know that
$$\frac{dp(x;\theta)}{d\theta} = \lim_{n\to\infty} \frac{p(x;\theta_n) - p(x;\theta)}{\theta_n - \theta} \quad \text{for all } x \in \mathcal{X}.$$

By the mean value theorem, we know that
$$p(x;\theta_n) = p(x;\theta) + \frac{dp(x;\theta_n^*)}{d\theta}(\theta_n - \theta),$$
where $\theta_n^*$ lies between $\theta$ and $\theta_n$, so that $\theta_n^* \in N$. This implies that
$$\left| \frac{p(x;\theta_n) - p(x;\theta)}{\theta_n - \theta} \right| = \left| \frac{dp(x;\theta_n^*)}{d\theta} \right| \le e(x).$$
Since $e(x)$ is integrable, we can employ the dominated convergence theorem. This says that
$$0 = \frac{d}{d\theta}\int p(x;\theta)\,d\mu(x) = \lim_{n\to\infty} \int \frac{p(x;\theta_n) - p(x;\theta)}{\theta_n - \theta}\,d\mu(x) = \int \lim_{n\to\infty} \frac{p(x;\theta_n) - p(x;\theta)}{\theta_n - \theta}\,d\mu(x) = \int \frac{dp(x;\theta)}{d\theta}\,d\mu(x).$$
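The interchange conclusion $\int \frac{dp(x;\theta)}{d\theta}\,d\mu(x) = 0$ can be checked symbolically for a specific family. The sketch below (an illustrative addition, using the exponential density as the example) verifies it with SymPy:

```python
import sympy as sp

# Illustrative symbolic check: for p(x; theta) = theta * exp(-theta * x)
# on (0, infinity), integrating dp/dtheta over x should give 0,
# since the total mass integral(p) = 1 is constant in theta.
x, theta = sp.symbols('x theta', positive=True)
p = theta * sp.exp(-theta * x)

lhs = sp.integrate(sp.diff(p, theta), (x, 0, sp.oo))
print(lhs)   # simplifies to 0
```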

This can be generalized to partial derivatives, which can then be used to show formally that $E_\theta[\psi(X;\theta)] = 0$ for $\theta \in N$. We know that $\int p(x;\theta)\,d\mu(x) = 1$. This implies that $\frac{\partial}{\partial\theta_j}\int p(x;\theta)\,d\mu(x) = 0$. By dominated convergence, we can interchange differentiation and integration, so that $\int \frac{\partial p(x;\theta)}{\partial\theta_j}\,d\mu(x) = 0$. Then we know that
$$\int \frac{\partial p(x;\theta)/\partial\theta_j}{p(x;\theta)}\,p(x;\theta)\,d\mu(x) = 0.$$
We can divide by $p(x;\theta)$ since it is greater than zero for all $\theta \in N$. This implies that $E_\theta[\psi_j(X;\theta)] = 0$.
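The vector version $E_\theta[\psi(X;\theta)] = 0$ can be checked by Monte Carlo for a genuinely multi-parameter family. The sketch below (an illustrative addition; the $N(\mu,\sigma^2)$ family is my choice of example) estimates both components of the score mean:

```python
import numpy as np

# Illustrative Monte Carlo check that E_theta[psi(X; theta)] = 0 for the
# two-parameter family N(mu, sigma^2), whose score vector is
#   psi(x) = ( (x - mu)/sigma^2,  (x - mu)^2/(2 sigma^4) - 1/(2 sigma^2) ).
rng = np.random.default_rng(2)
mu, sigma = 1.0, 2.0
x = rng.normal(mu, sigma, size=200_000)

psi = np.column_stack([
    (x - mu) / sigma**2,
    (x - mu)**2 / (2 * sigma**4) - 1 / (2 * sigma**2),
])
print(psi.mean(axis=0))   # both components should be near 0
```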

Establishing F2

First, we shall study the large-sample behavior of the matrix of second partial derivatives of the log-likelihood. Define
$$J_n(\theta) = -\frac{1}{n}\sum_{i=1}^n \frac{\partial^2 \log p(X_i;\theta)}{\partial\theta\,\partial\theta^{\top}}.$$
This is a $k \times k$ random matrix.
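A numerical glimpse of F2 (an illustrative addition; the Bernoulli model is my choice of example): for Bernoulli($\theta$), $\partial^2 \log p / \partial\theta^2 = -x/\theta^2 - (1-x)/(1-\theta)^2$, so $J_n(\theta) = \bar{x}/\theta^2 + (1-\bar{x})/(1-\theta)^2$, which converges in probability to $I(\theta) = 1/\bigl(\theta(1-\theta)\bigr)$:

```python
import numpy as np

# Illustrative check of J_n(theta_0) -> I(theta_0) for Bernoulli(theta):
#   J_n(theta) = xbar/theta^2 + (1 - xbar)/(1 - theta)^2
#   I(theta)   = 1 / (theta * (1 - theta))
rng = np.random.default_rng(3)
theta0, n = 0.3, 100_000
x = rng.binomial(1, theta0, size=n)

xbar = x.mean()
J_n = xbar / theta0**2 + (1 - xbar) / (1 - theta0)**2
I_theta0 = 1 / (theta0 * (1 - theta0))
print(J_n, I_theta0)   # the two values should be close
```

Here $J_n$ is random through $\bar{x}$, but the law of large numbers pulls it to the deterministic limit $I(\theta_0)$, which is exactly the mechanism behind F2.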