Matrix Computations course resource (reference paper): The Decompositional Approach to Matrix Computation (Stewart, 2000)



THE DECOMPOSITIONAL APPROACH TO MATRIX COMPUTATION

G.W. Stewart, University of Maryland

The introduction of matrix decomposition into numerical linear algebra revolutionized matrix computations. This article outlines the decompositional approach, comments on its history, and surveys the six most widely used decompositions.

In 1951, Paul S. Dwyer published Linear Computations, perhaps the first book devoted entirely to numerical linear algebra [1]. Digital computing was in its infancy, and Dwyer focused on computation with mechanical calculators. Nonetheless, the book was state of the art. Figure 1 reproduces a page of the book dealing with Gaussian elimination. In 1954, Alston S. Householder published Principles of Numerical Analysis [2], one of the first modern treatments of high-speed digital computation. Figure 2 reproduces a page from this book, also dealing with Gaussian elimination.

Figure 1. This page from Linear Computations shows that Paul Dwyer's approach begins with a system of scalar equations. Courtesy of John Wiley & Sons.

Figure 2. On this page from Principles of Numerical Analysis, Alston Householder uses partitioned matrices and LU decomposition. Courtesy of McGraw-Hill.

The contrast between these two pages is striking. The most obvious difference is that Dwyer used scalar equations whereas Householder used partitioned matrices. But a deeper difference is that while Dwyer started from a system of equations, Householder worked with a (block) LU decomposition: the factorization of a matrix into the product of lower and upper triangular matrices.

Generally speaking, a decomposition is a factorization of a matrix into simpler factors. The underlying principle of the decompositional approach to matrix computation is that it is not the business of the matrix algorithmists to solve particular problems but to construct computational platforms from which a variety of problems can be solved. This approach, which was in full swing by the mid-1960s, has revolutionized matrix computation.

To illustrate the nature of the decompositional approach and its consequences, I begin with a discussion of the Cholesky decomposition and the solution of linear systems: essentially the decomposition that Gauss computed in his elimination algorithm.


This article also provides a tour of the five other major matrix decompositions, including the pivoted LU decomposition, the QR decomposition, the spectral decomposition, the Schur decomposition, and the singular value decomposition.

A disclaimer is in order. This article deals primarily with dense matrix computations. Although the decompositional approach has greatly influenced iterative and direct methods for sparse matrices, the ways in which it has affected them are different from what I describe here.

The Cholesky decomposition and linear systems

We can use the decompositional approach to solve the system

    Ax = b    (1)

where A is positive definite. It is well known that A can be factored in the form

    A = R^T R    (2)

where R is an upper triangular matrix. The factorization is called the Cholesky decomposition of A.

The factorization in Equation 2 can be used to solve linear systems as follows. If we write the system in the form R^T Rx = b and set y = R^{-T} b, then x is the solution of the triangular system Rx = y. However, by definition y is the solution of the system R^T y = b. Consequently, we have reduced the problem to the solution of two triangular systems, as illustrated in the following algorithm:

    1. Solve the system R^T y = b.
    2. Solve the system Rx = y.    (3)

Because triangular systems are easy to solve, the introduction of the Cholesky decomposition has transformed our problem into one for which the solution can be readily computed.
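As a concrete illustration of the algorithm in Equation 3, here is a minimal sketch (not from the original article; it assumes NumPy and SciPy are available and that A really is positive definite):

```python
import numpy as np
from scipy.linalg import solve_triangular

def cholesky_solve(A, b):
    """Solve Ax = b for positive definite A via the Cholesky decomposition.

    NumPy returns a lower triangular L with A = L L^T, so R = L^T in the
    article's notation (A = R^T R).
    """
    L = np.linalg.cholesky(A)                     # A = L L^T, L lower triangular
    y = solve_triangular(L, b, lower=True)        # step 1: solve R^T y = b
    x = solve_triangular(L.T, y, lower=False)     # step 2: solve R x = y
    return x

# A small positive definite example (hypothetical data).
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5 * np.eye(5)
b = rng.standard_normal(5)
x = cholesky_solve(A, b)
print(np.allclose(A @ x, b))                      # True, up to rounding error
```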


We can use such decompositions to solve more than one problem. For example, the following algorithm solves the system A^T x = b:

    1. Solve the system Ry = b.
    2. Solve the system R^T x = y.    (4)

Again, in many statistical applications we want to compute the quantity ρ = x^T A^{-1} x. Because

    x^T A^{-1} x = x^T (R^T R)^{-1} x = (R^{-T} x)^T (R^{-T} x)    (5)

we can compute ρ as follows:

    1. Solve the system R^T y = x.
    2. ρ = y^T y.    (6)

The decompositional approach can also save computation. For example, the Cholesky decomposition requires O(n^3) operations to compute, whereas the solution of triangular systems requires only O(n^2) operations. Thus, if you recognize that a Cholesky decomposition is being used to solve a system at one point in a computation, you can reuse the decomposition to do the same thing later without having to recompute it.

Historically, Gaussian elimination and its variants (including Cholesky's algorithm) have solved the system in Equation 1 by reducing it to an equivalent triangular system. This mixes the computation of the decomposition with the solution of the first triangular system in Equation 3, and it is not obvious how to reuse the elimination when a new right-hand side presents itself. A naive programmer is in danger of performing the reduction from the beginning, thus repeating the lion's share of the work. On the other hand, a program that knows a decomposition is in the background can reuse it as needed.

(By the way, the problem of recomputing decompositions has not gone away. Some matrix packages hide the fact that they repeatedly compute a decomposition by providing drivers that solve linear systems with a call to a single routine. If the program calls the routine again with the same matrix, it recomputes the decomposition, unnecessarily. Interpretive matrix systems such as Matlab and Mathematica have the same problem: they hide decompositions behind operators and function calls. Such are the consequences of not stressing the decompositional approach to the consumers of matrix algorithms.)

Another advantage of working with decompositions is unity. There are different ways of organizing the operations involved in solving linear systems by Gaussian elimination in general and Cholesky's algorithm in particular. Figure 3 illustrates some of these arrangements: a white area contains elements from the original matrix, a dark area contains the factors, a light gray area contains partially processed elements, and the boundary strips contain elements about to be processed. Most of these variants were originally presented in scalar form as new algorithms. Once you recognize that a decomposition is involved, it is easy to see the essential unity of the various algorithms.

Figure 3. These varieties of Gaussian elimination are all numerically equivalent.

All the variants in Figure 3 are numerically equivalent. This means that one rounding-error analysis serves all. For example, the Cholesky algorithm, in whatever guise, is backward stable: the computed factor R satisfies

    A + E = R^T R    (7)

where E is of the size of the rounding unit relative to A. Establishing this backward stability is usually the most difficult part of an analysis of the use of a decomposition to solve a problem.
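A quick empirical look at Equation 7 (a sketch with made-up data, not from the article): the computed factor reproduces A to within a small multiple of the rounding unit.

```python
import numpy as np

rng = np.random.default_rng(8)
M = rng.standard_normal((50, 50))
A = M @ M.T + 50 * np.eye(50)      # hypothetical positive definite matrix

R = np.linalg.cholesky(A).T        # computed factor, so A + E = R^T R
E = R.T @ R - A                    # the backward error of Equation 7
u = np.finfo(float).eps            # the rounding unit
print(np.linalg.norm(E) / np.linalg.norm(A))   # a small multiple of u, roughly 1e-16
```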


For example, once Equation 7 has been established, the rounding errors involved in the solutions of the triangular systems in Equation 3 can be incorporated in E with relative ease. Thus, another advantage of the decompositional approach is that it concentrates the most difficult aspects of rounding-error analysis in one place.

In general, if you change the elements of a positive definite matrix, you must recompute its Cholesky decomposition from scratch. However, if the change is structured, it may be possible to compute the new decomposition directly from the old, a process known as updating. For example, you can compute the Cholesky decomposition of A + xx^T from that of A in O(n^2) operations, an enormous savings over the ab initio computation of the decomposition.

Finally, the decompositional approach has greatly affected the development of software for matrix computation. Instead of wasting energy developing code for a variety of specific applications, the producers of a matrix package can concentrate on the decompositions themselves, perhaps with a few auxiliary routines to handle the most important applications. This approach has informed the major public-domain packages: the Handbook series [3], Eispack [4], Linpack [5], and Lapack [6]. A consequence of this emphasis on decompositions is that software developers have found that most algorithms have broad computational features in common, features that can be relegated to basic linear-algebra subprograms (such as the Blas), which can then be optimized for specific machines [7-9].

For easy reference, the sidebar "Benefits of the decompositional approach" summarizes the advantages of decomposition.

Benefits of the decompositional approach

• A matrix decomposition solves not one but many problems.
• A matrix decomposition, which is generally expensive to compute, can be reused to solve new problems involving the original matrix.
• The decompositional approach often shows that apparently different algorithms are actually computing the same object.
• The decompositional approach facilitates rounding-error analysis.
• Many matrix decompositions can be updated, sometimes with great savings in computation.
• By focusing on a few decompositions instead of a host of specific problems, software developers have been able to produce highly effective matrix packages.

History

All the widely used decompositions had made their appearance by 1909, when Schur introduced the decomposition that now bears his name. However, with the exception of the Schur decomposition, they were not cast in the language of matrices (in spite of the fact that matrices had been introduced in 1858 [10]). I provide some historical background for the individual decompositions later, but it is instructive here to consider how the originators proceeded in the absence of matrices.

Gauss, who worked with positive definite systems defined by the normal equations for least squares, described his elimination procedure as the reduction of a quadratic form ϕ(x) = x^T Ax (I am simplifying a little here). In terms of the Cholesky factorization A = R^T R, Gauss wrote ϕ(x) in the form

    ϕ(x) = (r_1^T x)^2 + (r_2^T x)^2 + … + (r_n^T x)^2
         ≡ ρ_1^2(x) + ρ_2^2(x) + … + ρ_n^2(x)    (8)

where r_i^T is the ith row of R. Thus Gauss reduced ϕ(x) to a sum of squares of linear functions ρ_i. Because R is upper triangular, the function ρ_i(x) depends only on the components x_i, …, x_n of x. Since the coefficients in the linear forms ρ_i are the elements of R, Gauss, by showing how to compute the ρ_i, effectively computed the Cholesky decomposition of A.

Other decompositions were introduced in other ways. For example, Jacobi introduced the LU decomposition as a decomposition of a bilinear form into a sum of products of linear functions having an appropriate triangularity with respect to the variables. The singular value decomposition made its appearance as an orthogonal change of variables that diagonalized a bilinear form. Eventually, all these decompositions found expressions as factorizations of matrices [11].

The process by which decomposition became so important to matrix computations was slow and incremental. Gauss certainly had the spirit. He used his decomposition to perform many tasks, such as computing variances, and even used it to update least-squares solutions. But Gauss never regarded his decomposition as a matrix factorization, and it would be anachronistic to consider him the father of the decompositional approach.
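As a quick numerical check of Gauss's reduction in Equation 8 (a sketch with made-up data, not part of the original text), the quadratic form x^T A x equals the sum of squares of the linear forms ρ_i(x) = r_i^T x built from the rows of the Cholesky factor:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4))
A = M @ M.T + 4 * np.eye(4)       # hypothetical positive definite matrix
R = np.linalg.cholesky(A).T       # A = R^T R, R upper triangular

x = rng.standard_normal(4)
phi = x @ A @ x                   # the quadratic form phi(x) = x^T A x
rho = R @ x                       # rho_i(x) = r_i^T x, one entry per row of R
print(np.isclose(phi, np.sum(rho**2)))   # True: phi(x) = sum_i rho_i(x)^2
```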


In the 1940s, awareness grew that the usual algorithms for solving linear systems involved matrix factorization [12,13]. John von Neumann and H.H. Goldstine, in their ground-breaking error analysis of the solution of linear systems, pointed out the division of labor between computing a factorization and solving the system [14]:

    We may therefore interpret the elimination method as one which bases the inverting of an arbitrary matrix A on the combination of two tricks: First it decomposes A into the product of two semi-diagonal matrices C, B′ …, and consequently the inverse of A obtains immediately from those of C and B′. Second it forms their inverses by a simple, explicit, inductive process.

In the 1950s and early 1960s, Householder systematically explored the relation between various algorithms in matrix terms. His book The Theory of Matrices in Numerical Analysis is the mathematical epitome of the decompositional approach [15].

In 1954, Givens showed how to reduce a symmetric matrix A to tridiagonal form by orthogonal transformation [16]. The reduction was merely a way station to the computation of the eigenvalues of A, and at the time no one thought of it as a decomposition. However, it and other intermediate forms have proven useful in their own right and have become a staple of the decompositional approach.

In 1961, James Wilkinson gave the first backward rounding-error analysis of the solutions of linear systems [17]. Here, the division of labor is complete. He gives one analysis of the computation of the LU decomposition and another of the solution of triangular systems and then combines the two. Wilkinson continued analyzing various algorithms for computing decompositions, introducing uniform techniques for dealing with the transformations used in the computations. By the time his book The Algebraic Eigenvalue Problem [18] appeared in 1965, the decompositional approach was firmly established.

The big six

There are many matrix decompositions, old and new, and the list of the latter seems to grow daily. Nonetheless, six decompositions hold the center. The reason is that they are useful and stable: they have important applications and the algorithms that compute them have a satisfactory backward rounding-error analysis (see Equation 7). In this brief tour, I provide references only for details that cannot be found in the many excellent texts and monographs on numerical linear algebra [18-26], the Handbook series [3], or the LINPACK Users' Guide [5].

The Cholesky decomposition

Description. Given a positive definite matrix A, there is a unique upper triangular matrix R with positive diagonal elements such that A = R^T R. In this form, the decomposition is known as the Cholesky decomposition. It is often written in the form

    A = LDL^T

where D is diagonal and L is unit lower triangular (that is, L is lower triangular with ones on the diagonal).

Applications. The Cholesky decomposition is used primarily to solve positive definite linear systems, as in Equations 3 and 6. It can also be employed to compute quantities useful in statistics, as in Equation 4.

Algorithms. A Cholesky decomposition can be computed using any of the variants of Gaussian elimination (see Figure 3), modified, of course, to take advantage of symmetry. All these algorithms take approximately n^3/6 floating-point additions and multiplications. The algorithm Cholesky proposed corresponds to the diagram in the lower right of Figure 3.
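A minimal sketch of one such arrangement (the outer-product, "right-looking" variant; illustrative code, not from the article) shows where the operation count comes from. The full rank-one update below costs about n^3/3 multiply-adds; exploiting symmetry and updating only the upper half of the trailing block halves that to the n^3/6 quoted above.

```python
import numpy as np

def cholesky_upper(A):
    """Return upper triangular R with A = R^T R, assuming A is positive definite.

    Outer-product ("right-looking") variant: at step k the pivot row is scaled,
    then a rank-one update is applied to the trailing submatrix.
    """
    A = np.array(A, dtype=float)          # work on a copy
    n = A.shape[0]
    R = np.zeros_like(A)
    for k in range(n):
        R[k, k] = np.sqrt(A[k, k])
        R[k, k+1:] = A[k, k+1:] / R[k, k]
        # rank-one update of the trailing block (only its upper triangle is needed)
        A[k+1:, k+1:] -= np.outer(R[k, k+1:], R[k, k+1:])
    return R

rng = np.random.default_rng(3)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5 * np.eye(5)               # hypothetical positive definite matrix
R = cholesky_upper(A)
print(np.allclose(R.T @ R, A))            # True
```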
Updating. Given a Cholesky decomposition A = R^T R, you can calculate the Cholesky decomposition of A + xx^T from R and x in O(n^2) floating-point additions and multiplications. The Cholesky decomposition of A - xx^T can be calculated with the same number of operations. The latter process, which is called downdating, is numerically less stable than updating.

The pivoted Cholesky decomposition. If P is a permutation matrix and A is positive definite, then P^T AP is said to be a diagonal permutation of A (among other things, it permutes the diagonals of A). Any diagonal permutation of A is positive definite and has a Cholesky factor. Such a factorization is called a pivoted Cholesky factorization. There are many ways to pivot a Cholesky decomposition, but the most common one produces a factor satisfying

    r_kk^2 ≥ Σ_{i=k}^{j} r_ij^2,    j = k, …, n.    (9)


In particular, if A is positive semidefinite, this strategy will assure that R has the form

    R = [ R11  R12 ]
        [  0    0  ]

where R11 is nonsingular and has the same order as the rank of A. Hence, the pivoted Cholesky decomposition is widely used for rank determination.

History. The Cholesky decomposition (more precisely, an LDL^T decomposition) was the decomposition of Gauss's elimination algorithm, which he sketched in 1809 [27] and presented in full in 1810 [28]. Benoît published Cholesky's variant posthumously in 1924 [29].

The pivoted LU decomposition

Description. Given a matrix A of order n, there are permutations P and Q such that

    P^T AQ = LU

where L is unit lower triangular and U is upper triangular. The matrices P and Q are not unique, and the process of selecting them is known as pivoting.

Applications. Like the Cholesky decomposition, the LU decomposition is used primarily for solving linear systems. However, since A is a general matrix, this application covers a wide range. For example, the LU decomposition is used to compute the steady-state vector of Markov chains and, with the inverse power method, to compute eigenvectors.

Algorithms. The basic algorithm for computing LU decompositions is a generalization of Gaussian elimination to nonsymmetric matrices. When and how this generalization arose is obscure (see Dwyer [1] for comments and references). Except for special matrices (such as positive definite and diagonally dominant matrices), the method requires some form of pivoting for stability. The most common form is partial pivoting, in which pivot elements are chosen from the column to be eliminated. This algorithm requires about n^3/3 additions and multiplications.

Certain contrived examples show that Gaussian elimination with partial pivoting can be unstable. Nonetheless, it works well for the overwhelming majority of real-life problems [30,31]. Why is an open question.

History. In establishing the existence of the LU decomposition, Jacobi showed [32] that under certain conditions a bilinear form ϕ(x, y) can be written in the form

    ϕ(x, y) = ρ_1(x)σ_1(y) + ρ_2(x)σ_2(y) + … + ρ_n(x)σ_n(y)

where ρ_i and σ_i are linear functions that depend only on the last (n - i + 1) components of their arguments. The coefficients of the functions are the elements of L and U.

The QR decomposition

Description. Let A be an m × n matrix with m ≥ n. There is an orthogonal matrix Q such that

    Q^T A = [ R ]
            [ 0 ]

where R is upper triangular with nonnegative diagonal elements (or positive diagonal elements if A is of rank n). If we partition Q in the form

    Q = (Q_A  Q_⊥)

where Q_A has n columns, then we can write

    A = Q_A R.    (10)

This is sometimes called the QR factorization of A.

Applications. When A is of rank n, the columns of Q_A form an orthonormal basis for the column space R(A) of A, and the columns of Q_⊥ form an orthonormal basis of the orthogonal complement of R(A). In particular, Q_A Q_A^T is the orthogonal projection onto R(A). For this reason, the QR decomposition is widely used in applications with a geometric flavor, especially least squares.

Algorithms. There are two distinct classes of algorithms for computing the QR decomposition: Gram–Schmidt algorithms and orthogonal triangularization.

Gram–Schmidt algorithms proceed stepwise by orthogonalizing the kth column of A against the first (k - 1) columns of Q to get the kth column of Q. There are two forms of the Gram–Schmidt algorithm, the classical and the modified, and they both compute only the factorization in Equation 10. The classical Gram–Schmidt algorithm is unstable. The modified form can produce a matrix Q_A whose columns deviate from orthogonality. But the deviation is bounded, and the computed factorization can be used in certain applications, notably computing least-squares solutions. If the orthogonalization step is repeated at each stage (a process known as reorthogonalization), both algorithms will produce a fully orthogonal factorization. When m » n, the algorithms without reorthogonalization require about mn^2 additions and multiplications.
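A minimal sketch of the modified Gram–Schmidt variant (illustrative code, not from the article):

```python
import numpy as np

def modified_gram_schmidt(A):
    """Compute the QR factorization A = Q R (Equation 10) of an m x n matrix,
    m >= n with full column rank, by modified Gram-Schmidt.

    Each new column is orthogonalized against the already-computed columns of Q
    one at a time, working on the partially reduced vector rather than the
    original column; that is what distinguishes the modified from the classical
    variant and gives it better numerical behavior.
    """
    A = np.array(A, dtype=float)
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for k in range(n):
        v = A[:, k].copy()
        for i in range(k):
            R[i, k] = Q[:, i] @ v      # project against the current v,
            v -= R[i, k] * Q[:, i]     # not the original column
        R[k, k] = np.linalg.norm(v)
        Q[:, k] = v / R[k, k]
    return Q, R

rng = np.random.default_rng(4)
A = rng.standard_normal((8, 4))
Q, R = modified_gram_schmidt(A)
print(np.allclose(Q @ R, A), np.allclose(Q.T @ Q, np.eye(4)))
```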


The method of orthogonal triangularization proceeds by premultiplying A by certain simple orthogonal matrices until the elements below the diagonal are zero. The product of the orthogonal matrices is Q, and R is the upper triangular part of the reduced A. Again, there are two versions. The first reduces the matrix by Householder transformations. The method has the advantage that it represents the entire matrix Q in the same amount of memory that is required to hold A, a great savings when m » n. The second method reduces A by plane rotations. It is less efficient than the first method, but is better suited for matrices with structured patterns of nonzero elements.

Relation to the Cholesky decomposition. From Equation 10, it follows that

    A^T A = R^T R.    (11)

In other words, the triangular factor of the QR decomposition of A is the triangular factor of the Cholesky decomposition of the cross-product matrix A^T A. Consequently, many problems, particularly least-squares problems, can be solved using either a QR decomposition of the least-squares matrix or the Cholesky decomposition from the normal equations. The QR decomposition usually gives more accurate results, whereas the Cholesky decomposition is often faster.

Updating. Given a QR factorization of A, there are stable, efficient algorithms for recomputing the QR factorization after rows and columns have been added to or removed from A. In addition, the QR decomposition of the rank-one modification A + xy^T can be stably updated.

The pivoted QR decomposition. If P is a permutation matrix, then AP is a permutation of the columns of A, and (AP)^T (AP) is a diagonal permutation of A^T A. In view of the relation of the QR and the Cholesky decompositions, it is not surprising that there is a pivoted QR factorization whose triangular factor R satisfies Equation 9. In particular, if A has rank k, then its pivoted QR factorization has the form

    AP = (Q_1  Q_2) [ R11  R12 ]
                    [  0    0  ]

It follows that either Q_1 or the first k columns of AP form a basis for the column space of A. Thus, the pivoted QR decomposition can be used to extract a set of linearly independent columns from A.

History. The QR factorization first appeared in a work by Erhard Schmidt on integral equations [33]. Specifically, Schmidt showed how to orthogonalize a sequence of functions by what is now known as the Gram–Schmidt algorithm. (Curiously, Laplace produced the basic formulas [34] but had no notion of orthogonality.) The name QR comes from the QR algorithm, named by Francis (see the history notes for the Schur decomposition, discussed later). Householder introduced Householder transformations to matrix computations and showed how they could be used to triangularize a general matrix [35]. Plane rotations were introduced by Givens [16], who used them to reduce a symmetric matrix to tridiagonal form. Bogert and Burris appear to be the first to use them in orthogonal triangularization [36]. The first updating algorithm (adding a row) is due to Golub [37], who also introduced the idea of pivoting.
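Before moving on, the following sketch (hypothetical data, not from the article) solves a least-squares problem both through the QR factorization and through the normal equations via Cholesky, the two routes tied together by Equation 11:

```python
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(5)
A = rng.standard_normal((20, 4))         # overdetermined least-squares matrix
b = rng.standard_normal(20)

# Route 1: QR factorization A = Q_A R (generally the more accurate route).
QA, R = np.linalg.qr(A)                   # "reduced" QR: QA is 20 x 4
x_qr = solve_triangular(R, QA.T @ b, lower=False)

# Route 2: normal equations A^T A x = A^T b via Cholesky.
# By Equation 11 the Cholesky factor of A^T A is the same R (up to signs).
L = np.linalg.cholesky(A.T @ A)           # A^T A = L L^T
y = solve_triangular(L, A.T @ b, lower=True)
x_ne = solve_triangular(L.T, y, lower=False)

print(np.allclose(x_qr, x_ne))            # True for this well-conditioned example
```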
The spectral decomposition

Description. Let A be a symmetric matrix of order n. There is an orthogonal matrix V such that

    A = VΛV^T,    Λ = diag(λ_1, …, λ_n).    (12)

If v_i denotes the ith column of V, then Av_i = λ_i v_i. Thus (λ_i, v_i) is an eigenpair of A, and the spectral decomposition shown in Equation 12 exhibits the eigenvalues of A along with a complete orthonormal system of eigenvectors.

Applications. The spectral decomposition finds applications wherever the eigensystem of a symmetric matrix is needed, which is to say in virtually all technical disciplines.
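As a usage sketch (not from the article; it assumes NumPy), numpy.linalg.eigh computes exactly the factors in Equation 12 for a symmetric matrix:

```python
import numpy as np

rng = np.random.default_rng(6)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                      # a hypothetical symmetric matrix

lam, V = np.linalg.eigh(A)             # eigenvalues (ascending) and eigenvectors
# Equation 12: A = V diag(lam) V^T, with V orthogonal.
print(np.allclose(V @ np.diag(lam) @ V.T, A))
print(np.allclose(V.T @ V, np.eye(5)))
# Each column of V is an eigenvector: A v_i = lam_i v_i.
print(np.allclose(A @ V[:, 0], lam[0] * V[:, 0]))
```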


Algorithms. There are three classes of algorithms to compute the spectral decomposition: the QR algorithm, the divide-and-conquer algorithm, and Jacobi's algorithm. The first two require a preliminary reduction to tridiagonal form by orthogonal similarities. I discuss the QR algorithm in the next section on the Schur decomposition. The divide-and-conquer algorithm [38,39] is comparatively recent and is usually faster than the QR algorithm when both eigenvalues and eigenvectors are desired; however, it is not suitable for problems in which the eigenvalues vary widely in magnitude. The Jacobi algorithm is much slower than the other two, but for positive definite matrices it may be more accurate [40]. All these algorithms require O(n^3) operations.

Updating. The spectral decomposition can be updated. Unlike the Cholesky and QR decompositions, the algorithm does not result in a reduction in the order of the work: it still remains O(n^3), although the order constant is lower.

History. The spectral decomposition dates back to an 1829 paper by Cauchy [41], who introduced the eigenvectors as solutions of equations of the form Ax = λx and proved the orthogonality of eigenvectors belonging to distinct eigenvalues. In 1846, Jacobi [42] gave his famous algorithm for spectral decomposition, which iteratively reduces the matrix in question to diagonal form by a special type of plane rotations, now called Jacobi rotations. The reduction to tridiagonal form by plane rotations is due to Givens [16] and by Householder transformations to Householder [43].

The Schur decomposition

Description. Let A be a matrix of order n. There is a unitary matrix U such that

    A = UTU^H

where T is upper triangular and the superscript H denotes the conjugate transpose. The diagonal elements of T are the eigenvalues of A, which, by appropriate choice of U, can be made to appear in any order. This decomposition is called a Schur decomposition of A.

A real matrix can have complex eigenvalues and hence a complex Schur form. By allowing T to have real 2 × 2 blocks on its diagonal that contain its complex eigenvalues, the entire decomposition can be made real. This is sometimes called a real Schur form.

Applications. An important use of the Schur form is as an intermediate form from which the eigenvalues and eigenvectors of a matrix can be computed. On the other hand, the Schur decomposition can often be used in place of a complete system of eigenpairs, which, in fact, may not exist. A good example is the solution of Sylvester's equation and its relatives [44,45].

Algorithms. After a preliminary reduction to Hessenberg form, which is usually done with Householder transformations, the Schur form is computed using the QR algorithm [46]. Elsewhere in this issue, Beresford Parlett discusses the modern form of the algorithm. It is one of the most flexible algorithms in the repertoire, having variants for the spectral decomposition, the singular value decomposition, and the generalized eigenvalue problem.

History. Schur introduced his decomposition in 1909 [47]. It was the only one of the big six to have been derived in terms of matrices. It was largely ignored until Francis's QR algorithm pushed it into the limelight.

The singular value decomposition

Description. Let A be an m × n matrix with m ≥ n. There are orthogonal matrices U and V such that

    U^T AV = [ Σ ]
             [ 0 ]

where Σ = diag(σ_1, …, σ_n) and σ_1 ≥ σ_2 ≥ … ≥ σ_n ≥ 0. This decomposition is called the singular value decomposition of A. If U_A consists of the first n columns of U, we can write

    A = U_A Σ V^T    (13)

which is sometimes called the singular value factorization of A. The diagonal elements of Σ are called the singular values of A. The corresponding columns of U and V are called left and right singular vectors of A.

Applications. Most of the applications of the QR decomposition can also be handled by the singular value decomposition. In addition, the singular value decomposition gives a basis for the row space of A and is more reliable in determining rank. It can also be used to compute optimal low-rank approximations and to compute angles between subspaces.

Relation to the spectral decomposition. The singular value factorization is related to the spectral decomposition in much the same way as the QR factorization is related to the Cholesky decomposition. Specifically, from Equation 13 it follows that

    A^T A = VΣ^2 V^T.

Thus the eigenvalues of the cross-product matrix A^T A are the squares of the singular values of A, and the eigenvectors of A^T A are right singular vectors of A.
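The following sketch (made-up data, not from the article) illustrates two of these uses with NumPy's SVD: determining numerical rank and forming an optimal low-rank approximation by truncating the small singular values.

```python
import numpy as np

rng = np.random.default_rng(7)
# A hypothetical 8 x 5 matrix of rank 3, plus a little noise.
A = rng.standard_normal((8, 3)) @ rng.standard_normal((3, 5))
A += 1e-10 * rng.standard_normal((8, 5))

U, s, Vt = np.linalg.svd(A, full_matrices=False)   # A = U diag(s) V^T
rank = int(np.sum(s > 1e-8 * s[0]))                # numerical rank from the singular values

# Best rank-k approximation: keep the k largest singular values.
k = rank
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
err = np.linalg.norm(A - A_k, 2)                   # spectral-norm error = first discarded sigma
print(rank, err)                                   # rank is 3; the error is at the noise level
```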


Algorithms. As with the spectral decomposition, there are three classes of algorithms for computing the singular value decomposition: the QR algorithm, a divide-and-conquer algorithm, and a Jacobi-like algorithm. The first two require a reduction of A to bidiagonal form. The divide-and-conquer algorithm [49] is often faster than the QR algorithm, and the Jacobi algorithm is the slowest.

History. The singular value decomposition was introduced independently by Beltrami in 1873 [50] and Jordan in 1874 [51]. The reduction to bidiagonal form is due to Golub and Kahan [52], as is the variant of the QR algorithm. The first Jacobi-like algorithm for computing the singular value decomposition was given by Kogbetliantz [53].

The big six are not the only decompositions in use; in fact, there are many more. As mentioned earlier, certain intermediate forms, such as tridiagonal and Hessenberg forms, have come to be regarded as decompositions in their own right. Since the singular value decomposition is expensive to compute and not readily updated, rank-revealing alternatives have received considerable attention [54,55]. There are also generalizations of the singular value decomposition and the Schur decomposition for pairs of matrices [56,57]. All crystal balls become cloudy when they look to the future, but it seems safe to say that as long as new matrix problems arise, new decompositions will be devised to solve them.

Acknowledgment

This work was supported by the National Science Foundation under Grant No. 970909-8562.

References

1. P.S. Dwyer, Linear Computations, John Wiley & Sons, New York, 1951.
2. A.S. Householder, Principles of Numerical Analysis, McGraw-Hill, New York, 1953.
3. J.H. Wilkinson and C. Reinsch, Handbook for Automatic Computation, Vol. II, Linear Algebra, Springer-Verlag, New York, 1971.
4. B.S. Garbow et al., "Matrix Eigensystem Routines - Eispack Guide Extension," Lecture Notes in Computer Science, Springer-Verlag, New York, 1977.
5. J.J. Dongarra et al., LINPACK User's Guide, SIAM, Philadelphia, 1979.
6. E. Anderson et al., LAPACK Users' Guide, second ed., SIAM, Philadelphia, 1995.
7. J.J. Dongarra et al., "An Extended Set of Fortran Basic Linear Algebra Subprograms," ACM Trans. Mathematical Software, Vol. 14, 1988, pp. 1-17.
8. J.J. Dongarra et al., "A Set of Level 3 Basic Linear Algebra Subprograms," ACM Trans. Mathematical Software, Vol. 16, 1990, pp. 1-17.
9. C.L. Lawson et al., "Basic Linear Algebra Subprograms for Fortran Usage," ACM Trans. Mathematical Software, Vol. 5, 1979, pp. 308-323.
10. A. Cayley, "A Memoir on the Theory of Matrices," Philosophical Trans. Royal Soc. of London, Vol. 148, 1858, pp. 17-37.
11. C.C. MacDuffee, The Theory of Matrices, Chelsea, New York, 1946.
12. P.S. Dwyer, "A Matrix Presentation of Least Squares and Correlation Theory with Matrix Justification of Improved Methods of Solution," Annals Math. Statistics, Vol. 15, 1944, pp. 82-89.
13. H. Jensen, An Attempt at a Systematic Classification of Some Methods for the Solution of Normal Equations, Report No. 18, Geodætisk Institut, Copenhagen, 1944.
14. J. von Neumann and H.H. Goldstine, "Numerical Inverting of Matrices of High Order," Bull. Am. Math. Soc., Vol. 53, 1947, pp. 1021-1099.
15. A.S. Householder, The Theory of Matrices in Numerical Analysis, Dover, New York, 1964.
16. W. Givens, Numerical Computation of the Characteristic Values of a Real Matrix, Tech. Report 1574, Oak Ridge Nat'l Laboratory, Oak Ridge, Tenn., 1954.
17. J.H. Wilkinson, "Error Analysis of Direct Methods of Matrix Inversion," J. ACM, Vol. 8, 1961, pp. 281-330.
18. J.H. Wilkinson, The Algebraic Eigenvalue Problem, Clarendon Press, Oxford, U.K., 1965.
19. Å. Björck, Numerical Methods for Least Squares Problems, SIAM, Philadelphia, 1996.
20. B.N. Datta, Numerical Linear Algebra and Applications, Brooks/Cole, Pacific Grove, Calif., 1995.
21. J.W. Demmel, Applied Numerical Linear Algebra, SIAM, Philadelphia, 1997.
22. N.J. Higham, Accuracy and Stability of Numerical Algorithms, SIAM, Philadelphia, 1996.
23. B.N. Parlett, The Symmetric Eigenvalue Problem, Prentice-Hall, Englewood Cliffs, N.J., 1980; reissued with revisions by SIAM, Philadelphia, 1998.
24. G.W. Stewart, Introduction to Matrix Computations, Academic Press, New York, 1973.
25. G.W. Stewart, Matrix Algorithms I: Basic Decompositions, SIAM, Philadelphia, 1998.
26. D.S. Watkins, Fundamentals of Matrix Computations, John Wiley & Sons, New York, 1991.
27. C.F. Gauss, Theoria Motus Corporum Coelestium in Sectionibus Conicis Solem Ambientium [Theory of the Motion of the Heavenly Bodies Moving about the Sun in Conic Sections], Perthes and Besser, Hamburg, Germany, 1809.
28. C.F. Gauss, "Disquisitio de elementis ellipticis Palladis [Disquisition on the Elliptical Elements of Pallas]," Commentatines societatis regiae scientarium Gottingensis recentiores, Vol. 1, 1810.
29. C. Benoît, "Note sur une méthode de résolution des équations normales provenant de l'application de la méthode des moindres carrés à un système d'équations linéaires en nombre inférieur à celui des inconnues - application de la méthode à la résolution d'un système défini d'équations linéaires (Procédé du Commandant Cholesky) [Note on a Method for Solving the Normal Equations Arising from the Application of the Method of Least Squares to a System of Linear Equations whose Number Is Less than the Number of Unknowns - Application of the Method to the Solution of a Definite System of Linear Equations]," Bulletin Géodésique (Toulouse), Vol. 2, 1924, pp. 5-77.


30. L.N. Trefethen and R.S. Schreiber, "Average-Case Stability of Gaussian Elimination," SIAM J. Matrix Analysis and Applications, Vol. 11, 1990, pp. 335-360.
31. L.N. Trefethen and D. Bau III, Numerical Linear Algebra, SIAM, Philadelphia, 1997, pp. 166-170.
32. C.G.J. Jacobi, "Über einen algebraischen Fundamentalsatz und seine Anwendungen [On a Fundamental Theorem in Algebra and Its Applications]," Journal für die reine und angewandte Mathematik, Vol. 53, 1857, pp. 275-280.
33. E. Schmidt, "Zur Theorie der linearen und nichtlinearen Integralgleichungen. I Teil, Entwicklung willkürlicher Funktionen nach Systemen vorgeschriebener [On the Theory of Linear and Nonlinear Integral Equations. Part I, Expansion of Arbitrary Functions by a Prescribed System]," Mathematische Annalen, Vol. 63, 1907, pp. 433-476.
34. P.S. Laplace, Théorie Analytique des Probabilités [Analytical Theory of Probabilities], 3rd ed., Courcier, Paris, 1820.
35. A.S. Householder, "Unitary Triangularization of a Nonsymmetric Matrix," J. ACM, Vol. 5, 1958, pp. 339-342.
36. D. Bogert and W.R. Burris, Comparison of Least Squares Algorithms, Tech. Report ORNL-3499, V.1, S5.5, Neutron Physics Div., Oak Ridge Nat'l Laboratory, Oak Ridge, Tenn., 1963.
37. G.H. Golub, "Numerical Methods for Solving Least Squares Problems," Numerische Mathematik, Vol. 7, 1965, pp. 206-216.
38. J.J.M. Cuppen, "A Divide and Conquer Method for the Symmetric Eigenproblem," Numerische Mathematik, Vol. 36, 1981, pp. 177-195.
39. M. Gu and S.C. Eisenstat, "A Stable and Efficient Algorithm for the Rank-One Modification of the Symmetric Eigenproblem," SIAM J. Matrix Analysis and Applications, Vol. 15, 1994, pp. 1266-1276.
40. J. Demmel and K. Veselić, "Jacobi's Method Is More Accurate than QR," SIAM J. Matrix Analysis and Applications, Vol. 13, 1992, pp. 1204-1245.
41. A.L. Cauchy, "Sur l'équation à l'aide de laquelle on détermine les inégalités séculaires des mouvements des planètes [On the Equation by which the Inequalities for the Secular Movement of Planets Is Determined]," Oeuvres Complètes (IIe Série), Vol. 9, 1829.
42. C.G.J. Jacobi, "Über ein leichtes Verfahren die in der Theorie der Säculärstörungen vorkommenden Gleichungen numerisch aufzulösen [On an Easy Method for the Numerical Solution of the Equations that Appear in the Theory of Secular Perturbations]," Journal für die reine und angewandte Mathematik, Vol. 30, 1846, pp. 51-94.
43. J.H. Wilkinson, "Householder's Method for the Solution of the Algebraic Eigenvalue Problem," Computer J., Vol. 3, 1960, pp. 23-27.
44. R.H. Bartels and G.W. Stewart, "Algorithm 432: The Solution of the Matrix Equation AX + XB = C," Comm. ACM, Vol. 15, 1972, pp. 820-826.
45. G.H. Golub, S. Nash, and C. Van Loan, "Hessenberg-Schur Method for the Problem AX + XB = C," IEEE Trans. Automatic Control, Vol. AC-24, 1979, pp. 909-913.
46. J.G.F. Francis, "The QR Transformation, Parts I and II," Computer J., Vol. 4, 1961, pp. 265-271, and 1962, pp. 332-345.
47. J. Schur, "Über die charakteristischen Wurzeln einer linearen Substitution mit einer Anwendung auf die Theorie der Integralgleichungen [On the Characteristic Roots of a Linear Transformation with an Application to the Theory of Integral Equations]," Mathematische Annalen, Vol. 66, 1909, pp. 448-510.
49. M. Gu and S.C. Eisenstat, "A Divide-and-Conquer Algorithm for the Bidiagonal SVD," SIAM J. Matrix Analysis and Applications, Vol. 16, 1995, pp. 79-92.
50. E. Beltrami, "Sulle funzioni bilineari [On Bilinear Functions]," Giornale di Matematiche ad Uso degli Studenti delle Università, Vol. 11, 1873, pp. 98-106. An English translation by D. Boley is available in Tech. Report 90-37, Dept. of Computer Science, Univ. of Minnesota, Minneapolis, 1990.
51. C. Jordan, "Mémoire sur les formes bilinéaires [Memoir on Bilinear Forms]," Journal de Mathématiques Pures et Appliquées, Deuxième Série, Vol. 19, 1874, pp. 35-54.
52. G.H. Golub and W. Kahan, "Calculating the Singular Values and Pseudo-Inverse of a Matrix," SIAM J. Numerical Analysis, Vol. 2, 1965, pp. 205-224.
53. E.G. Kogbetliantz, "Solution of Linear Systems by Diagonalization of Coefficients Matrix," Quarterly of Applied Math., Vol. 13, 1955, pp. 123-132.
54. T.F. Chan, "Rank Revealing QR Factorizations," Linear Algebra and Its Applications, Vol. 88/89, 1987, pp. 67-82.
55. G.W. Stewart, "An Updating Algorithm for Subspace Tracking," IEEE Trans. Signal Processing, Vol. 40, 1992, pp. 1535-1541.
56. Z. Bai and J.W. Demmel, "Computing the Generalized Singular Value Decomposition," SIAM J. Scientific and Statistical Computing, Vol. 14, 1993, pp. 1464-1468.
57. C.B. Moler and G.W. Stewart, "An Algorithm for Generalized Matrix Eigenvalue Problems," SIAM J. Numerical Analysis, Vol. 10, 1973, pp. 241-256.

G.W. Stewart is a professor in the Department of Computer Science and in the Institute for Advanced Computer Studies at the University of Maryland. His research interests focus on numerical linear algebra, perturbation theory, and rounding-error analysis. He earned his PhD from the University of Tennessee, where his advisor was A.S. Householder. Contact Stewart at the Dept. of Computer Science, Univ. of Maryland, College Park, MD 20742; stewart@cs.umd.edu.
