
Key Technologies of Web Search and Mining (Web Search and Mining) course materials (PPT lecture slides): Lecture 13, Matrix Factorization and Latent Semantic Indexing

Document information
Resource category: document library
Format: PPT
Pages: 47
File size: 653 KB


Web Search and Mining
Lecture 13: Matrix Factorization and Latent Semantic Indexing


Topic
▪ Matrix factorization = matrix decomposition
▪ Latent Semantic Indexing (LSI)
▪ Term-document matrices are very large
▪ But the number of topics that people talk about is small (in some sense)
  ▪ Clothes, movies, politics, …
▪ Can we represent the term-document space by a lower-dimensional latent space?


Linear Algebra Background


Eigenvalues & Eigenvectors
▪ Eigenvectors (for a square m×m matrix S): Sv = λv, where v ∈ R^m, v ≠ 0 is a (right) eigenvector and λ is its eigenvalue.
▪ How many eigenvalues are there at most?
  Sv = λv ⟺ (S − λI)v = 0, which only has a non-zero solution if |S − λI| = 0.
  This is an mth-order equation in λ which can have at most m distinct solutions (roots of the characteristic polynomial); these can be complex even though S is real.
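The last point, that a real matrix can have complex eigenvalues, is easy to check numerically. A minimal sketch, assuming NumPy, using a 90° rotation matrix (my choice of example, not from the slides), whose characteristic polynomial is λ² + 1:

```python
import numpy as np

# A real (non-symmetric) matrix with complex eigenvalues:
# rotation by 90 degrees, characteristic polynomial lambda^2 + 1 = 0.
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])

eigvals = np.linalg.eigvals(R)
print(eigvals)   # the pair +i and -i: no real lambda solves |R - lambda*I| = 0
```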


Matrix-vector multiplication
▪ S = diag(30, 20, 1) has eigenvalues 30, 20, 1 with corresponding eigenvectors v1 = (1, 0, 0)^T, v2 = (0, 1, 0)^T, v3 = (0, 0, 1)^T.
▪ On each eigenvector, S acts as a multiple of the identity matrix: but as a different multiple on each.
▪ Any vector (say x = (2, 4, 6)^T) can be viewed as a combination of the eigenvectors: x = 2v1 + 4v2 + 6v3.
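The expansion of x in the eigenvector basis can be verified directly; a small sketch assuming NumPy:

```python
import numpy as np

# Diagonal matrix from the slide: eigenvalues 30, 20, 1;
# its eigenvectors are the standard basis vectors e1, e2, e3.
S = np.diag([30.0, 20.0, 1.0])
v1, v2, v3 = np.eye(3)

# x written as a combination of eigenvectors: x = 2*v1 + 4*v2 + 6*v3.
x = np.array([2.0, 4.0, 6.0])
combo = 2*v1 + 4*v2 + 6*v3
print(np.allclose(x, combo))   # True
```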


Matrix-vector multiplication
▪ Thus a matrix-vector multiplication such as Sx (S, x as in the previous slide) can be rewritten in terms of the eigenvalues/vectors:
  Sx = S(2v1 + 4v2 + 6v3)
  Sx = 2Sv1 + 4Sv2 + 6Sv3 = 2λ1v1 + 4λ2v2 + 6λ3v3
  Sx = 60v1 + 80v2 + 6v3
▪ Even though x is an arbitrary vector, the action of S on x is determined by the eigenvalues/vectors.
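The derivation above can be checked by comparing the direct product Sx against the eigen-expansion; a minimal sketch assuming NumPy:

```python
import numpy as np

# S and x as on the previous slide.
S = np.diag([30.0, 20.0, 1.0])
x = np.array([2.0, 4.0, 6.0])
v1, v2, v3 = np.eye(3)

# Direct multiplication versus the expansion Sx = 2*30*v1 + 4*20*v2 + 6*1*v3.
direct = S @ x
expanded = 60*v1 + 80*v2 + 6*v3
print(np.allclose(direct, expanded))   # True: both equal (60, 80, 6)
```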


Matrix-vector multiplication
▪ Suggestion: the effect of “small” eigenvalues is small.
▪ If we ignored the smallest eigenvalue (1), then instead of (60, 80, 6)^T we would get (60, 80, 0)^T.
▪ These vectors are similar (in cosine similarity, etc.).
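The similarity claim can be made concrete by computing the cosine between the full and the truncated result; a small sketch assuming NumPy:

```python
import numpy as np

# Sx with and without the contribution of the smallest eigenvalue.
full = np.array([60.0, 80.0, 6.0])
truncated = np.array([60.0, 80.0, 0.0])

# Cosine similarity: dot product over the product of the norms.
cos = full @ truncated / (np.linalg.norm(full) * np.linalg.norm(truncated))
print(cos)   # ~0.998, i.e. the two vectors point in nearly the same direction
```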


Eigenvalues & Eigenvectors
▪ For symmetric matrices, eigenvectors for distinct eigenvalues are orthogonal: if Sv1 = λ1v1 and Sv2 = λ2v2 with λ1 ≠ λ2, then v1 · v2 = 0.
▪ All eigenvalues of a real symmetric matrix are real.
▪ All eigenvalues of a positive semidefinite matrix are non-negative: if w^T S w ≥ 0 for all w ∈ R^n, then Sv = λv implies λ ≥ 0.
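All three properties can be observed on a Gram matrix A^T A, which is always real, symmetric, and positive semidefinite. A minimal sketch assuming NumPy (the particular A is my own illustrative choice):

```python
import numpy as np

# A^T A is real symmetric and positive semidefinite for any real A.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
S = A.T @ A

# eigh is for symmetric matrices: real eigenvalues, orthonormal eigenvectors.
eigvals, eigvecs = np.linalg.eigh(S)
print(eigvals)                         # real and non-negative (PSD)
print(eigvecs[:, 0] @ eigvecs[:, 1])   # distinct eigenvalues: orthogonal, ~0
```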


Example
▪ Let S = [2 1; 1 2] (real, symmetric).
▪ Then S − λI = [2−λ 1; 1 2−λ], so |S − λI| = (2 − λ)² − 1 = 0.
▪ The eigenvalues are 1 and 3 (nonnegative, real). Plug in these values and solve for the eigenvectors.
▪ The eigenvectors (1, −1)^T and (1, 1)^T are orthogonal (and real).
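The worked example checks out numerically; a minimal sketch assuming NumPy:

```python
import numpy as np

S = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh returns eigenvalues in ascending order with orthonormal eigenvectors.
eigvals, eigvecs = np.linalg.eigh(S)
print(eigvals)                         # [1. 3.], matching (2-lambda)^2 - 1 = 0
print(eigvecs[:, 0] @ eigvecs[:, 1])   # ~0: the eigenvectors are orthogonal
```

Note that `eigh` returns unit-length eigenvectors, so they are (1, −1)^T and (1, 1)^T up to scaling and sign.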


Eigen/diagonal Decomposition
▪ Let S be a square m×m matrix with m linearly independent eigenvectors (a “non-defective” matrix).
▪ Theorem: there exists an eigen decomposition S = UΛU⁻¹ (cf. matrix diagonalization theorem); the decomposition is unique for distinct eigenvalues.
▪ Columns of U are eigenvectors of S.
▪ Diagonal elements of Λ are eigenvalues of S: Λ = diag(λ1, …, λm), with λi ≥ λi+1.
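The decomposition S = UΛU⁻¹ can be formed and checked by reconstruction; a minimal sketch assuming NumPy, reusing the symmetric example from the previous slide:

```python
import numpy as np

S = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Columns of U are eigenvectors; Lambda holds the eigenvalues on its diagonal.
eigvals, U = np.linalg.eig(S)
Lam = np.diag(eigvals)

# Multiplying the factors back together recovers S.
S_rebuilt = U @ Lam @ np.linalg.inv(U)
print(np.allclose(S, S_rebuilt))   # True
```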
