
Key Technologies of Web Search and Mining (Web Search and Mining) course materials (PPT lecture slides): Lecture 13, Matrix Factorization and Latent Semantic Indexing

Document information
Resource category: document library
Format: PPT
Pages: 47
File size: 653 KB


Web Search and Mining
Lecture 13: Matrix Factorization and Latent Semantic Indexing


Topic
▪ Matrix factorization = matrix decomposition
▪ Latent Semantic Indexing (LSI)
▪ Term-document matrices are very large
▪ But the number of topics that people talk about is small (in some sense)
  ▪ Clothes, movies, politics, …
▪ Can we represent the term-document space by a lower-dimensional latent space?


Linear Algebra Background


Eigenvalues & Eigenvectors
▪ Eigenvectors (for a square m×m matrix S): Sv = λv, where v ∈ R^m, v ≠ 0 is a (right) eigenvector and λ is its eigenvalue.
▪ How many eigenvalues are there at most?
  Sv = λv ⟺ (S − λI)v = 0, which only has a non-zero solution if |S − λI| = 0.
  This is an mth-order equation in λ which can have at most m distinct solutions (roots of the characteristic polynomial); these can be complex even though S is real.
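The last point, that a real matrix can have complex eigenvalues, is easy to check numerically. A minimal sketch, assuming NumPy, using a 90° rotation matrix (my choice of example, not from the slides), whose characteristic polynomial is λ² + 1:

```python
import numpy as np

# A real (non-symmetric) matrix with complex eigenvalues:
# rotation by 90 degrees, characteristic polynomial lambda^2 + 1 = 0.
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])

eigvals = np.linalg.eigvals(R)
print(eigvals)   # the pair +i and -i: no real lambda solves |R - lambda*I| = 0
```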


Matrix-vector multiplication
▪ S = diag(30, 20, 1) has eigenvalues 30, 20, 1 with corresponding eigenvectors v1 = (1, 0, 0)^T, v2 = (0, 1, 0)^T, v3 = (0, 0, 1)^T.
▪ On each eigenvector, S acts as a multiple of the identity matrix: but as a different multiple on each.
▪ Any vector (say x = (2, 4, 6)^T) can be viewed as a combination of the eigenvectors: x = 2v1 + 4v2 + 6v3.
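The expansion of x in the eigenvector basis can be verified directly; a small sketch assuming NumPy:

```python
import numpy as np

# Diagonal matrix from the slide: eigenvalues 30, 20, 1;
# its eigenvectors are the standard basis vectors e1, e2, e3.
S = np.diag([30.0, 20.0, 1.0])
v1, v2, v3 = np.eye(3)

# x written as a combination of eigenvectors: x = 2*v1 + 4*v2 + 6*v3.
x = np.array([2.0, 4.0, 6.0])
combo = 2*v1 + 4*v2 + 6*v3
print(np.allclose(x, combo))   # True
```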


Matrix-vector multiplication
▪ Thus a matrix-vector multiplication such as Sx (S, x as in the previous slide) can be rewritten in terms of the eigenvalues/vectors:
  Sx = S(2v1 + 4v2 + 6v3)
  Sx = 2Sv1 + 4Sv2 + 6Sv3 = 2λ1v1 + 4λ2v2 + 6λ3v3
  Sx = 60v1 + 80v2 + 6v3
▪ Even though x is an arbitrary vector, the action of S on x is determined by the eigenvalues/vectors.
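The derivation above can be checked by comparing the direct product Sx against the eigen-expansion; a minimal sketch assuming NumPy:

```python
import numpy as np

# S and x as on the previous slide.
S = np.diag([30.0, 20.0, 1.0])
x = np.array([2.0, 4.0, 6.0])
v1, v2, v3 = np.eye(3)

# Direct multiplication versus the expansion Sx = 2*30*v1 + 4*20*v2 + 6*1*v3.
direct = S @ x
expanded = 60*v1 + 80*v2 + 6*v3
print(np.allclose(direct, expanded))   # True: both equal (60, 80, 6)
```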


Matrix-vector multiplication
▪ Suggestion: the effect of “small” eigenvalues is small.
▪ If we ignored the smallest eigenvalue (1), then instead of (60, 80, 6)^T we would get (60, 80, 0)^T.
▪ These vectors are similar (in cosine similarity, etc.).
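The similarity claim can be made concrete by computing the cosine between the full and the truncated result; a small sketch assuming NumPy:

```python
import numpy as np

# Sx with and without the contribution of the smallest eigenvalue.
full = np.array([60.0, 80.0, 6.0])
truncated = np.array([60.0, 80.0, 0.0])

# Cosine similarity: dot product over the product of the norms.
cos = full @ truncated / (np.linalg.norm(full) * np.linalg.norm(truncated))
print(cos)   # ~0.998, i.e. the two vectors point in nearly the same direction
```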


Eigenvalues & Eigenvectors
▪ For symmetric matrices, eigenvectors for distinct eigenvalues are orthogonal: if Sv1 = λ1v1 and Sv2 = λ2v2 with λ1 ≠ λ2, then v1 · v2 = 0.
▪ All eigenvalues of a real symmetric matrix are real.
▪ All eigenvalues of a positive semidefinite matrix are non-negative: if w^T S w ≥ 0 for all w ∈ R^n, then Sv = λv implies λ ≥ 0.
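All three properties can be observed on a Gram matrix A^T A, which is always real, symmetric, and positive semidefinite. A minimal sketch assuming NumPy (the particular A is my own illustrative choice):

```python
import numpy as np

# A^T A is real symmetric and positive semidefinite for any real A.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
S = A.T @ A

# eigh is for symmetric matrices: real eigenvalues, orthonormal eigenvectors.
eigvals, eigvecs = np.linalg.eigh(S)
print(eigvals)                         # real and non-negative (PSD)
print(eigvecs[:, 0] @ eigvecs[:, 1])   # distinct eigenvalues: orthogonal, ~0
```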


Example
▪ Let S = [2 1; 1 2] (real, symmetric).
▪ Then S − λI = [2−λ 1; 1 2−λ], so |S − λI| = (2 − λ)² − 1 = 0.
▪ The eigenvalues are 1 and 3 (nonnegative, real). Plug in these values and solve for the eigenvectors.
▪ The eigenvectors (1, −1)^T and (1, 1)^T are orthogonal (and real).
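The worked example checks out numerically; a minimal sketch assuming NumPy:

```python
import numpy as np

S = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh returns eigenvalues in ascending order with orthonormal eigenvectors.
eigvals, eigvecs = np.linalg.eigh(S)
print(eigvals)                         # [1. 3.], matching (2-lambda)^2 - 1 = 0
print(eigvecs[:, 0] @ eigvecs[:, 1])   # ~0: the eigenvectors are orthogonal
```

Note that `eigh` returns unit-length eigenvectors, so they are (1, −1)^T and (1, 1)^T up to scaling and sign.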


Eigen/diagonal Decomposition
▪ Let S be a square m×m matrix with m linearly independent eigenvectors (a “non-defective” matrix).
▪ Theorem: there exists an eigen decomposition S = UΛU⁻¹ (cf. matrix diagonalization theorem); the decomposition is unique for distinct eigenvalues.
▪ Columns of U are eigenvectors of S.
▪ Diagonal elements of Λ are eigenvalues of S: Λ = diag(λ1, …, λm), with λi ≥ λi+1.
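The decomposition S = UΛU⁻¹ can be formed and checked by reconstruction; a minimal sketch assuming NumPy, reusing the symmetric example from the previous slide:

```python
import numpy as np

S = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Columns of U are eigenvectors; Lambda holds the eigenvalues on its diagonal.
eigvals, U = np.linalg.eig(S)
Lam = np.diag(eigvals)

# Multiplying the factors back together recovers S.
S_rebuilt = U @ Lam @ np.linalg.inv(U)
print(np.allclose(S, S_rebuilt))   # True
```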
