Peking University: Information Retrieval course teaching resources (PPT lecture slides): Retrieval Models

Outline
• Vector Space Model (VSM)
• Latent Semantic Model (LSI)
• Language Model (LM)
CCF-ADL at Zhengzhou University, June 25-27, 2010

Simple flow of retrieval process
[Figure: flow diagram. Information Need → Representation → Query; Text Objects → Representation → Indexed Objects; Query and Indexed Objects → Comparison → Retrieved Objects; Evaluation/Feedback]

[Figure: screenshot of a Google results page for the query "latent semantic indexing" (results 1-10 of about 129,000), annotated with the labels "Relevance Feedback" and "Query Expansion"]

Vector Space Model

Documents as vectors
• Each document j can be viewed as a vector; each term is one dimension, and its value is the log-scaled tf.idf weight
• So we have a vector space
  – terms are axes
  – docs live in this space
  – the space is high-dimensional: even with stemming, it may have 20,000+ dimensions

Term                        D1     D2     D3     D4     D5     D6    …
China (中国)                4.1    0.0    3.7    5.9    3.1    0.0
culture (文化)              4.5    4.5    0      0      11.6   0
Japan (日本)                0      3.5    2.9    0      2.1    3.9
overseas students (留学生)  0      3.1    5.1    12.8   0      0
education (教育)            2.9    0      0      2.2    0      0
Beijing (北京)              7.1    0      0      0      4.4    3.8
…
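The slide says only that each entry is a "log-scaled tf.idf" weight and does not give the exact formula. As a minimal sketch, the code below assumes one common variant, w = (1 + ln tf) · ln(N / df), and applies it to a made-up toy collection; the function names, the formula variant, and the documents are illustrative assumptions, not taken from the lecture.

```python
import math
from collections import Counter

def document_frequencies(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # each document counts a term at most once
    return df

def tfidf_weights(doc, doc_freq, num_docs):
    """Return {term: weight} for one tokenized document.

    Assumed variant (not specified on the slide):
        w = (1 + ln tf) * ln(N / df)
    Terms absent from the document implicitly have weight 0.
    """
    weights = {}
    for term, tf in Counter(doc).items():
        df = doc_freq[term]
        if df > 0:
            weights[term] = (1.0 + math.log(tf)) * math.log(num_docs / df)
    return weights

# Hypothetical toy collection (already tokenized), for illustration only.
docs = [
    ["china", "culture", "beijing", "china"],
    ["japan", "culture"],
    ["china", "japan", "students"],
    ["students", "education", "china"],
]

doc_freq = document_frequencies(docs)
vectors = [tfidf_weights(doc, doc_freq, len(docs)) for doc in docs]

for i, vec in enumerate(vectors, start=1):
    print(f"D{i}", {term: round(w, 3) for term, w in sorted(vec.items())})
```

Each dictionary is a sparse view of one column of the term-document matrix: only the non-zero dimensions are stored, which matters when the space has tens of thousands of axes.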

Intuition
• Postulate: documents that are "close together" in the vector space talk about the same things.
[Figure: document vectors d1 to d5 in a space with axes t1, t2, t3; angles θ and φ between vectors]
• Use cases: query-by-example; free-text query as a vector

Cosine similarity
• The "closeness" of vectors d1 and d2 can be measured by the size of the angle between them
• Concretely, the cosine of that angle is used as the vector similarity
• Vectors are normalized by length (normalization):

$$|\vec{d}_j| = \sqrt{\sum_{i=1}^{M} w_{i,j}^{2}} = 1$$

$$\mathrm{sim}(d_j, d_k) = \frac{\vec{d}_j \cdot \vec{d}_k}{|\vec{d}_j|\,|\vec{d}_k|} = \frac{\sum_{i=1}^{M} w_{i,j}\, w_{i,k}}{\sqrt{\sum_{i=1}^{M} w_{i,j}^{2}}\;\sqrt{\sum_{i=1}^{M} w_{i,k}^{2}}}$$

[Figure: vectors d1 and d2 in a space with axes t1, t2, t3, separated by angle θ]
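The cosine formula can be tried directly on the six document columns from the "Documents as vectors" table. The sketch below is illustrative: the helper names (`normalize`, `cosine`, `rank_by_example`) are hypothetical, and the vectors are simply the table columns in the order China, culture, Japan, overseas students, education, Beijing.

```python
import math

# Columns D1..D6 of the term-document table; term order:
# China, culture, Japan, overseas students, education, Beijing
DOCS = {
    "D1": [4.1, 4.5, 0.0, 0.0, 2.9, 7.1],
    "D2": [0.0, 4.5, 3.5, 3.1, 0.0, 0.0],
    "D3": [3.7, 0.0, 2.9, 5.1, 0.0, 0.0],
    "D4": [5.9, 0.0, 0.0, 12.8, 2.2, 0.0],
    "D5": [3.1, 11.6, 2.1, 0.0, 0.0, 4.4],
    "D6": [0.0, 0.0, 3.9, 0.0, 0.0, 3.8],
}

def normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    length = math.sqrt(sum(w * w for w in vec))
    return list(vec) if length == 0 else [w / length for w in vec]

def cosine(u, v):
    """Cosine of the angle between u and v (dot product of unit vectors)."""
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))

def rank_by_example(example_id, docs):
    """Query-by-example: rank all other documents by similarity to one document."""
    query = docs[example_id]
    scores = [(cosine(query, vec), doc_id)
              for doc_id, vec in docs.items() if doc_id != example_id]
    return sorted(scores, reverse=True)

for score, doc_id in rank_by_example("D4", DOCS):
    print(f"{doc_id}: {score:.3f}")
```

With D4 as the example document, D3 ranks first because both put most of their weight on "China" and "overseas students", while D6 shares no terms with D4 and scores 0. A free-text query can be handled the same way by turning it into a vector over the same six axes.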

Latent Semantic Model

Vector Space Model: Pros
• Automatic selection of index terms
• Partial matching of queries and documents (dealing with the case where no document contains all search terms)
• Ranking according to similarity score (dealing with large result sets)
• Term weighting schemes (improves retrieval performance)
• Various extensions
  – Document clustering
  – Relevance feedback (modifying query vector)
• Geometric foundation
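The slide mentions relevance feedback only as "modifying query vector" and does not name a method. One widely used way to do this is the Rocchio update, which moves the query toward the centroid of documents judged relevant and away from the centroid of those judged non-relevant. The sketch below assumes that technique; the function name, the default weights (alpha, beta, gamma), and the toy vectors are illustrative assumptions, not taken from the lecture.

```python
def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return a modified query vector (Rocchio update, assumed here).

    q_new = alpha * q + beta * centroid(relevant) - gamma * centroid(non_relevant)
    Negative weights are clipped to 0, a common practical choice.
    """
    dims = len(query)

    def centroid(vectors):
        # Mean vector; an empty set contributes nothing to the update.
        if not vectors:
            return [0.0] * dims
        return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]

    rel_c = centroid(relevant)
    non_c = centroid(non_relevant)
    updated = [alpha * q + beta * r - gamma * n
               for q, r, n in zip(query, rel_c, non_c)]
    return [max(0.0, w) for w in updated]

# Hypothetical 3-term example: the user marks two documents relevant
# and one non-relevant after seeing the first result list.
query = [1.0, 0.0, 0.0]
relevant = [[0.0, 2.0, 0.0], [0.0, 4.0, 0.0]]
non_relevant = [[0.0, 0.0, 2.0]]

new_query = rocchio(query, relevant, non_relevant)
print(new_query)
```

The modified query gains weight on the second term, which the relevant documents share, even though the user never typed it; re-running the similarity ranking with the new vector is what turns the feedback into different results.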

I guess this page is about a blackberry...?
[Figure: page image in which the word "blackberry" is repeated many times]