Peking University: "Search Engines" course teaching resources (PPT lecture notes), Evaluating Search Engines (Search Engines: Information Retrieval in Practice)
Evaluating Search Engines in chapter 8 of the book Search Engines Information Retrieval in Practice http://net.pku.edu.cn/~course/cs410/2011/ Hongfei Yan School of EECS, Peking University 3/28/2011 Refer to the book’s slides
08: Evaluating Search Engines 8.1 Why Evaluate 8.2 The Evaluation Corpus 8.3 Logging (+) 8.4 Effectiveness Metrics (+) 8.5 Efficiency Metrics 8.6 Training, Testing, and Statistics (+) 8.7 The Bottom Line (skip) 3/N
Search engine design and the core information retrieval issues Relevance -Effective ranking Evaluation -Testing and measuring Information needs -User interaction Performance -Efficient search and indexing Incorporating new data -Coverage and freshness Scalability -Growing with data and users Adaptability -Tuning for applications Specific problems -E.g., spam 4/N
Evaluation • Evaluation is key to building effective and efficient search engines – measurement usually carried out in controlled laboratory experiments – online testing can also be done • Effectiveness, efficiency and cost are related – e.g., if we want a particular level of effectiveness and efficiency, this will determine the cost of the system configuration – efficiency and cost targets may impact effectiveness 5/N
08: Evaluating Search Engines 8.1 Why Evaluate 8.2 The Evaluation Corpus 8.3 Logging 8.4 Effectiveness Metrics 8.5 Efficiency Metrics 8.6 Training, Testing, and Statistics 6/N
Evaluation Corpus • Test collections consisting of documents, queries, and relevance judgments, e.g., – CACM: Titles and abstracts from the Communications of the ACM from 1958-1979. Queries and relevance judgments generated by computer scientists. – AP: Associated Press newswire documents from 1988-1990 (from TREC disks 1-3). Queries are the title fields from TREC topics 51-150. Topics and relevance judgments generated by government information analysts. – GOV2: Web pages crawled from websites in the .gov domain during early 2004. Queries are the title fields from TREC topics 701-850. Topics and relevance judgments generated by government analysts. 7/N
Test Collections

Collection   Number of documents   Size     Average number of words/doc
CACM         3,204                 2.2 Mb   64
AP           242,918               0.7 Gb   474
GOV2         25,205,179            426 Gb   1073

Collection   Number of queries   Average number of words/query   Average number of relevant docs/query
CACM         [missing]           13.0                            16
AP           100                 4.3                             [missing]
GOV2         150                 3.1                             180

8/N
TREC Topic Example
Number: 794
Title: pet therapy
Description: How are pets or animals used in therapy for humans and what are the benefits?
Narrative: Relevant documents must include details of how pet- or animal-assisted therapy is or has been used. Relevant details include information about pet therapy programs, descriptions of the circumstances in which pet therapy is used, the benefits of this type of therapy, the degree of success of this therapy, and any laws or regulations governing it.
9/N
Relevance Judgments • Obtaining relevance judgments is an expensive, time-consuming process – who does it? – what are the instructions? – what is the level of agreement? • TREC judgments – depend on task being evaluated – generally binary – agreement good because of “narrative” 10/N
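Binary relevance judgments are often stored one judgment per line. The sketch below assumes the four-column layout commonly used for TREC qrels files (topic, iteration, document id, relevance) and loads it into per-topic sets. The function name `parse_qrels` and the document ids in `sample` are made up for illustration and do not come from any real collection.

```python
from collections import defaultdict


def parse_qrels(lines):
    """Parse 'topic iteration doc_id relevance' lines.

    Returns two dicts keyed by topic id:
      relevant[topic] -> set of doc ids judged relevant (relevance > 0)
      judged[topic]   -> set of all doc ids that received a judgment
    """
    relevant = defaultdict(set)
    judged = defaultdict(set)
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        topic, _iteration, doc_id, rel = parts
        judged[topic].add(doc_id)
        if int(rel) > 0:  # binary judgments: anything above 0 counts as relevant
            relevant[topic].add(doc_id)
    return relevant, judged


# Hypothetical judgments; the doc ids are placeholders.
sample = [
    "794 0 DOC-A 1",
    "794 0 DOC-B 0",
    "794 0 DOC-C 1",
    "795 0 DOC-A 0",
]
relevant, judged = parse_qrels(sample)
```

Keeping the judged set separate from the relevant set matters because a document absent from both was never judged at all, which is different from being judged non-relevant.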
Pooling • Exhaustive judgments for all documents in a collection are not practical • Pooling technique is used in TREC – top k results (for TREC, k varied between 50 and 200) from the rankings obtained by different search engines (or retrieval algorithms) are merged into a pool – duplicates are removed – documents are presented in some random order to the relevance judges • Produces a large number of relevance judgments for each query, although still incomplete 11/N
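The pooling steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the procedure TREC actually runs: the function name `build_pool`, the `seed` parameter, and the toy rankings are assumptions made for the example.

```python
import random


def build_pool(rankings, k, seed=None):
    """Merge the top-k documents of each ranking into one judging pool.

    rankings: list of ranked lists of doc ids, one list per system
    k:        pool depth (the slide cites values between 50 and 200 for TREC)
    seed:     optional seed so the random presentation order is repeatable
    """
    pool = set()
    for ranking in rankings:
        pool.update(ranking[:k])  # using a set removes duplicates across systems
    ordered = sorted(pool)  # fixed starting order so a seed gives a repeatable shuffle
    random.Random(seed).shuffle(ordered)  # random order hides each document's original rank
    return ordered


# Toy rankings from three hypothetical systems for a single query.
rankings = [
    ["d1", "d2", "d3", "d4"],
    ["d3", "d5", "d1", "d6"],
    ["d7", "d2", "d8", "d9"],
]
pool = build_pool(rankings, k=2, seed=0)
```

With k=2 the pool holds five distinct documents (d1, d2, d3, d5, d7). Documents ranked below depth k by every system, such as d4 or d9, are never judged, which is why the resulting judgments remain incomplete.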