北京航空航天大学:Graph Search - a New Paradigm for Social Computing

Graph Search: a New Paradigm for Social Computing Shuai ma 京航宫航天大学 BEIHANG UNIVERSITY
Shuai Ma Graph Search: a New Paradigm for Social Computing

The soil Food Web 6 Graphs are everywhere, and quite a few are huge graphs 山东省高遮公略图一 员普2 ▲圆 原照R 2
2 Graphs are everywhere, and quite a few are huge graphs!

Graph Search- Why Bother? Graph Searching A+(SOL+A arCI Google/ File systems Databases World wide web Social Networks File systems -1960s: very simple search functionalities Databases-mid 1960S: SQL language s·鱼 World wide Web-1990'S: keyword search engines e f8 in& Social networks-late 1990s q⊙a Blv Graphs have more expressive power, compared with RDB& XML 2. Relationships become important for search - Google Knowledge Graph Graph search is a new paradigm for social computing! 3
Graph Search - Why Bother? 3 • File systems - 1960’s: very simple search functionalities • Databases - mid 1960’s:SQL language • World Wide Web - 1990’s:keyword search engines • Social networks - late 1990’s: File systems Databases World Wide Web Graph search is a new paradigm for social computing! Social Networks 1. Graphs have more expressive power, compared with RDB & XML. 2. Relationships become important for search – Google Knowledge Graph

Interesting Coincidence! ■S|GMoD+VLDB+|CDE 2000200120022003200420052006200720082009201020112012 Social computing Web 2.0 DB people started working on graphs at around the same time!
Interesting Coincidence! 4 DB people started working on graphs at around the same time! 0 5 10 15 20 25 30 35 40 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 SIGMOD + VLDB + ICDE Social computing & Web 2.0

Outline Application scenarios What is graph search? Three types of graph search Problems and challenges · Related techniques Summary
Outline 5 • Application scenarios • What is graph search? • Three types of graph search • Problems and challenges • Related techniques • Summary

Application Scenarios
6 Application Scenarios

Application Scenarios Complex object identification Data quality Real-life data is often dirty 1%0-5% of business data contains errors Dirty costs us businesses 600 billion dollars each year gartner Wrong price data in retail databases alone costs US consumers $2.5 billion annually Data cleaning tools deliver an overall business value of more than 600 million GBP each year at BT Data cleaning FORRESTER Data repairing Record matching(aka object identification, entity resolution, data deduplication) Complex object identification Modeling complex objects as graphs
• Data quality – Real-life data is often dirty: 1%–5% of business data contains errors – Dirty costs us businesses 600 billion dollars each year – Wrong price data in retail databases alone costs US consumers $2.5 billion annually – Data cleaning tools deliver an overall business value of more than ‘‘600 million GBP’’ each year at BT. • Data cleaning – Data repairing – Record matching (aka. object identification, entity resolution, data deduplication) • Complex object identification – Modeling complex objects as graphs Application Scenarios 7 Complex object identification

Application Scenarios Software plagiarism detection [131 Traditional plagiarism detection tools may not be applicable for serious software plagiarism problems A new tool based on graph pattern matching Represent the source codes as program dependence graphs [141 se graph pattern matching to detect plagiarism int sum(int array [], int court) ant 1, sun; 2: declaration. int sum o for (i= O; i< count: 1++)[ sum= add(sum, array[i])i 6: assignment, i=0(0: declaration, int count retm int add(int ant 5: control i< count 1: declaration. retum a+ b: 8. increment i++ 7: assignment, sum=0 9: assignment, sum=addo 3: declaration. int array 4: retum retum sum 10: call-site. add( sum. armaylil 8
Application Scenarios 8 • Traditional plagiarism detection tools may not be applicable for serious software plagiarism problems. • A new tool based on graph pattern matching – Represent the source codes as program dependence graphs [14] . – Use graph pattern matching to detect plagiarism. Software plagiarism detection [13]

Application Scenarios ransport routing【16 T Graph search is a common practice in transportation networks, due to the wide application of Location-Based Services Exam ple: Mark, a driver in the U.S. who wants to go from Irvine to Riverside in california If Mark wants to reach Riverside by his car in the shortest time, the problem can be expressed as the shortest path problem. Then by using existing methods, we can get the shortest path from irvine, ca to Riverside, CA traveling along State Route 261 If Mark drives a truck delivering hazardous materials may not be allowed to cross over some bridges or railroad crossings this time we can use a pattern graph containing specific route constraints(such as regular expressions)to find the optimal transport routes
Application Scenarios 9 • Graph search is a common practice in transportation networks, due to the wide application of Location-Based Services. • Example: Mark, a driver in the U.S. who wants to go from Irvine to Riverside in California. – If Mark wants to reach Riverside by his car in the shortest time, the problem can be expressed as the shortest path problem. Then by using existing methods, we can get the shortest path from Irvine, CA to Riverside, CA traveling along State Route 261. Transport routing [16] – If Mark drives a truck delivering hazardous materials may not be allowed to cross over some bridges or railroad crossings. This time we can use a pattern graph containing specific route constraints (such as regular expressions) to find the optimal transport routes

Application Scenarios Recommender systems [131 Recommendations have found its usage in many emerging specific applications, such as social matching systems Graph search is a useful tool for recommendations a headhunter wants to find a biologist (Bio)to help a group of software G engineers(sEs) analyze genetic data DM1 To do this, (s)he uses an expertise HRI Bio1 recommendation network g. as Bi depicted in g, where All Al2 v a node denotes a person labeled SE1 B102 Bio3 with expertise, and SE2 DM2 an edge indicates recommendation AlI DM Alk DMK e.g., HR, recommends Bio,, and Al, recommends DM1 10
Application Scenarios 10 • Recommendations have found its usage in many emerging specific applications, such as social matching systems. • Graph search is a useful tool for recommendations. Recommender systems [13] – A headhunter wants to find a biologist (Bio) to help a group of software engineers (SEs) analyze genetic data. – To do this, (s)he uses an expertise recommendation network G, as depicted in G, where ✓ a node denotes a person labeled with expertise, and ✓ an edge indicates recommendation, e.g., HR1 recommends Bio1 , and AI1 recommends DM1
按次数下载不扣除下载券;
注册用户24小时内重复下载只扣除一次;
顺序:VIP每日次数-->可用次数-->下载券;
- 清华大学:《计算机网络》课程教学资源(PPT课件讲稿)Lecture 4 Routing.pptx
- Homomorphic Secret Sharing:Low-End HSS from OWF、HSS for Branching Programs from DDH、The HSS Construction.ppsx
- 四川大学:软件设计工具(PPT课件讲稿)Software design tool.ppt
- 《图像处理与计算机视觉 Image Processing and Computer Vision》课程教学资源(PPT课件讲稿)Chapter 02 Image processing and computer vision(Camera models and parameters).pptx
- 《数据结构》课程教学资源(PPT课件讲稿)第九章 排序.ppt
- 福建工程学院:《软件工程》课程教学资源(实验指导书).doc
- 香港中文大学:Adaboost for building robust classifiers(PPT讲稿).pptx
- 《软件测试》课程教学资源(PPT讲稿)集成测试.pptx
- 《大学计算机基础》课程教学资源(PPT课件讲稿)第三章 字处理软件 Word2003.ppt
- 《现代操作系统 Modern Operating Systems》课程教学资源(PPT课件讲稿,Third Edition)Chapter 10 Case Study 1 LINUX.ppt
- 《微机原理与接口技术》课程教学资源(PPT课件讲稿)第1章 微型计算机基础概论.ppt
- 电子工业出版社:《计算机网络》课程教学资源(第五版,PPT课件讲稿)第八章 因特网上的音频/视频服务.ppt
- PARALLELISM IN HASKELL(Kathleen Fisher).pptx
- 南京大学:《数据结构 Data Structures》课程教学资源(PPT课件讲稿)第九章 排序.ppt
- 厦门大学:《大数据技术原理与应用》课程教学资源(PPT课件讲稿,2017)第9章 Spark.ppt
- 中国科学技术大学:《嵌入式系统设计》课程教学资源(PPT课件讲稿)第2章 ARM微处理器概述与编程模型(王行甫).ppt
- 电子工业出版社:《计算机网络》课程教学资源(第五版,PPT课件讲稿)第二章 物理层.ppt
- 南京大学:可信软件(PPT讲稿)认识、度量与评估.ppt
- 《C语言程序设计》课程电子教案(PPT课件讲稿)第六章 函数.ppt
- “互联网+”与“+互联网”(PPT讲稿).pptx
- 西南民族大学:《软件需求分析与总体设计》课程教学资源(PPT课件讲稿)软件总体(概要)设计.ppt
- 东南大学:《数据结构》课程教学资源(PPT课件讲稿)第五章 树(主讲:方效林).ppt
- 中国科学技术大学:《网络信息安全 NETWORK SECURITY》课程教学资源(PPT课件讲稿)第十章 入侵检测系统(主讲:肖明军).ppt
- 中国科学技术大学:QuickPass系统的排队问题(PPT讲座,谢瑶).ppt
- 《工程计算软件》课程教学资源(PPT课件讲稿)第四章 Maple简介.ppt
- 山东大学:《微机原理及单片机接口技术》课程教学资源(PPT课件讲稿)第六章 中断(主讲:刘忠国).ppt
- 中国传媒大学(北京广播学院):《计算机网络》课程教学资源(PPT课件讲稿)第五章 网络层 The Network Layer.ppt
- Introduction to XML IR(PPT讲稿).ppt
- 《计算机系统》课程教学资源(PPT课件讲稿)第六章 设备管理 Devices Management.ppt
- 《Excel实用技术基础》课程教学资源(PPT课件讲稿)Excel 技术基础、数据管理.ppt
- 南京航空航天大学:《C++程序设计》课程教学资源(PPT课件)第1章 C++程序设计基础(主讲:陈哲).ppt
- 《计算机组成原理》课程教学资源(PPT课件讲稿)第6章 总线结构.ppt
- 四川大学:Object-Oriented Design and Programming(Java,PPT课件)Advanced Class Design.ppt
- 香港科技大学:Latent Tree Models Part III:Learning Algorithms.pptx
- 《多媒体教学软件设计》课程教学资源(PPT课件讲稿)第3章 多媒体教学软件开发平台(Authorware).ppt
- 河南中医药大学(河南中医学院):《网络技术实训》课程教学资源(PPT课件讲稿)第9讲 通过VPN访问企业网内部服务器设计讨论.pptx
- 四川大学:《操作系统 Operating System》课程教学资源(PPT课件讲稿)Chapter 2 Operating System Overview.ppt
- 《数据结构 Data Structure》课程教学资源(PPT课件讲稿)第三章 栈和队列.ppt
- IS6000 – Seminar 8 Research Methods – Case Study – Action Research.pptx
- 《编译原理》课程教学资源(PPT课件讲稿)上下文无关文法——自顶向下分析.pptx