《大数据 Big Data》课程教学资源(参考文献)Learning to Hash for Big Data Retrieval and Mining(南京大学:李武军)

Learning to Hash with its Application to Big Data Retrieval and Mining 李武军 Department of Computer Science and Engineering Shanghai Jiao Tong University Shanghai,China Joint work with孔维吴,张东擎,过敏意 Dec21,2013 日卡三4元,互Q0 i (http://www.cs.sjtu.edu.cn/-livujun Learning to Hash CSE,SJTU 1/49
Learning to Hash with its Application to Big Data Retrieval and Mining o… Department of Computer Science and Engineering Shanghai Jiao Tong University Shanghai, China Joint work with öëh, ‹¿ô, LØø Dec 21, 2013 Li (http://www.cs.sjtu.edu.cn/~liwujun) Learning to Hash CSE, SJTU 1 / 49

Outline Introduction Problem Definition ●Existing Methods ② Isotropic Hashing Model ●Learning o Experiment Multiple-Bit Quantization Double-Bit Quantization Manhattan Quantization ④ Conclusion Reference +日卡+得三元互)Q0¥ Li (http://www.cs.sjtu.edu.cn/-livujun Learning to Hash CSE,SJTU 2 /49
Outline 1 Introduction Problem Definition Existing Methods 2 Isotropic Hashing Model Learning Experiment 3 Multiple-Bit Quantization Double-Bit Quantization Manhattan Quantization 4 Conclusion 5 Reference Li (http://www.cs.sjtu.edu.cn/~liwujun) Learning to Hash CSE, SJTU 2 / 49

Introduction Outline Introduction Problem Definition ●Existing Methods Isotropic Hashing o Model e Learning o Experiment Multiple-Bit Quantization Double-Bit Quantization o Manhattan Quantization Conclusion Reference +日卡+得三元互)Q0¥ Li (http://www.cs.sjtu.edu.cn/-livujun Learning to Hash CSE,SJTU 3 /49
Introduction Outline 1 Introduction Problem Definition Existing Methods 2 Isotropic Hashing Model Learning Experiment 3 Multiple-Bit Quantization Double-Bit Quantization Manhattan Quantization 4 Conclusion 5 Reference Li (http://www.cs.sjtu.edu.cn/~liwujun) Learning to Hash CSE, SJTU 3 / 49

Introduction Problem Definition Nearest Neighbor Search(Retrieval) oGiven a query point g,return the points closest(similar)to g in the database(e.g.images). o Underlying many machine learning,data mining,information retrieval problems Challenge in Big Data Applications: o Curse of dimensionality Storage cost ●Query speed 日卡三4元,互Q0 Li (http://www.cs.sjtu.edu.cn/-livujun Learning to Hash CSE,SJTU 4 /49
Introduction Problem Definition Nearest Neighbor Search (Retrieval) Given a query point q, return the points closest (similar) to q in the database(e.g. images). Underlying many machine learning, data mining, information retrieval problems Challenge in Big Data Applications: Curse of dimensionality Storage cost Query speed Li (http://www.cs.sjtu.edu.cn/~liwujun) Learning to Hash CSE, SJTU 4 / 49

Introduction Problem Definition Similarity Preserving Hashing h(Statue of Liberty)= h(Napoleon)= h (Napoleon)= 10001010 01100001 011001Q1 flipped bit Should be very different Should be similar 0Q0 f(http:/aw.cs:sjtu.edu.cn/-1iu时jun】 Learning to Hash CSE,SJTU 5/49
Introduction Problem Definition Similarity Preserving Hashing Li (http://www.cs.sjtu.edu.cn/~liwujun) Learning to Hash CSE, SJTU 5 / 49

Introduction Problem Definition Reduce Dimensionality and Storage Cost Gist vector Binary reduction 1 million images 2 GB 16 MB 口卡+得二4元互)Q0 Li (http://www.cs.sjtu.edu.cn/-livujun Learning to Hash CSE.SJTU 6/49
Introduction Problem Definition Reduce Dimensionality and Storage Cost Li (http://www.cs.sjtu.edu.cn/~liwujun) Learning to Hash CSE, SJTU 6 / 49

Introduction Problem Definition Querying Hamming distance: 。101101110,00101101la=3 。l11011,01011lg=1 Query Image Dataset ,30Q0 Li (http://www.cs.sjtu.edu.cn/-livujun Learning to Hash CsE,S」TU7/49
Introduction Problem Definition Querying Hamming distance: ||01101110, 00101101||H = 3 ||11011, 01011||H = 1 Li (http://www.cs.sjtu.edu.cn/~liwujun) Learning to Hash CSE, SJTU 7 / 49

Introduction Problem Definition Querying 是 口卡得三4元互Q0 Li (http://www.cs.sjtu.edu.cn/-livujun Learning to Hash CSE,SJTU 8/49
Introduction Problem Definition Querying Li (http://www.cs.sjtu.edu.cn/~liwujun) Learning to Hash CSE, SJTU 8 / 49

Introduction Problem Definition Querying 口卡得三4元互Q0 Li (http://www.cs.sjtu.edu.cn/-livujun Learning to Hash CSE,SJTU 9/49
Introduction Problem Definition Querying Li (http://www.cs.sjtu.edu.cn/~liwujun) Learning to Hash CSE, SJTU 9 / 49

Introduction Problem Definition Fast Query Speed o By using hashing scheme,we can achieve constant or sub-linear search time complexity. Exhaustive search is also acceptable because the distance calculation cost is cheap now. 日卡三4元,互Q0 Li (http://www.cs.sjtu.edu.cn/-livujun Learning to Hash CSE.SJTU 10/49
Introduction Problem Definition Fast Query Speed By using hashing scheme, we can achieve constant or sub-linear search time complexity. Exhaustive search is also acceptable because the distance calculation cost is cheap now. Li (http://www.cs.sjtu.edu.cn/~liwujun) Learning to Hash CSE, SJTU 10 / 49
按次数下载不扣除下载券;
注册用户24小时内重复下载只扣除一次;
顺序:VIP每日次数-->可用次数-->下载券;
- 《大数据 Big Data》课程教学资源(参考文献)Learning to Hash for Big Data Retrieval and Mining(南京大学:李武军).pdf
- 南京大学:《形式语言与自动机 Formal Languages and Automata》课程教学资源(PPT课件讲稿)Decidability, Complexity(P, NP, NPC and related).pptx
- 南京大学:《形式语言与自动机 Formal Languages and Automata》课程教学资源(PPT课件讲稿)Timed Automata.ppt
- 南京大学:《形式语言与自动机 Formal Languages and Automata》课程教学资源(PPT课件讲稿)Petri Net.pptx
- 南京大学:《形式语言与自动机 Formal Languages and Automata》课程教学资源(PPT课件讲稿)Transition System.pptx
- 南京大学:《形式语言与自动机 Formal Languages and Automata》课程教学资源(PPT课件讲稿)Turing Machine.pptx
- 南京大学:《形式语言与自动机 Formal Languages and Automata》课程教学资源(PPT课件讲稿)Properties of CFL(The Pumping Lemma for CFL’s).pptx
- 南京大学:《形式语言与自动机 Formal Languages and Automata》课程教学资源(PPT课件讲稿)Pushdown Automata.pptx
- 南京大学:《形式语言与自动机 Formal Languages and Automata》课程教学资源(PPT课件讲稿)Regular Expression.pptx
- 南京大学:《形式语言与自动机 Formal Languages and Automata》课程教学资源(PPT课件讲稿)Context Free Grammar.pptx
- 南京大学:《形式语言与自动机 Formal Languages and Automata》课程教学资源(PPT课件讲稿)Finite Automata.pptx
- 南京大学:《软件安全 Software Security》课程教学资源(PPT课件讲稿)Byzantine Generals Problem.ppt
- 南京大学:《软件安全 Software Security》课程教学资源(PPT课件讲稿)Use-after-free.pptx
- 南京大学:《软件安全 Software Security》课程教学资源(PPT课件讲稿)Taint Analysis.pptx
- 南京大学:《软件安全 Software Security》课程教学资源(PPT课件讲稿)Program Analysis - Data Flow Analysis.pptx
- 南京大学:《软件安全 Software Security》课程教学资源(PPT课件讲稿)Control Flow - Representation, Extraction and Applications.pptx
- 南京大学:《软件安全 Software Security》课程教学资源(PPT课件讲稿)Return-Orinted Programming(ROP Attack).ppt
- 南京大学:《软件安全 Software Security》课程教学资源(PPT课件讲稿)Format String Attacks.pptx
- 南京大学:《软件安全 Software Security》课程教学资源(PPT课件讲稿)Control Flow Integrity.pptx
- 南京大学:《软件安全 Software Security》课程教学资源(PPT课件讲稿)Redundant dynamic Canary.ppt
- 《大数据 Big Data》课程教学资源(参考文献)大数据机器学习 Big Data Machine Learning.pdf
- 《大数据 Big Data》课程教学资源(参考文献)Learning to Hash for Big Data.pdf
- 《大数据 Big Data》课程教学资源(参考文献)Learning to Hash for Big Data.pdf
- 《大数据 Big Data》课程教学资源(参考文献)大数据机器学习 Big Data Machine Learning.pdf
- 《大数据 Big Data》课程教学资源(参考文献)Learning to Hash for Big Data - A Tutorial.pdf
- 《大数据 Big Data》课程教学资源(参考文献)Parallel and Distributed Stochastic Learning - Towards Scalable Learning for Big Data Intelligence(南京大学:李武军).pdf
- 《人工智能、机器学习与大数据》课程教学资源(参考文献)Coherence functions for multicategory margin-based classification methods.pdf
- 《人工智能、机器学习与大数据》课程教学资源(参考文献)Latent Wishart processes for relational kernel learning.pdf
- 《人工智能、机器学习与大数据》课程教学资源(参考文献)Latent Wishart processes for relational kernel learning(讲稿).pdf
- 《人工智能、机器学习与大数据》课程教学资源(参考文献)agiCoFi - Tag informed collaborative filtering.pdf
- 《人工智能、机器学习与大数据》课程教学资源(参考文献)Localized content-based image retrieval through evidence region identification.pdf
- 《人工智能、机器学习与大数据》课程教学资源(参考文献)Relation regularized matrix factorization.pdf
- 《人工智能、机器学习与大数据》课程教学资源(参考文献)Relation regularized matrix factorization(讲稿).pdf
- 《人工智能、机器学习与大数据》课程教学资源(参考文献)Probabilistic relational PCA.pdf
- 《人工智能、机器学习与大数据》课程教学资源(参考文献)Gaussian process latent random field.pdf
- 《人工智能、机器学习与大数据》课程教学资源(参考文献)Multiple-instance learning via disambiguation.pdf
- 《人工智能、机器学习与大数据》课程教学资源(参考文献)Generalized latent factor models for social network analysis.pdf
- 《人工智能、机器学习与大数据》课程教学资源(参考文献)Social relations model for collaborative filtering.pdf
- 《人工智能、机器学习与大数据》课程教学资源(参考文献)Sparse probabilistic relational projection.pdf
- 《人工智能、机器学习与大数据》课程教学资源(参考文献)Emoticon smoothed language models for Twitter sentiment analysis.pdf