Big Data course teaching resources (reference material): Learning to Hash for Big Data

Learning to Hash for Big Data
Wu-Jun Li (李武军)
LAMDA Group, Department of Computer Science and Technology, Nanjing University
State Key Laboratory for Novel Software Technology
liwujun@nju.edu.cn
May 09, 2015
Li (http://cs.nju.edu.cn/lwj) Learning to Hash LAMDA, CS, NJU

Outline
1 Introduction
  - Problem Definition
  - Existing Methods
2 Scalable Graph Hashing with Feature Transformation
  - Motivation
  - Model and Learning
  - Experiment
3 Conclusion
4 Reference


Introduction: Problem Definition
Nearest Neighbor Search (Retrieval)
- Given a query point $q$, return the points closest (most similar) to $q$ in the database (e.g., images).
- This problem underlies many machine learning, data mining, and information retrieval tasks.
Challenges in Big Data applications:
- Curse of dimensionality
- Storage cost
- Query speed

Introduction: Problem Definition
Similarity Preserving Hashing
- h(Statue of Liberty) = 10001010
- h(Napoleon) = 01100001
- h(Napoleon) = 01100101 (one flipped bit)
The codes of the Statue of Liberty and Napoleon should be very different; the codes of the two Napoleon images should be similar.

Introduction: Problem Definition
Reduce Dimensionality and Storage Cost
(Figure: Gist vectors are reduced to binary codes. For 10 million images, storage drops from 20 GB to 160 MB.)
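The storage figures on this slide can be reproduced with a short calculation. The sketch below assumes a 512-dimensional Gist descriptor stored as 4-byte floats and a 128-bit binary code per image; the slide states neither value, so both are assumptions chosen because they match the 20 GB and 160 MB totals.

```python
# Back-of-the-envelope storage estimate for 10 million images.
# GIST_DIM and CODE_BITS are assumptions, not values given on the slide.
N_IMAGES = 10_000_000
GIST_DIM = 512          # assumed descriptor dimensionality
BYTES_PER_FLOAT = 4     # single-precision float
CODE_BITS = 128         # assumed hash code length


def storage_bytes(n_items: int, bytes_per_item: int) -> int:
    """Total bytes needed to store n_items fixed-size records."""
    return n_items * bytes_per_item


gist_bytes = storage_bytes(N_IMAGES, GIST_DIM * BYTES_PER_FLOAT)
code_bytes = storage_bytes(N_IMAGES, CODE_BITS // 8)

print(f"Gist vectors: {gist_bytes / 1e9:.2f} GB")   # 20.48 GB
print(f"Binary codes: {code_bytes / 1e6:.0f} MB")   # 160 MB
```

Under these assumptions each image shrinks from 2048 bytes to 16 bytes, a 128-fold reduction.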

Introduction: Problem Definition
Querying
Hamming distance:
- $\|01101110, 00101101\|_H = 3$
- $\|11011, 01011\|_H = 1$
(Figure: a query image compared against the image dataset.)
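The two Hamming distances above can be checked with a few lines of code. This is a minimal sketch; the function name `hamming` is chosen here for illustration. It XORs the two codes and counts the set bits.

```python
def hamming(a: str, b: str) -> int:
    """Hamming distance between two equal-length bit strings."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    # XOR leaves a 1 exactly where the two codes differ.
    return bin(int(a, 2) ^ int(b, 2)).count("1")


print(hamming("01101110", "00101101"))  # 3
print(hamming("11011", "01011"))        # 1
```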

Introduction: Problem Definition
Fast Query Speed
- With a hashing-based index, search time can be constant or sub-linear in the database size.
- Exhaustive search is also acceptable, because computing the distance between short binary codes is cheap.
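Both query strategies can be sketched in a few lines. The example below is illustrative only: the helper names (`build_index`, `probe_radius1`, `linear_scan`) and the tiny 8-bit database are assumptions, and probing is limited to Hamming radius 1. The hash table lookup touches only a handful of buckets regardless of database size, while the linear scan compares the query against every code using XOR and a bit count.

```python
from collections import defaultdict


def build_index(codes):
    """Map each integer hash code to the ids of the items that share it."""
    index = defaultdict(list)
    for item_id, code in enumerate(codes):
        index[code].append(item_id)
    return index


def probe_radius1(index, query, n_bits):
    """Ids whose code equals the query or differs from it in one bit."""
    candidates = list(index.get(query, []))
    for bit in range(n_bits):
        candidates.extend(index.get(query ^ (1 << bit), []))
    return sorted(candidates)


def linear_scan(codes, query, k):
    """Exhaustive search: ids of the k codes closest in Hamming distance."""
    def dist(code):
        return bin(code ^ query).count("1")
    return sorted(range(len(codes)), key=lambda i: (dist(codes[i]), i))[:k]


codes = [0b10001010, 0b01100001, 0b01100101, 0b11110000]
index = build_index(codes)
query = 0b01100001

print(probe_radius1(index, query, n_bits=8))  # [1, 2]
print(linear_scan(codes, query, k=2))         # [1, 2]
```

Probing a larger radius multiplies the number of buckets visited, which is one reason a plain linear scan over compact codes often remains competitive.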

Introduction: Problem Definition
Hash Function Learning
Easy or hard?
- Hard: it is a discrete optimization problem.
- Easy by approximation: hash function learning is split into two stages.
  - Projection stage (dimensionality reduction): the data are projected with real-valued projection functions. Given a point $x$, each projected dimension $i$ is associated with a real-valued projection function $f_i(x)$ (e.g., $f_i(x) = w_i^T x$).
  - Quantization stage: turn the real values into binary codes.
However, there are essential differences between metric learning (dimensionality reduction) and learning to hash. Simply adapting traditional metric learning is not enough.
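The two-stage approximation can be sketched as follows. This is an illustration only, not the method proposed in the talk: PCA is assumed as the projection (one common choice), the data are centered before projecting, and quantization is a plain threshold at zero. The function names and the synthetic data are hypothetical.

```python
import numpy as np


def learn_projection(X, n_bits):
    """Projection stage: learn W whose columns are the top principal directions.

    PCA is an assumption made for this sketch; any real-valued projection
    f_i(x) = w_i^T x could be substituted.
    """
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_bits].T  # W has shape (d, n_bits)


def hash_codes(X, mean, W):
    """Apply both stages and return an (n, n_bits) array of 0/1 codes."""
    projected = (X - mean) @ W                 # projection stage (real-valued)
    return (projected >= 0).astype(np.uint8)   # quantization stage (binary)


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 16))   # 100 synthetic points in 16 dimensions
mean, W = learn_projection(X, n_bits=8)
B = hash_codes(X, mean, W)
print(B.shape)  # (100, 8)
```

Thresholding at zero discards how far each projected value lies from the boundary, which is one way the quantization stage can lose similarity information that a pure dimensionality-reduction method would keep.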

Introduction: Existing Methods
Data-Independent Methods
The hash function family is defined independently of the training dataset:
- Locality-sensitive hashing (LSH) (Gionis et al., 1999; Andoni and Indyk, 2008) and its extensions (Datar et al., 2004; Kulis and Grauman, 2009; Kulis et al., 2009).
- Shift-invariant kernel hashing (SIKH) (Raginsky and Lazebnik, 2009).
Hash function: random projections.
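A data-independent hash function based on random projections can be sketched as below. The sketch uses the sign-of-random-projection variant of LSH, chosen here for illustration; it is not necessarily the exact scheme of any of the cited papers. The function names, dimensions, and test vectors are assumptions.

```python
import numpy as np


def make_lsh(dim, n_bits, seed=0):
    """Draw random Gaussian projection directions; no training data is used."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((dim, n_bits))


def lsh_hash(X, W):
    """One bit per direction: 1 if the projection is non-negative, else 0."""
    return (X @ W >= 0).astype(np.uint8)


W = make_lsh(dim=32, n_bits=64)

rng = np.random.default_rng(1)
x = rng.standard_normal(32)
near = x + 0.05 * rng.standard_normal(32)   # small perturbation of x
far = rng.standard_normal(32)               # unrelated point

codes = lsh_hash(np.stack([x, near, far]), W)
d_near = int((codes[0] != codes[1]).sum())
d_far = int((codes[0] != codes[2]).sum())
print(d_near, d_far)  # the nearby point should differ in far fewer bits
```

For this variant, the probability that two points disagree on a given bit grows with the angle between them, so the Hamming distance between codes tracks angular distance between the original vectors.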