Advanced Artificial Intelligence (《高级人工智能》) Teaching Resources (PPT Lecture Notes) Lecture 7: Recurrent Neural Network

Advanced Artificial Intelligence Lecture 7: Recurrent Neural Network

Outline
▪ Recurrent Neural Network
▪ Vanilla RNNs
▪ Some RNN Variants
▪ Backpropagation Through Time
▪ Gradient Vanishing / Exploding
▪ Long Short-Term Memory
▪ LSTM Neuron
▪ Multiple-layer LSTM
▪ Backpropagation Through Time in LSTM
▪ Time-Series Prediction

Vanilla RNNs ▪ Sequential Data
So far we have assumed that the data points (x, y) in a dataset are i.i.d. (independent and identically distributed). This assumption does not hold in many applications. In sequential data, data points come in order and successive points may be dependent, e.g.,
▪ Letters in a word
▪ Words in a sentence/document
▪ Phonemes in a spoken word utterance
▪ Page clicks in a Web session
▪ Frames in a video, etc.

Vanilla RNNs ▪ Sequence Modeling
How to model sequential data? Recurrent neural networks (vanilla RNNs):
▪ The output c(t) depends on x(1), ···, x(t)
▪ The output depends on the hidden activations a(L,t), computed recursively (bias terms omitted):
  a(k,t) = act( z(k,t) ) = act( U(k) a(k,t-1) + W(k) a(k-1,t) ),  with a(0,t) = x(t)
▪ a(·,t) summarizes x(t), ···, x(1); earlier points are less important
Source of slide: https://www.youtube.com/watch?v=2btuy_-Fw3c&list=PLlPcwHqLqJDkVO0zHMqswX1jA9Xw7OSOK

Vanilla RNNs ▪ Sequence Modeling
a(k,t) = act( z(k,t) ) = act( U(k) a(k,t-1) + W(k) a(k-1,t) )
▪ Weights are shared across time instances (W(k))
▪ Assumes that the “transition functions” are time-invariant (U(k))
▪ Our goal is to learn U(k) and W(k) for k = 1, ···, L
Source of slide: https://www.youtube.com/watch?v=2btuy_-Fw3c&list=PLlPcwHqLqJDkVO0zHMqswX1jA9Xw7OSOK
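To make the recurrence concrete, below is a minimal NumPy sketch of the forward pass of a single-hidden-layer vanilla RNN (L = 1). The tanh activation, the output matrix V, and all dimensions are illustrative assumptions rather than details from the slides; biases are omitted as on the slides.

```python
import numpy as np

def rnn_forward(xs, U, W, V):
    """Forward pass of a one-hidden-layer vanilla RNN (biases omitted).

    xs : list of input vectors x(1), ..., x(T)
    U  : recurrent weights, shared across all time steps
    W  : input-to-hidden weights, shared across all time steps
    V  : hidden-to-output weights (an assumption; not named on the slides)
    """
    a = np.zeros(U.shape[0])      # a(1,0) = 0: no history before the first step
    hiddens, outputs = [], []
    for x in xs:                  # the same U and W are reused at every step
        a = np.tanh(U @ a + W @ x)    # a(1,t) = act(U a(1,t-1) + W x(t))
        c = V @ a                     # output c(t) depends on the hidden state
        hiddens.append(a)
        outputs.append(c)
    return hiddens, outputs

# Toy usage: a sequence of five random 3-dimensional inputs
rng = np.random.default_rng(0)
d_in, d_h, d_out, T = 3, 4, 2, 5
xs = [rng.normal(size=d_in) for _ in range(T)]
U = rng.normal(scale=0.1, size=(d_h, d_h))
W = rng.normal(scale=0.1, size=(d_h, d_in))
V = rng.normal(scale=0.1, size=(d_out, d_h))
hiddens, outputs = rnn_forward(xs, U, W, V)
print(len(outputs), outputs[-1].shape)   # 5 (2,)
```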

Vanilla RNNs ▪ RNNs have Memory
The computational graph of an RNN can be folded in time; black squares denote memory access.
[Figure: the folded RNN graph with memory (delay) edges, and its unfolding over time steps t-1, t, t+1 with inputs x(t-1), x(t), x(t+1) and outputs C(t-1), C(t), C(t+1).]
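Unfolding can be read literally: the same cell, with the same U and W, is copied once per time step, and each copy reads the previous hidden state from memory. A minimal sketch under the same illustrative assumptions as the previous example:

```python
import numpy as np

def cell(a_prev, x, U, W):
    """One RNN cell: read the previous state from memory, write the new one."""
    return np.tanh(U @ a_prev + W @ x)

rng = np.random.default_rng(1)
U = rng.normal(scale=0.1, size=(4, 4))
W = rng.normal(scale=0.1, size=(4, 3))
x1, x2, x3 = (rng.normal(size=3) for _ in range(3))

# Unfolding the graph for a length-3 sequence is three applications
# of the *same* cell with the *same* parameters:
a0 = np.zeros(4)          # initial memory content
a1 = cell(a0, x1, U, W)   # step t-1
a2 = cell(a1, x2, U, W)   # step t:   reads a1 from memory
a3 = cell(a2, x3, U, W)   # step t+1: reads a2 from memory
```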

Vanilla RNNs ▪ Example Application ▪ Slot-Filling (Spoken Language Understanding)
User utterance to a ticket booking system: "I would like to arrive Shenyang on November 2nd."
Slots to be filled: Destination = Shenyang; Time of arrival = November 2nd
Source of slide: http://speech.ee.ntu.edu.tw/~tlkagk/courses_ML16.html

Vanilla RNNs ▪ Example Application ▪ Slot-Filling (Spoken Language Understanding)
Can slot filling be solved by a feedforward network? Input: a word, e.g., "Shenyang" (each word is represented as a vector).
[Figure: a feedforward network taking the vector of "Shenyang" as input (x1, x2) and producing outputs (y1, y2).]
Source of slide: http://speech.ee.ntu.edu.tw/~tlkagk/courses_ML16.html
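To make the feedforward option concrete, here is a minimal sketch of a per-word slot classifier. The slot label set, the tanh hidden layer, the softmax output, and all names (SLOTS, W1, W2, classify_word) are illustrative assumptions, not details from the slide.

```python
import numpy as np

SLOTS = ["destination", "time_of_arrival", "other"]   # hypothetical label set

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def classify_word(x, W1, W2):
    """Feedforward network mapping one word vector to slot probabilities."""
    h = np.tanh(W1 @ x)        # hidden layer
    return softmax(W2 @ h)     # probability of each slot label

# Toy usage with random weights and a 5-dimensional word vector
rng = np.random.default_rng(2)
W1 = rng.normal(scale=0.1, size=(8, 5))
W2 = rng.normal(scale=0.1, size=(len(SLOTS), 8))
x_shenyang = np.array([0, 0, 1, 0, 0], dtype=float)   # some word vector
print(dict(zip(SLOTS, classify_word(x_shenyang, W1, W2))))
```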

Vanilla RNNs ▪ Example Application ▪ 1-of-N Encoding
How to represent each word as a vector? 1-of-N encoding: the vector has the size of the lexicon; the dimension for the word is 1, and all others are 0.
lexicon = {apple, bag, cat, dog, elephant}
apple    = [1 0 0 0 0]
bag      = [0 1 0 0 0]
cat      = [0 0 1 0 0]
dog      = [0 0 0 1 0]
elephant = [0 0 0 0 1]
Source of slide: http://speech.ee.ntu.edu.tw/~tlkagk/courses_ML16.html
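A minimal sketch of 1-of-N encoding with the five-word lexicon from the slide (the function name and the use of NumPy are illustrative):

```python
import numpy as np

LEXICON = ["apple", "bag", "cat", "dog", "elephant"]

def one_of_n(word, lexicon=LEXICON):
    """Return the 1-of-N (one-hot) vector of a word in the lexicon."""
    vec = np.zeros(len(lexicon))          # the vector is lexicon size
    vec[lexicon.index(word)] = 1.0        # the word's dimension is 1, others 0
    return vec

print(one_of_n("cat"))   # [0. 0. 1. 0. 0.]
```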

Vanilla RNNs ▪ Example Application ▪ Beyond 1-of-N Encoding
Two ways to handle words outside the lexicon:
▪ Add a dimension for "Other": the lexicon becomes apple, bag, cat, dog, elephant, "other"; unknown words such as w = "Gandalf" or w = "Sauron" map to the "other" dimension, while w = "apple" keeps its own dimension.
▪ Word hashing: represent a word by its letter trigrams in a 26 × 26 × 26 dimensional vector (a-a-a, a-a-b, ···); for w = "apple" the dimensions a-p-p, p-p-l, and p-l-e are set to 1.
Source of slide: http://speech.ee.ntu.edu.tw/~tlkagk/courses_ML16.html
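A minimal sketch of letter-trigram word hashing as described above; enumerating all 26 × 26 × 26 trigrams and the function name are illustrative choices (word-boundary symbols, which some word-hashing schemes add, are left out because the slide does not mention them):

```python
import numpy as np
from itertools import product
from string import ascii_lowercase

# One dimension per possible letter trigram: 26 * 26 * 26 in total
TRIGRAMS = {"".join(t): i for i, t in enumerate(product(ascii_lowercase, repeat=3))}

def word_hash(word):
    """Return the 26*26*26-dimensional letter-trigram vector of a word."""
    vec = np.zeros(len(TRIGRAMS))
    for i in range(len(word) - 2):
        vec[TRIGRAMS[word[i:i + 3].lower()]] = 1.0   # "apple" -> a-p-p, p-p-l, p-l-e
    return vec

v = word_hash("apple")
print(int(v.sum()))                             # 3 trigrams set for "apple"
print(v[TRIGRAMS["app"]], v[TRIGRAMS["ple"]])   # 1.0 1.0
```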