Tsinghua University: Making Full Use of Chinese Speech Corpora (PPT lecture slides)
Making Full Use of Chinese Speech Corpora
Thomas Fang Zheng
Center of Speech Technology, State Key Laboratory of Intelligent Technology and Systems, Tsinghua University
http://sp.cs.tsinghua.edu.cn/
Beijing d-Ear Technologies Co., Ltd.
http://www.d-Ear.com
Oct. 2, 2003
O-COCOSDA, Oct. 1-3, 2003, Sentosa, Singapore
Your Partner in the Century of Speech

Outline
❑ Purpose of speech corpora
❑ Factors to be considered in data creation
❑ Data creation
❑ Data transcription
❑ Learning from corpora
❑ Chinese Corpus Consortium (CCC)
Purpose of Speech Corpora

| Item | Description | Percentage |
| --- | --- | --- |
| 1. Speech/speaker recognition | system development, evaluation, sentence comprehension and summarization, speech recognition, speaker recognition | 73% |
| 2. Speech synthesis | system development, prosodic analysis | 11% |
| 3. Acoustic analysis | acoustic analysis, speech coding | 9% |
| 4. Sentence analysis | syntactic and semantic analysis | 5% |
| 5. Speech/language education | speech and language education | 2% |
Factors to be considered in data creation (1)
❑ The language
  ❖ Language: e.g., Chinese or English
  ❖ Dialectal background (e.g., for Chinese):
    ▪ Putonghua or standard Chinese (普通话);
    ▪ Mandarin (官话, Northern China);
    ▪ Wu (吴语, Southern Jiangsu, Zhejiang, and Shanghai);
    ▪ Yue (粤语, Guangdong, Hong Kong, Nanning Guangxi);
    ▪ Min (闽南话, Fujian, Shantou Guangdong, Haikou Hainan, Taipei Taiwan);
    ▪ Hakka (客家话, Meixian Guangdong, Hsin-Chu Taiwan);
    ▪ Xiang (湘, Hunan);
    ▪ Gan (赣, Jiangxi);
    ▪ Hui (徽, Anhui); and
    ▪ Jin (晋, Shanxi).
  ❖ Specific to Chinese:
    ▪ Simplified Chinese
    ▪ Traditional Chinese
[Figure: map of the distribution of Mandarin (官话) dialects (官话方言分布图)]
[Figure: map of the modern Wu dialect areas (现代吴语方言分区图). Labels on the map: Taihu (太湖片), Taizhou (台州片), Oujiang (瓯江片), Chuqu (处衢片), Xuanzhou (宣州片), Su-Hu-Jia (苏沪嘉小片), Hangzhou (杭州小片), Lin-Shao (林绍小片), Jianghuai Mandarin (江淮官话), Hui (徽语)]
Factors to be considered in data creation (2)
❑ Speaking style:
  ❖ Read: for ASR in earlier research, or for TTS
  ❖ Spontaneous/conversational: for ASR nowadays
❑ Recording channel:
  ❖ Depends on the goal of the task or application, or on the application environment
    ▪ Close-talk microphones: for personal computers (PCs)
    ▪ Telephone and/or cellular phone: for telephony applications
    ▪ Specific channel: for embedded applications (PDA, digital recorder, ...), or for broadcast news and TV news
  ❖ Normally a mono channel instead of a stereo channel
  ❖ However, a microphone array may be used for some research purposes
Factors to be considered in data creation (3)
❑ Sampling rate:
  ❖ 8 kHz: for the telephone/mobile-phone channel, where the bandwidth is about 3.4 kHz
  ❖ 16 kHz: for the close-talk microphone (PC) channel, even though the bandwidth is higher than 8 kHz
❑ Sampling precision:
  ❖ 16 bits, normally
  ❖ 8-bit A-law or μ-law (13-bit wide after decompression)
❑ Signal-to-Noise Ratio (SNR) level:
  ❖ Speech was, and still is, often collected in a good environment (clean speech database)
  ❖ For noise-related research, noisy data is obtained by:
    ▪ Mixing noises (NOISEX-92) with clean speech; or
    ▪ Collecting speech in real-world noisy environments
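The first route, mixing recorded noise into clean speech, is usually done at a controlled SNR. The following is a minimal sketch of that idea, not the procedure used for any particular corpus: the function names `mix_at_snr` and `measure_snr_db` are hypothetical, and the sine tone and Gaussian noise are toy stand-ins for a clean utterance and a NOISEX-92 noise recording.

```python
import math
import random


def power(signal):
    """Mean power (mean of squared samples) of a sequence of floats."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(clean, noise, target_snr_db):
    """Add scaled noise to clean speech so the mixture has the target SNR.

    Returns (noisy_signal, scaled_noise). The noise is looped or cut
    so that it matches the length of the clean signal.
    """
    if not clean or not noise:
        raise ValueError("clean and noise must be non-empty")
    reps = -(-len(clean) // len(noise))  # ceiling division
    noise = (list(noise) * reps)[:len(clean)]
    p_clean, p_noise = power(clean), power(noise)
    if p_noise == 0:
        raise ValueError("noise has zero power")
    # SNR_dB = 10 * log10(p_clean / (gain^2 * p_noise)); solve for gain.
    gain = math.sqrt(p_clean / (p_noise * 10 ** (target_snr_db / 10)))
    scaled = [gain * n for n in noise]
    noisy = [c + n for c, n in zip(clean, scaled)]
    return noisy, scaled


def measure_snr_db(clean, noise):
    """SNR in dB between a clean signal and an additive noise signal."""
    return 10 * math.log10(power(clean) / power(noise))


# Toy stand-ins (assumptions): a 440 Hz tone at 8 kHz and white noise.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(3000)]

noisy, scaled_noise = mix_at_snr(clean, noise, target_snr_db=10.0)
achieved = measure_snr_db(clean, scaled_noise)
```

Scaling the noise rather than the speech keeps the speech level unchanged across SNR conditions, which makes the resulting noisy sets easier to compare against the clean database.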
Factors to be considered in data creation (4)
❑ Number of speakers and speaker balance:
  ❖ The more, the better, with good speaker diversity according to:
    ▪ Gender;
    ▪ Age;
    ▪ Education;
    ▪ Birthplace (or dialectal background);
    ▪ Occupation;
    ▪ and so on.
❑ Corpus size:
  ❖ Measured by either the number of speakers or the length of valid speech in hours, or both
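Both measures of corpus size, and a simple view of speaker balance, can be computed from per-utterance metadata. The sketch below assumes a hypothetical `Utterance` record with a few of the attributes listed above; the field names and the sample records are made up for illustration and do not describe any real corpus.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker_id: str
    gender: str
    birthplace: str
    duration_s: float  # length of valid speech, in seconds


def corpus_size(utterances):
    """Return (number of distinct speakers, hours of valid speech)."""
    speakers = {u.speaker_id for u in utterances}
    hours = sum(u.duration_s for u in utterances) / 3600.0
    return len(speakers), hours


def speaker_balance(utterances, attribute):
    """Share of distinct speakers per value of one attribute."""
    per_speaker = {}
    for u in utterances:
        # Count each speaker once, however many utterances they recorded.
        per_speaker[u.speaker_id] = getattr(u, attribute)
    counts = Counter(per_speaker.values())
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


# Made-up records for illustration only.
utts = [
    Utterance("spk01", "F", "Beijing", 1800.0),
    Utterance("spk01", "F", "Beijing", 1800.0),
    Utterance("spk02", "M", "Shanghai", 3600.0),
    Utterance("spk03", "M", "Guangzhou", 1800.0),
    Utterance("spk04", "F", "Beijing", 1800.0),
]

n_speakers, hours = corpus_size(utts)
gender_share = speaker_balance(utts, "gender")
birthplace_share = speaker_balance(utts, "birthplace")
```

Balance is counted per speaker rather than per utterance here, so one prolific speaker does not distort the picture; weighting by duration instead would be an equally reasonable design choice.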