MIT: Foundations of Biology, course teaching resources (English version), Lecture 3: More Multiple Sequence Alignment

7.91 – Lecture #3
More Multiple Sequence Alignment and Motif Scanning, Database Searching
Michael Yaffe
[Title-slide figure: sequence logo, computed with makelogo (Schneider & Stephens, 1990)]

Outline
• Multiple Sequence Alignment - Carrillo & Lipman, Clustal(W)
• Position-Specific Scoring Matrices (PSSM)
• Information content, Shannon entropy
• Sequence logos
• Hidden Markov Models
• Other approaches: genetic algorithms, expectation maximization, MEME, Gibbs sampler
• FASTA, BLAST searching, Smith-Waterman
• PSI-BLAST
Reading - Mount p. 139-150, 152-157, 161-171, 185-198

Multiple Sequence Alignments
• Sequences are aligned so as to bring the greatest number of single characters into register.
• If we include gaps and mismatches, then even dynamic programming becomes limited to ~3 sequences unless they are very short... we need an alternative approach.
Why?

Consider the 2-sequence comparison: an O(mn) problem, order n^2
[Figure: dynamic-programming score matrix for two short sequences. Columns i = 0-5 are labeled Gap, V, D, S, C, Y; rows j = 0-6. The first row reads 0, -8, -16, -24, -32, -40 and the first column reads 0, -8, -16, -24, -32, -40, -48; the remaining cell values are not recoverable from the extracted text.]
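To make the O(mn) cost concrete, here is a minimal sketch of the table-filling step for a global pairwise alignment. The gap penalty of -8 follows the first row and column of the slide's matrix; the match and mismatch scores (4 and -3) and the two example sequences are illustrative assumptions, not the substitution matrix or sequences used in the lecture.

```python
def global_alignment_table(a, b, match=4, mismatch=-3, gap=-8):
    """Fill the (len(a)+1) x (len(b)+1) table for a global alignment.

    match/mismatch are assumed placeholder scores; a real run would
    look up each residue pair in a substitution matrix instead.
    """
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0] * cols for _ in range(rows)]
    # Leading gaps: first column and first row are multiples of the gap penalty.
    for i in range(1, rows):
        table[i][0] = i * gap
    for j in range(1, cols):
        table[0][j] = j * gap
    # Every remaining cell is the best of: diagonal (pair), up (gap), left (gap).
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            table[i][j] = max(
                table[i - 1][j - 1] + sub,
                table[i - 1][j] + gap,
                table[i][j - 1] + gap,
            )
    return table


# Hypothetical example sequences (6 and 5 residues).
table = global_alignment_table("VESLCY", "VDSCY")
print("cells filled:", len(table) * len(table[0]))
print("optimal global score:", table[-1][-1])
```

Each cell is computed once from three neighbours, so the work grows with the product of the two sequence lengths.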

For 3 sequences...
ARDFSHGLLENKLLGCDSMRWE
GRDYKMALLEQWILGCD-MRWD
SRDW--ALIEDCMV-CNFFRWD
An O(mnj) problem!
Consider sequences each 300 amino acids long:
2 sequences – 300^2
3 sequences – 300^3
but for v sequences – 300^v
Uh oh!!! Our polynomial problem just became exponential!
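The growth is easy to check numerically. The short sketch below just evaluates length^v for a few values of v; the function name and the chosen values of v are illustrative assumptions.

```python
def lattice_size(length, num_sequences):
    """Approximate number of lattice nodes for full dynamic programming:
    one axis of `length` positions per sequence."""
    return length ** num_sequences


for v in (2, 3, 5, 10):
    print(f"{v} sequences of 300 residues: about {lattice_size(300, v):.3e} nodes")
```

The cost is polynomial in sequence length for a fixed number of sequences, but exponential in the number of sequences.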

Consider pairwise alignments between 3 sequences
Carrillo and Lipman – Sum of Pairs method
[Figure: cubic lattice with axes Sequence A, Sequence B and Sequence C, with labels A-B, A-C, B-C and A-B-C.]
Do we need to score each node?
Get the multiple alignment score within the cubic lattice by adding together the scores of the pairwise alignments.
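As a rough illustration of the sum-of-pairs idea, the sketch below scores an existing multiple alignment by adding up the scores of every pair of rows, column by column. The scoring values (match 1, mismatch -1, gap -2, gap-gap 0) are assumptions chosen for simplicity; a real scorer would use a substitution matrix and the course's gap model.

```python
from itertools import combinations


def pair_score(x, y, match=1, mismatch=-1, gap=-2):
    """Score one pair of aligned characters (assumed toy scoring scheme)."""
    if x == "-" and y == "-":
        return 0  # two gaps facing each other contribute nothing
    if x == "-" or y == "-":
        return gap
    return match if x == y else mismatch


def sum_of_pairs(alignment):
    """Sum the pairwise scores over all columns and all pairs of rows."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned rows must have the same length")
    total = 0
    for column in zip(*alignment):
        for x, y in combinations(column, 2):
            total += pair_score(x, y)
    return total


# The three aligned sequences from the earlier slide.
alignment = [
    "ARDFSHGLLENKLLGCDSMRWE",
    "GRDYKMALLEQWILGCD-MRWD",
    "SRDW--ALIEDCMV-CNFFRWD",
]
print("sum-of-pairs score:", sum_of_pairs(alignment))
```

Summing column by column over all row pairs gives the same total as scoring the three induced pairwise alignments (A-B, A-C, B-C) separately and adding them.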

In practice, this doesn't give the optimal alignment... but we're close! It seems reasonable that the optimal alignment won't be far from the diagonal we were on, so we just set bounds on the location of the MSA within the cube based on each pairwise alignment. Then we just do dynamic programming within the volume defined by the pre-imposed bounds.

The volume is broken into polyhedra, and the borders of the polyhedra are defined by paths through possible alignments.
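The idea of restricting dynamic programming to a bounded region can be sketched in two dimensions. The example below is only an analogue: it uses a fixed-width band around the diagonal of a pairwise table, whereas the Carrillo and Lipman bounds are derived from the pairwise alignments and apply inside the multi-sequence lattice. The `band` parameter, the scoring values and the example sequences are all assumptions.

```python
NEG_INF = float("-inf")


def banded_alignment_score(a, b, band, match=4, mismatch=-3, gap=-8):
    """Global alignment score using only cells with |i - j| <= band.

    Returns (score, number_of_cells_filled). Cells outside the band stay
    at -infinity, so no path can pass through them.
    """
    if abs(len(a) - len(b)) > band:
        raise ValueError("band too narrow to reach the final cell")
    rows, cols = len(a) + 1, len(b) + 1
    table = [[NEG_INF] * cols for _ in range(rows)]
    table[0][0] = 0
    cells = 1
    for i in range(rows):
        for j in range(max(0, i - band), min(cols, i + band + 1)):
            if i == 0 and j == 0:
                continue
            best = NEG_INF
            if i > 0 and j > 0:
                sub = match if a[i - 1] == b[j - 1] else mismatch
                best = max(best, table[i - 1][j - 1] + sub)
            if i > 0:
                best = max(best, table[i - 1][j] + gap)
            if j > 0:
                best = max(best, table[i][j - 1] + gap)
            table[i][j] = best
            cells += 1
    return table[-1][-1], cells


# Hypothetical example sequences.
score, cells = banded_alignment_score("VESLCY", "VDSCY", band=1)
print("banded score:", score, "cells filled:", cells, "of", 7 * 6)
```

If the band is too narrow to contain the optimal path, the result is only a lower bound on the true optimum, which is the same trade-off the slide describes for the bounded volume.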

Still takes too long for more than three sequences... need a better way!
• Progressive Methods of Multiple Sequence Alignment
Concept – simple:
1- Use DP to build pairwise alignments of the most closely related sequences
2- Then progressively add less related sequences or groups of sequences...

ClustalW (Higgins and Sharp, 1988)
• 1- Do pairwise analysis of all the sequences (you choose the similarity matrix).
• 2- Use the alignment scores to make a phylogenetic tree.
• 3- Align the sequences to each other, guided by the phylogenetic relationships in the tree.
New features: Clustal → ClustalW (allows weights) → ClustalX (GUI-based)
Weighting is important to avoid biasing an alignment by many sequence members that are closely related to each other evolutionarily!
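Steps 1 and 2 can be sketched as follows: score every pair of sequences, then repeatedly join the two most similar clusters to get a guide tree that fixes the order of alignment. This is a simplified stand-in, not ClustalW's actual procedure: it uses raw alignment scores with an assumed toy scoring scheme and average-linkage clustering, and it leaves out sequence weighting and the final profile alignment step. The sequence names and sequences are hypothetical.

```python
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global pairwise alignment score, keeping only two rows of the table."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def guide_tree(seqs):
    """Build a nested-tuple guide tree from a dict of name -> sequence."""
    # Step 1: score all pairs of sequences.
    score = {
        frozenset((m, n)): nw_score(seqs[m], seqs[n])
        for m, n in combinations(seqs, 2)
    }
    # Each cluster (a name or a nested tuple) maps to its member names.
    clusters = {name: [name] for name in seqs}

    def linkage(x, y):
        # Average pairwise score between the members of two clusters.
        pairs = [(m, n) for m in clusters[x] for n in clusters[y]]
        return sum(score[frozenset(p)] for p in pairs) / len(pairs)

    # Step 2: join the most similar pair of clusters until one tree remains.
    while len(clusters) > 1:
        x, y = max(combinations(clusters, 2), key=lambda p: linkage(*p))
        clusters[(x, y)] = clusters.pop(x) + clusters.pop(y)
    return next(iter(clusters))


# Hypothetical toy sequences: "a" and "b" are close, "c" is distant.
seqs = {"a": "ACGTACGT", "b": "ACGTACGA", "c": "TTTTGGGG"}
print(guide_tree(seqs))
```

The innermost tuple is the pair to align first; each enclosing tuple marks a sequence or group to add next, which is the order step 3 would follow.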