Key Publications in Bioinformatics
(a highly opinionated collection)

Altschul, S. F. (1989) Gap Costs for Multiple Sequence Alignment
J Theoretical Biol 138:297-309

Altschul, S. F., Gish, W., Miller, W., Myers, E. W. and Lipman, D. J. (1990).
A Basic Local Alignment Search Tool.
J. Mol. Biol., 215, 403-410.

Altschul, S.F. (1991) Amino acid substitution matrices from an information theoretic perspective.
J. Mol. Biol. 219:555

Altschul, S.F., Boguski, M.S., Gish, W. and Wootton, J.C. (1994).
Issues in searching molecular sequence databases.
Nat Genet 6 (2), 119-29.

P. Baldi, Y. Chauvin, T. Hunkapiller, M. A. McClure, (1994)
Hidden Markov models of biological primary sequence information.
Proc Nat Acad Sci USA 91:1059-1063

Barton, G.J. and Sternberg, M.J.E. (1987)
A strategy for the rapid multiple alignment of protein sequences: confidence levels from tertiary structure comparisons.
J. Mol. Biol. 198:327-337

Barton, G. J. 1990. Protein multiple sequence alignment and flexible pattern matching.
Meth Enzymol 183:403-427.

Barton, G.J. (1997) Protein Sequence Alignment and Database Scanning.
In Protein Structure Prediction: A Practical Approach, M.J.E. Sternberg (ed.).

Collins J.F. and Coulson A.F. (1990) Significance of protein sequence similarities.
Methods Enzymol. 183: 474-87

Davison D. (1985) Sequence similarity ('homology') searching for molecular biologists.
Bull. Math. Biol. 47:437-474.

Dayhoff, M.O., Schwartz, R.M., and Orcutt, B.C. (1978).
A model of evolutionary change in proteins. In Atlas of Protein Sequence and Structure,
Nat. Biomed. Research Foundation, Vol. 5, 345-352.

Dayhoff, M.O., Barker, W.C. and Hunt, L.T. (1983)
Establishing homologies in protein sequences.
Meth. Enzymol. 91:524

Doolittle, R.F. (1981) Similar amino acid sequences: chance or common ancestry?
Science 214:149-159

Doolittle, R.F. (1990). Molecular Evolution: Computer Analysis of Protein and Nucleic Acid Sequences (1 ed.).
Methods in Enzymology Volume 183, New York: Academic Press.

Doolittle, R.F. (1994). Protein sequence comparisons: searching databases and aligning sequences.
Curr Opin Biotechnol 5 (1), 24-8.

Eddy, S.R. (1995) Multiple alignment using hidden Markov models.
In Proc. Intelligent Systems for Molecular Biology (C. Rawlings et al., eds.), AAAI Press, 3:114-120

Felsenstein, J. (1988) Phylogenies from molecular sequences: inferences and reliability.
Annu. Rev. Genet. 22:521-565.

Feng, D.F., Doolittle, R.F. (1987)
Progressive Sequence Alignment as a Prerequisite to Correct Phylogenetic Trees.
J Mol Evol 25:351-360

Feng, D. F. and Doolittle, R. F. (1996).
Progressive Alignment of Amino Acid Sequences and Construction of Phylogenetic Trees from Them.
Methods in Enzymology, 266, 368-382.

Fitch W.M. and Smith T.F. (1983) Optimal sequence alignments.
Proc. Natl. Acad. Sci. (USA) 80:1382-1386.

Gribskov, M., McLachlan, A. D. and Eisenberg, D. (1987).
Profile analysis: Detection of distantly related proteins.
Proc. Natl. Acad. Sci. USA 84, 4355-4358.

Higgins D.G., Bleasby A.J., Fuchs R. (1992)
CLUSTAL V: improved software for multiple sequence alignment.
Comput. Appl. Biosci. 8:189-191.

Henikoff, S., Greene, E. A., Pietrokovski, S., Bork, P., Attwood, T.K. and Hood, L. (1997).
Gene Families: The taxonomy of protein paralogs and chimeras.
Science, 278 (24 October 1997), 609-614.

Henikoff, S. and Henikoff, J.G. (1991)
Automated assembly of protein blocks for database searching.
Nucleic Acids Res., 19:6565-6572.

Henikoff S and Henikoff J.G. (1992).
Amino acid substitution matrices from protein blocks.
Proc. Natl. Acad. Sci. USA 89:10915-10919.

Henikoff S. and Henikoff J. G. (1993)
Performance Evaluation of Amino Acid Substitution Matrices.
Proteins 17:49

Henikoff, S. Comparative sequence analysis: Finding genes.
In Biocomputing, Informatics and Genome Projects. D.W. Smith ed. (1994). pp 87-117

Kwok PY, Deng Q, Zakeri H, Taylor SL, Nickerson DA (1996)
Increasing the information content of STS-based genome maps: identifying polymorphisms in mapped STSs.
Genomics 31(1):123-6

Lipman D.J., Wilbur W.J., Smith T.F. and Waterman M.S. (1984)
On the statistical significance of nucleic acid similarities.
Nucl. Acids Res. 12:215-226.

Lipman, D.J. and Pearson, W.R. (1985).
Rapid and Sensitive Protein Similarity Searches.
Science 227, 1435-1441.

Livingstone, C.D., & Barton, G.J. (1996).
Identification of functional residues and secondary structure from protein multiple sequence alignment.
Methods Enzymol, 266:497-512.

Needleman S.B. and Wunsch C.D. (1970)
A general method applicable to the search for similarities in the amino acid sequence of two proteins.
J. Mol. Biol. 48:443-453.

Pearson, W.R. and Lipman, D.J. (1988).
Improved tools for biological sequence comparison.
Proc. Natl. Acad. Sci USA 85:2444-2448.

Pearson, W.R. (1986). Sensitivity and Selectivity in Protein Sequence Comparison.
In Methods in Protein Sequence Analysis, Clifton, New Jersey: Humana Press.

Pearson, W.R. (1994). Using the FASTA program to search protein and DNA sequence databases.
Methods Mol Biol 25, 365-89.

Risler J.L., Delorme M.O., Delacroix H., and Henaut A. (1988)
Amino acid substitutions in structurally related proteins - a pattern recognition approach.
J. Mol. Biol. 204:1019-1029.

Saitou, N. and Nei, M. (1987)
The neighbor-joining method: a new method for reconstructing phylogenetic trees.
Mol. Biol. Evol. 4:406-425

Smith T.F. and Waterman M.S. (1981)
Comparison of biosequences.
Adv. in Applied Math. 2:482-489.

Smith T.F. and Waterman M.S. (1981)
Identification of common molecular subsequences.
J. Mol. Biol. 147:195-197.

Sneath, P.H.A. and Sokal, R.R. (1973)
Numerical Taxonomy (pp. 230-234). W.H. Freeman and Company, San Francisco, California, USA.

Sonnhammer E.L., Eddy S.R., Durbin R. (1997)
Pfam: a comprehensive database of protein domain families based on seed alignments.
Proteins 28(3):405-420

Staden, R. (1994). Staden: using patterns to analyze protein sequences.
Methods Mol Biol, 25, 141-54.

States D.J., Gish W. and Altschul S.F. (1991)
Improved sensitivity of nucleic acid database searches using application-specific scoring matrices.
Methods 3:66

Taillon-Miller P, Gu Z, Li Q, Hillier L, Kwok PY (1998)
Overlapping genomic sequences: A treasure trove of single-nucleotide polymorphisms.
Genome Res 8(7):748-54

Tatusov, R. L., Koonin, E. V. and Lipman, D. J. (1997).
A Genomic Perspective of Protein Families.
Science, 278(24 October), 631-637.

Taylor, W. R. (1987) Multiple Sequence Alignment by a Pairwise Algorithm.
Comput Appl Biosci 3(2):81-87

Thompson, J., Higgins, D.G., Gibson, T.J. (1994)
Clustal W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.
Nucleic Acids Res 22:4673-4680

Wang et al. (1998)
Large Scale Identification, Mapping and Genotyping of Single-Nucleotide Polymorphism in the Human Genome.
Science 280:1077-1082.

Wilbur, W.J. and Lipman, D.J. (1983).
Rapid similarity searches of nucleic acid and protein data banks.
Proc. Natl. Acad. Sci. USA 80, 726-30.

Zuckerkandl E, Pauling L. (1965)
Molecules as documents of evolutionary history.
J Theor Biol 8(2):357-366