Using CLUSTAL on the Alpha
CLUSTAL is another multiple alignment program that is available on the RCR's Alpha server. It is superior to PILEUP in several ways, as described below.
[Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research, 22:4673-4680]
CLUSTAL is not part of GCG, but it can work with GCG files:
- it can use GCG formatted sequence files as input
- it produces output in the MSF format
CLUSTAL is also available as a stand-alone program for Macintosh and Windows computers.
Sequences are input into CLUSTAL from a multi-sequence FASTA file.
The GCG program TOFASTA can convert a list of file names into this FASTA format. The list can be one created by LOOKUP or by FASTA, or one hand-edited by the user. Remember to use the "@" character before the name of the list file.
FASTA files can also be created by pasting individual sequences into a text editor. Include a line beginning with the ">" character followed by a name at the beginning of each sequence.
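As an alternative to pasting by hand, a short script can build the same kind of file. The following is a minimal sketch in Python, not part of CLUSTAL or GCG; the function name `format_fasta`, the sequence names, and the example sequences are all hypothetical and chosen only for illustration.

```python
def format_fasta(records, width=60):
    """Return multi-sequence FASTA text for a list of (name, sequence) pairs."""
    lines = []
    for name, seq in records:
        # Remove spaces and line breaks that were pasted in with the sequence.
        seq = "".join(seq.split()).upper()
        # Each sequence starts with a ">" line carrying its name.
        lines.append(">" + name)
        # Wrap the sequence itself to a fixed line width.
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"


# Hypothetical example sequences.
records = [
    ("seq_one", "MKTAYIAKQR QISFVKSHFS"),
    ("seq_two", "MKTAYIAKQRQISFVK"),
]

fasta_text = format_fasta(records)
print(fasta_text, end="")
```

The resulting text can be saved to a file and used as the multi-sequence input described above.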
PILEUP uses a global alignment algorithm (the GCG GAP program) while CLUSTAL uses a local alignment method (similar to the GCG BESTFIT program).
The local alignment method can be an advantage for aligning highly diverged sequences or for genes that share some regions of homology, but are dissimilar in other regions.
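The general difference between global and local scoring can be illustrated with a small sketch. This is not the code used by GAP, BESTFIT, PILEUP, or CLUSTAL; it is a textbook Needleman-Wunsch (global) and Smith-Waterman (local) score calculation with a simple linear gap penalty. The scoring values and the two test sequences are assumptions made for the example.

```python
def alignment_scores(a, b, match=1, mismatch=-1, gap=-2):
    """Return (global_score, local_score) for sequences a and b.

    Global (Needleman-Wunsch): every residue of both sequences must be
    aligned or gapped. Local (Smith-Waterman): only the best-scoring
    sub-region contributes, so cells are never allowed to drop below 0.
    """
    g_prev = [j * gap for j in range(len(b) + 1)]  # global: leading gaps cost
    l_prev = [0] * (len(b) + 1)                    # local: leading gaps are free
    best_local = 0
    for i in range(1, len(a) + 1):
        g_row = [i * gap]
        l_row = [0]
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            g = max(g_prev[j - 1] + s, g_prev[j] + gap, g_row[j - 1] + gap)
            l = max(0, l_prev[j - 1] + s, l_prev[j] + gap, l_row[j - 1] + gap)
            g_row.append(g)
            l_row.append(l)
            best_local = max(best_local, l)
        g_prev, l_prev = g_row, l_row
    return g_prev[-1], best_local


# Two hypothetical sequences that share one region (ACGTACGT)
# but are dissimilar elsewhere.
global_score, local_score = alignment_scores("GGGGACGTACGT", "ACGTACGTCCCC")
print("global:", global_score, "local:", local_score)
```

In this example the local score reflects only the shared region, while the global score is pulled down by the unrelated flanking residues.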
Other features of CLUSTAL:
- it can use a rapid approximate alignment method (FASTA) or the slower, more accurate Smith-Waterman method
- it can add individual sequences to an existing alignment or align two groups of pre-aligned sequences with each other
- it can re-align selected sequences or selected regions of the alignment, leaving the unselected portions of the alignment unchanged
- it can adjust the penalties for inserting gaps into aligned sequences based on specific amino acid residues, regions of hydrophobicity, proximity to other gaps, or secondary structure
CLUSTAL also has a function to compute phylogenetic trees from a set of aligned sequences.
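Tree-building methods that start from an alignment commonly begin with a table of pairwise distances between the aligned sequences. The sketch below shows one simple way such a table could be computed in Python: the fraction of differing positions, skipping columns where either sequence has a gap. It is an illustration under those assumptions, not CLUSTAL's own calculation, and the function names and the toy alignment are hypothetical.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Fraction of compared positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for x, y in zip(seq_a, seq_b):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x != y:
            differences += 1
    return differences / compared if compared else 0.0


def distance_matrix(alignment):
    """Return {(name_a, name_b): distance} for every pair in the alignment."""
    names = list(alignment)
    return {
        (a, b): p_distance(alignment[a], alignment[b])
        for a in names
        for b in names
    }


# Hypothetical toy alignment; "-" marks a gap.
alignment = {
    "seq_one": "ACGTACGT",
    "seq_two": "ACGTAC-T",
    "seq_three": "ACCTAGGT",
}

matrix = distance_matrix(alignment)
for (a, b), d in sorted(matrix.items()):
    if a < b:
        print(a, b, round(d, 3))
```

A distance table of this kind is the usual input to distance-based tree methods.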
Using Computers for Molecular Biology
Stuart M. Brown, Ph.D., RCR, NYU Medical Center Comments to: browns02@mcrcr.med.nyu.edu