
Formatting Multiple Alignments

*** The final product of a PILEUP or CLUSTAL run is a set of aligned sequences, stored in a Multiple Sequence Format file (which GCG gives the extension .msf).

*** This msf file is a text file that can be formatted with a text editor, but GCG has some dedicated tools for improving the looks of msf files for easier interpretation and for publication.

*** The program PRETTY can calculate a consensus sequence and highlight how each character of each sequence relates to that consensus.

*** PRETTY takes the output of PILEUP as input but - and here is another silly GCG weirdness - you must follow the file name with an asterisk in curly brackets {*} like this:
$ PRETTY  myseqs.msf{*}
*** You will probably want to add some modifiers to the command line.

  • add /CON to create a consensus sequence

  • use /CASE to type characters that match the consensus in uppercase and others in lowercase

  • use /DIFF to type only those characters that do not match the consensus

*** So a typical command line for PRETTY would look like this:
$ PRETTY/CON/DIFF  myseqs.msf{*} 
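
*** To make the three modifiers concrete, here is a minimal Python sketch of the ideas behind them. It is not PRETTY's actual algorithm or output format: the simple "most common character wins" consensus, the function names, the "." placeholder for matching positions, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter

def consensus(seqs):
    """Plurality consensus: the most common character in each column.
    Assumes the sequences are already aligned to the same length."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))

def case_view(seq, cons):
    """Idea behind /CASE: uppercase where seq matches the consensus,
    lowercase where it does not."""
    return "".join(c.upper() if c.upper() == k.upper() else c.lower()
                   for c, k in zip(seq, cons))

def diff_view(seq, cons, match_char="."):
    """Idea behind /DIFF: show only the characters that differ from the
    consensus; matching positions become match_char (an assumed placeholder)."""
    return "".join(match_char if c.upper() == k.upper() else c
                   for c, k in zip(seq, cons))

# Toy aligned sequences, invented for this example (not real PILEUP output).
aligned = ["ACGTAC", "ACGAAC", "ACCTAC"]
cons = consensus(aligned)

for s in aligned:
    print(case_view(s, cons), diff_view(s, cons))
print(cons, "(consensus)")
```

With /DIFF-style output, a column of placeholders means every sequence agrees with the consensus at that position, so the few printed letters stand out as the variable sites.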

*** Shading of regions of high homology can be created using the program BOXSHADE, but this is not actually part of the GCG package and it is a bit complex to use.

*** In GCG v.10, most of the functions of PRETTY and BOXSHADE have been combined into a new program called PRETTYBOX. Just use the command PRETTYBOX instead of PRETTY to format a multiple alignment file and you will get a fairly nice shaded alignment.
$ PRETTYBOX  myseqs.msf{*}
[Figure: shaded alignment]
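
*** The idea behind shading can be sketched in a few lines of Python: for each column, find the most common residue and mark the sequences that carry it when it is common enough. This is only an illustration under stated assumptions. The 0.5 threshold, the function name, the "#" marker, and the toy sequences are invented here, and neither BOXSHADE's nor PRETTYBOX's real scoring rules or output format are reproduced.

```python
from collections import Counter

GAP_CHARS = {".", "-", "~"}  # characters treated as gaps in this sketch

def shade_mask(seqs, threshold=0.5):
    """Return one mask string per sequence: '#' where the residue matches a
    column's majority residue, ' ' elsewhere. A column counts as conserved
    only if more than `threshold` of the sequences share the residue."""
    n = len(seqs)
    masks = [[] for _ in seqs]
    for col in zip(*seqs):
        residue, count = Counter(col).most_common(1)[0]
        conserved = residue not in GAP_CHARS and count / n > threshold
        for i, c in enumerate(col):
            masks[i].append("#" if conserved and c == residue else " ")
    return ["".join(m) for m in masks]

# Toy aligned protein sequences, invented for this example.
aligned = ["MKV.LA", "MKVALA", "MRVALG"]
for seq, mask in zip(aligned, shade_mask(aligned)):
    print(seq)
    print(mask)
```

Each mask line sits under its sequence, so the "#" marks show which positions a shading program would darken under this simple majority rule.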

*** Besides using these programs, which run on the Alpha, you can move the output of PILEUP (or CLUSTAL) by FTP from your RCR account to a local Mac or PC.

*** Since this output is a plain text file, it can be edited with any word processing program, or imported into any drawing program to add boldface text, underlining, shading, boxes, arrows, etc.

*** There is also a Macintosh version of the BoxShade program called MacBox.



Using Computers for Molecular Biology
Stuart M. Brown, Ph.D., RCR, NYU Medical Center
Comments to: browns02@mcrcr.med.nyu.edu