Using PILEUP
Before you run PILEUP, study the sequences that you plan to align.
PILEUP is very sensitive to gaps. If the sequences in a set are of different lengths, gaps are added to the ends of all the shorter sequences to make them equal to the longest one in the set.
If you try to align five 300-nucleotide ESTs with a single 20,000-nucleotide cosmid, you are adding 5 x 19,700 = 98,500 gaps to the alignment, and PILEUP will crash!
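The arithmetic behind that warning can be checked with a few lines of Python. This is only a sketch of the gap count described above, not a model of how PILEUP works internally; the function name padding_gaps is made up for illustration.

```python
def padding_gaps(lengths):
    """Count the end-gap positions needed to pad every sequence
    out to the length of the longest one in the set."""
    longest = max(lengths)
    return sum(longest - n for n in lengths)


# Five 300-nucleotide ESTs plus one 20,000-nucleotide cosmid
lengths = [300] * 5 + [20000]
print(padding_gaps(lengths))  # 98500
```

Running the same check on six 300-nucleotide sequences gives zero padding, which is the situation you want before starting the alignment.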
Instead, do a pairwise alignment between one of the ESTs and the cosmid (using GAP).
Identify the region of similarity in the longer sequence and copy that short region to a new file.
Then align the six sequences, each now about 300 nucleotides long.
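Copying the similar region to a new file can be done by hand in an editor, or with a short script. The sketch below assumes you have already read the start and end positions of the region off the pairwise alignment; the coordinates, the sequence, and the names extract_region and to_fasta are all hypothetical. It writes FASTA-style text, which is an assumption about your setup, so a format conversion may be needed before a GCG program will read the result.

```python
def extract_region(sequence, start, end):
    """Return the region start..end of a sequence, using 1-based,
    inclusive coordinates as they appear in alignment output."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("coordinates fall outside the sequence")
    return sequence[start - 1:end]


def to_fasta(name, sequence, width=60):
    """Format a sequence as FASTA text with fixed-width lines."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


# Stand-in for a 20,000-nucleotide cosmid (made-up sequence)
cosmid = "ACGT" * 5000

# Hypothetical coordinates taken from the pairwise alignment
region = extract_region(cosmid, 10001, 10300)
print(len(region))  # 300

with open("cosmid_region.fasta", "w") as handle:
    handle.write(to_fasta("cosmid_10001_10300", region))
```

Putting the coordinates in the sequence name makes it easy to map positions in the final alignment back to the original cosmid.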
If you are aligning a bunch of different proteins, and you know some regions are just not at all similar, cut those regions out before you do the alignment.
If you are interested just in some particular repeat or motif, extract it from the original sequence as best you can and then do the alignment.
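If the motif can be written as a simple pattern, a regular expression is one way to pull it out of each sequence before aligning. This is a rough sketch only: the pattern C..C, the protein fragment, and the function name extract_motif are invented for illustration, and real motifs usually need a more careful pattern or a manual check of each hit.

```python
import re


def extract_motif(sequence, pattern, flank=0):
    """Return the first match of pattern in sequence, plus up to
    `flank` residues on each side, or None if there is no match."""
    match = re.search(pattern, sequence)
    if match is None:
        return None
    start = max(match.start() - flank, 0)
    end = min(match.end() + flank, len(sequence))
    return sequence[start:end]


# Made-up protein fragment and a made-up C..C motif pattern
protein = "MKTAYCPECGKSFSQ"
print(extract_motif(protein, r"C..C", flank=2))  # AYCPECGK
```

Keeping a few flanking residues gives the alignment some context on either side of the motif without bringing back the dissimilar regions.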
Everything you throw into PILEUP that is not similar between all sequences just acts to gunk up the works. The final alignment may still come out right, but then again, it might not.
Using Computers for Molecular Biology
Stuart M. Brown, Ph.D., RCR, NYU Medical Center Comments to: browns02@mcrcr.med.nyu.edu