Dotplots
The concept of similarity can be illustrated by a simple dot plot. Dot plots are used to visually compare two sequences and detect regions of close similarity between them.
In Figure 1 below, two sequences are arranged along the axes of a simple graph. At every point where the two sequences are identical, a dot is placed (i.e. at the intersection of every row and column that have the same letter in both sequences).
Figure 1. Simple dot plot
A diagonal stretch of dots will indicate regions where the two sequences are similar. A trained eye can pick out a diagonal pattern of similarity in Figure 1, but statistical methods can be applied to make the results more apparent.
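The dot-placement rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the program used to produce the figures: the function names `dot_matrix` and `render` and the example sequences are assumptions made for this illustration.

```python
def dot_matrix(seq_a, seq_b):
    """Build the dot plot grid (hypothetical helper, not from the original page).

    seq_a runs along the horizontal axis and seq_b along the vertical axis.
    grid[i][j] is True when seq_b[i] and seq_a[j] are the same letter.
    """
    return [[b == a for a in seq_a] for b in seq_b]


def render(grid, seq_a, seq_b):
    """Return a text version of the plot: '*' marks a dot, '.' marks no match."""
    lines = ["  " + " ".join(seq_a)]
    for letter, row in zip(seq_b, grid):
        lines.append(letter + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


# Example sequences are made up for illustration only.
example_a = "GATTACAGGT"
example_b = "GATCACAGGT"
print(render(dot_matrix(example_a, example_b), example_a, example_b))
```

In the printed grid, a run of `*` characters along a diagonal corresponds to a stretch where the two sequences agree, while isolated `*` characters elsewhere are chance matches between single letters.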
A filter can be used to place a dot only when a group of successive bases match.
In Figure 2 the same dot plot is shown with a filter, such that a dot is printed only if at least 3 of the 4 bases in a 4-base window match.
To detect more distant similarities, it may be better to use a much larger window (e.g. 20, 30, or even 50 bases) and some suitable percentage of identities (perhaps 50%).
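The window filter can be sketched in the same way. This is a minimal sketch under stated assumptions: the name `filtered_dot_matrix` and its parameters are hypothetical, the dot is recorded at the starting position of each window (a real program might place it at the window centre instead), and "3 of 4" is read as "at least 3 of 4".

```python
def filtered_dot_matrix(seq_a, seq_b, window=4, min_matches=3):
    """Windowed dot plot (hypothetical helper, not from the original page).

    grid[i][j] is True when the window starting at seq_b[i] and the window
    starting at seq_a[j] agree in at least `min_matches` of `window` positions,
    compared along the diagonal. Assumption: the dot marks the window start.
    """
    rows = max(len(seq_b) - window + 1, 0)
    cols = max(len(seq_a) - window + 1, 0)
    grid = []
    for i in range(rows):
        row = []
        for j in range(cols):
            # Count identical positions inside the two aligned windows.
            matches = sum(seq_b[i + k] == seq_a[j + k] for k in range(window))
            row.append(matches >= min_matches)
        grid.append(row)
    return grid


# Example sequences are made up for illustration only.
for row in filtered_dot_matrix("GATTACAGGT", "GATCACAGGT"):
    print("".join("*" if hit else "." for hit in row))
```

With these parameter names, the larger window suggested above would correspond to a call such as `filtered_dot_matrix(a, b, window=20, min_matches=10)`, that is, 50% identity over 20 bases.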
![]()
Figure 2. Dot plot with a 4 base 75% identity window
However, for real data, these patterns may not be quite so obvious.
![]()
Figure 3. A sample dot plot of two 230 AA database sequences
Using Computers for Molecular Biology
Stuart M. Brown, Ph.D, RCR, NYU Medical Center Comments to: browns02@mcrcr.med.nyu.edu