
Dotplots

*** The concept of similarity can be illustrated by a simple dot plot. Dot plots are used to compare two sequences visually and to detect regions of close similarity between them.

*** In Figure 1 below, two sequences are arranged along the axes of a simple graph. A dot is placed at every point where the two sequences are identical, i.e. at the intersection of every row and column that carry the same letter.
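The construction just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the program used to draw the figures: the function names `dot_matrix` and `render` are made up for this example, and the two short sequences are hypothetical rather than the ones plotted in Figure 1.

```python
def dot_matrix(seq_a, seq_b):
    """Return a grid with True wherever seq_a[i] == seq_b[j]."""
    return [[a == b for b in seq_b] for a in seq_a]


def render(matrix, seq_a, seq_b):
    """Draw the grid as text: seq_b across the top, seq_a down the side."""
    lines = ["  " + " ".join(seq_b)]
    for base, row in zip(seq_a, matrix):
        lines.append(base + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


# Hypothetical example sequences, not the ones plotted in Figure 1.
seq_a = "GATTACA"
seq_b = "GCATTAC"
plot = dot_matrix(seq_a, seq_b)
print(render(plot, seq_a, seq_b))
```

In the printed grid, the shared stretch ATTAC shows up as a diagonal run of `*` characters, surrounded by scattered dots from chance matches of single letters.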



Figure 1. Simple dot plot


*** A diagonal stretch of dots will indicate regions where the two sequences are similar. A trained eye can pick out a diagonal pattern of similarity in Figure 1, but statistical methods can be applied to make the results more apparent.


*** A filter can be used to place a dot only when a group of successive bases match.

*** In Figure 2 the same dot plot is shown with a filter applied: a dot is printed only if at least 3 of the 4 bases in a 4-base window match.

*** To detect more distant similarities, it may be better to use a much larger window (e.g. 20, 30, or even 50 bases) and some suitable percentage of identities (perhaps 50%).
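A window filter of this kind might look like the following Python sketch. The function name `filtered_dot_matrix` and its parameters `window` and `min_matches` are assumptions made for illustration, as is the choice to anchor each window at its first base and to treat the threshold as "at least" that many matches. The sequences are again hypothetical.

```python
def filtered_dot_matrix(seq_a, seq_b, window=4, min_matches=3):
    """Mark (i, j) when the windows starting at seq_a[i] and seq_b[j]
    agree in at least min_matches of their window positions."""
    rows = len(seq_a) - window + 1
    cols = len(seq_b) - window + 1
    matrix = []
    for i in range(rows):
        row = []
        for j in range(cols):
            # Count identical positions along the diagonal of this window.
            matches = sum(seq_a[i + k] == seq_b[j + k] for k in range(window))
            row.append(matches >= min_matches)
        matrix.append(row)
    return matrix


# Hypothetical example sequences, not the ones plotted in Figure 2.
seq_a = "GATTACA"
seq_b = "GCATTAC"
filtered = filtered_dot_matrix(seq_a, seq_b)
for row in filtered:
    print(" ".join("*" if hit else "." for hit in row))
```

For more distant similarities, the same function could be called with, say, `window=20, min_matches=10` to get the 50% identity threshold suggested above. With `window=1, min_matches=1` it reduces to the unfiltered dot plot.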

Figure 2. Dot plot with a 4-base, 75% identity window



However, for real data, these patterns may not be quite so obvious.



Figure 3. A sample dot plot of two 230 AA database sequences





Using Computers for Molecular Biology
Stuart M. Brown, Ph.D, RCR, NYU Medical Center
Comments to: browns02@mcrcr.med.nyu.edu