Making restriction maps is a routine lab activity that is necessary for any type of cloning project.
In addition, maps are a common way for labs to archive information about entire
libraries of plasmid constructs.
For archival purposes, it is important that map data are stored in a reliable format that is broadly supported in the bioinformatics community so that the obsolescence of a particular computer program does not render archives unusable.
High-quality maps are also important for publications and for the exchange of
information between researchers or between labs.
GCG has a perfectly functional set of restriction mapping tools, but as with
all GCG programs, a certain amount of learning is required to get the most out
of them.
Since restriction mapping is such a computationally trivial problem, many researchers prefer to use a variety of custom software for personal computers.
If all you need to know is "where are the BamHI sites in pUC18?", then virtually any software will suffice.
If you need to print out publication-quality circular maps of your vector construction strategy, then a custom Macintosh application would probably be your best bet.
I am not aware of any first-rate mapping applications available via free WWW
servers.
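To illustrate how simple the basic problem is, here is a minimal Python sketch that answers the "where are the BamHI sites" question by plain string searching. The function name find_sites and the toy sequence are made up for this example (it is not the real pUC18 sequence). Because the BamHI recognition site GGATCC is palindromic, searching one strand is enough; a non-palindromic site would also need a search with its reverse complement.

```python
def find_sites(sequence, site):
    """Return the 1-based start position of every occurrence of `site`."""
    sequence = sequence.upper()
    site = site.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start + 1)             # report 1-based coordinates
        start = sequence.find(site, start + 1)  # step by 1 to catch overlapping sites
    return positions


# Hypothetical toy sequence used only for illustration
toy = "AAGGATCCTTTGAATTCGGGATCCA"
print(find_sites(toy, "GGATCC"))  # BamHI sites -> [3, 19]
print(find_sites(toy, "GAATTC"))  # EcoRI site  -> [12]
```

The positions reported are where the recognition site starts, not where the enzyme actually cuts; a real mapping program also tracks the cut offset for each enzyme.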
MAP
MAP is the GCG restriction enzyme mapping program. Like a lot of GCG programs,
it is very powerful and quite complex.
By default, MAP includes protein translations in the 3 forward reading frames; this can be changed to any single frame, all 6 frames, or none.
Restriction sites can be mapped for all enzymes (the default), or for any
enzymes that you specify by name.
The output can be viewed as a linear map or in a table format
(with the -TABLE option).
You can also select only enzymes with recognition sites of 6 bases or
longer (six-cutters in lab slang) with the parameter
-MINS=6, only blunt-end cutters (-OVERHANG=0),
or only enzymes that leave 5' and/or 3' overhangs (-OVERHANG=5,
-OVERHANG=3, or -OVERHANG=5,3).
MAP offers a large number of other customization options that include:
- treating a sequence as circular (-CIR)
- allowing for mismatches between an enzyme's recognition site and the corresponding site in your test sequence (-MIS)
- limiting the output to enzymes that cut just once (-ONCE), at least two (or any number of) times (-MINCUTS=2), no more than two (or any number of) times (-MAXCUTS=2), or not at all (-NONCUT)
- excluding specific regions of your test sequence from the enzyme search (-EXCLUDE=1,200)
- searching for places where a single base mutation can create a recognition site for a specific enzyme (-SILENT)
Read the program documentation for even more customization options.
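The idea behind the cut-count filters in the list above can be sketched in a few lines of Python. This is only an illustration of the concept, not how MAP is implemented; the function names, the three-enzyme table, and the toy sequence are all assumptions made for this example. All three sites listed are palindromic, so only one strand is searched.

```python
import re

# Small illustrative enzyme table (recognition sites only, all palindromic)
ENZYMES = {"BamHI": "GGATCC", "EcoRI": "GAATTC", "HindIII": "AAGCTT"}


def count_sites(sequence, site):
    """Count occurrences of `site`, including overlapping ones."""
    # A zero-width lookahead matches at every start position of the site.
    return len(re.findall("(?=" + re.escape(site) + ")", sequence.upper()))


def filter_by_cuts(sequence, enzymes, min_cuts=0, max_cuts=None):
    """Keep enzymes whose number of sites lies between min_cuts and max_cuts."""
    selected = {}
    for name, site in enzymes.items():
        n = count_sites(sequence, site)
        if n >= min_cuts and (max_cuts is None or n <= max_cuts):
            selected[name] = n
    return selected


# Hypothetical toy sequence used only for illustration
toy = "AAGGATCCTTTGAATTCGGGATCCA"
print(filter_by_cuts(toy, ENZYMES, min_cuts=1, max_cuts=1))  # in the spirit of -ONCE
print(filter_by_cuts(toy, ENZYMES, min_cuts=2))              # in the spirit of -MINCUTS=2
print(filter_by_cuts(toy, ENZYMES, max_cuts=0))              # in the spirit of -NONCUT
```

With this toy sequence the three calls select EcoRI (one site), BamHI (two sites), and HindIII (no sites), respectively.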
FINDPATTERNS
FINDPATTERNS is a GCG program that allows you to search a sequence for short patterns such as restriction enzyme recognition sites, promoter binding sites, etc.
If you are looking for a restriction site that is not in the list available to
MAP (i.e. not in the current version of REBASE) or for a variant on a
restriction site that cannot be specified with the -MISMATCH
or -SILENT options of MAP, then try FINDPATTERNS.
FINDPATTERNS can be run interactively so that it simply asks you to type in the
pattern that you are searching for and the sequence(s) to search.
More complex patterns can be supplied as a "pattern data" text file
(see the GCG documentation for FINDPATTERNS for the file format).
Mismatches are allowed with the -MIS command line option, and other MAP
options such as -CIR, -MINCUTS, -MAXCUTS, and -ONCE are also available.
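To make the idea of a mismatch-tolerant pattern search concrete, here is a minimal Python sketch that slides a window along the sequence and counts mismatched positions. It is a plain brute-force scan written for this example and makes no claim about how FINDPATTERNS works internally; the function name, the parameter max_mismatches, and the toy sequence are all hypothetical. It handles only literal A/C/G/T patterns on one strand, not ambiguity codes.

```python
def find_with_mismatches(sequence, pattern, max_mismatches=0):
    """Return (1-based position, matched text, mismatch count) for each hit."""
    sequence = sequence.upper()
    pattern = pattern.upper()
    hits = []
    for i in range(len(sequence) - len(pattern) + 1):
        window = sequence[i:i + len(pattern)]
        # Count positions where the window and the pattern differ
        mismatches = sum(1 for a, b in zip(window, pattern) if a != b)
        if mismatches <= max_mismatches:
            hits.append((i + 1, window, mismatches))
    return hits


# Hypothetical toy sequence: one exact BamHI site and one near-site
toy = "AAGGATCCTTGGTTCCAA"
print(find_with_mismatches(toy, "GGATCC"))     # [(3, 'GGATCC', 0)]
print(find_with_mismatches(toy, "GGATCC", 1))  # [(3, 'GGATCC', 0), (11, 'GGTTCC', 1)]
```

The second call shows why near-matches are useful: the site at position 11 is one base change away from a BamHI site.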
MAPPLOT
MAPPLOT uses the same
algorithm (and the same options) as MAP, but creates a graphical output
designed for a plotter.
These output files can be saved in GCG's FIGURE format, transferred to a Mac by FTP, and then viewed and printed quite nicely with GCG's free Mac program "GCGFigure".
MAPPLOT can also create a text form of its output file suitable for printing on an ordinary laser printer.
See the notes on GCG Graphics for instructions.
MAPSORT
MAPSORT simulates a
restriction digest and allows you to predict the sizes of the
digest products with any combination of enzymes.
It uses essentially the same set of options as MAP and MAPPLOT.
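The arithmetic behind a simulated digest is simple enough to sketch in Python. This is an illustration of the idea only, not MAPSORT's implementation; the function name fragment_sizes and the example numbers are invented. Each cut position is taken to be the number of bases to the left of the cut, and is assumed to lie strictly between 0 and the sequence length.

```python
def fragment_sizes(seq_length, cut_positions, circular=False):
    """Return digest fragment sizes, largest first.

    Each cut position is the number of bases to the left of the cut,
    assumed to satisfy 0 < cut < seq_length.
    """
    cuts = sorted(set(cut_positions))
    if not cuts:
        return [seq_length]  # uncut molecule stays in one piece
    if circular:
        # Fragments between neighbouring cuts, plus the one spanning the origin
        sizes = [b - a for a, b in zip(cuts, cuts[1:])]
        sizes.append(seq_length - cuts[-1] + cuts[0])
    else:
        # Add both ends of the linear molecule as fragment boundaries
        bounds = [0] + cuts + [seq_length]
        sizes = [b - a for a, b in zip(bounds, bounds[1:])]
    return sorted(sizes, reverse=True)


# Hypothetical 3000 bp construct cut at positions 500 and 1200
print(fragment_sizes(3000, [500, 1200]))                 # linear   -> [1800, 700, 500]
print(fragment_sizes(3000, [500, 1200], circular=True))  # circular -> [2300, 700]
```

For a digest with several enzymes, pool the cut positions from all of them into one list before calling the function. Note that a circular molecule with n cuts gives n fragments, while a linear one gives n + 1.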
PLASMIDMAP
PLASMIDMAP is a GCG program that produces a "publication quality circular map" of a plasmid
construct.
It uses input files generated by MAPSORT and a text file that contains data about blocks and ranges within a sequence to create output in the GCG FIGURE format.
The use of this program is too complex to explain here.
This program has great power and can be used to produce very elegant and
intricate figures, but it is not for the GCG novice (or anyone with little
patience).
MacVector & OMIGA
Making and printing restriction maps in MacVector and
OMIGA is so simple that I don't need to cover it here.
Many people find these programs to be superior to GCG for this function.
However, I do not recommend using these programs to archive data for plasmid constructs.
The file formats used are proprietary, and there is no guarantee that these files will be readable a few years from now given the ever-accelerating pace of change in the personal computer industry.
GCG uses universal text and graphics file formats for all of its program
output, so you can be reasonably sure that these data will remain readable.
Strider
For simple restriction mapping projects, nothing beats the old
reliable DNA Strider program. This was written way back in 1989
by Christian Marck. The current status of this program is unclear,
but an older version is available in the
Academic Computing online software archive.
Using Computers for Molecular Biology
Stuart M. Brown, Ph.D., RCR, NYU Medical Center
Comments to: browns02@mcrcr.med.nyu.edu