
Restriction Mapping

*** Making restriction maps is a routine lab activity that is necessary for any type of cloning project.

*** In addition, maps are a common way for labs to archive information about entire libraries of plasmid constructs.

*** For archival purposes, it is important that map data are stored in a reliable format that is broadly supported in the bioinformatics community so that the obsolescence of a particular computer program does not render archives unusable.

*** High quality maps are important for publications and exchange of information between researchers or between labs.

*** GCG has a perfectly functional set of restriction mapping tools, but as with all GCG programs, a certain amount of learning is required to get the most out of them.

*** Since restriction mapping is such a computationally trivial problem, many researchers prefer to use a variety of custom software for personal computers.

*** If all you need to know is "where are the BamHI sites in pUC18?", then virtually any software will suffice.

*** If you need to print out publication quality circular maps of your vector construction strategy, then a custom Macintosh application would probably be your best bet.

*** I am not aware of any first rate mapping applications available via free WWW servers.




MAP

*** MAP is the GCG restriction enzyme mapping program. Like a lot of GCG programs, it is very powerful and quite complex.

*** By default, MAP includes protein translations in the 3 forward reading frames; this can be changed to any chosen frames, all 6, or none.
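
*** To make the idea of reading frames concrete, here is a minimal Python sketch of three-frame and six-frame translation. It is not GCG code; the function names (translate, translate_frames) and the frame numbering (1 to 3 forward, -1 to -3 reverse) are assumptions made for illustration, and the standard genetic code is assumed.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq, offset=0):
    """Translate seq starting at a 0-based offset; unknown codons become X."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def translate_frames(seq, frames=(1, 2, 3)):
    """Translate the requested frames (hypothetical numbering: 1..3, -1..-3)."""
    result = {}
    for frame in frames:
        strand = seq if frame > 0 else reverse_complement(seq)
        result[frame] = translate(strand, abs(frame) - 1)
    return result
```

*** Calling translate_frames(seq) gives the three forward frames, and translate_frames(seq, frames=(1, 2, 3, -1, -2, -3)) gives all six.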

*** Restriction sites can be mapped for all enzymes (the default), or for any enzymes that you specify by name.

*** The output can be viewed as a linear map or in a table format (with the -TABLE option).

*** You can also select enzymes that have recognition sites of 6 bases or more (six-cutters in lab slang) with the parameter -MINS=6, just blunt end cutters (-OVERHANG=0), or just enzymes that leave 5' and/or 3' overhangs (-OVERHANG=5, -OVERHANG=3, or -OVERHANG=5,3).
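
*** The selection by site length and overhang type can be sketched in a few lines of Python. This is an illustration, not how GCG implements it: the Enzyme record, the tiny enzyme table, and the select_enzymes function are hypothetical, and the overhang rule below assumes palindromic recognition sites.

```python
from typing import NamedTuple


class Enzyme(NamedTuple):
    name: str
    site: str  # recognition sequence, written 5' to 3'
    cut: int   # number of site bases to the left of the top-strand cut


# A small illustrative table; a real program would read REBASE data.
ENZYMES = [
    Enzyme("EcoRI", "GAATTC", 1),   # G^AATTC
    Enzyme("BamHI", "GGATCC", 1),   # G^GATCC
    Enzyme("PstI", "CTGCAG", 5),    # CTGCA^G
    Enzyme("KpnI", "GGTACC", 5),    # GGTAC^C
    Enzyme("SmaI", "CCCGGG", 3),    # CCC^GGG
    Enzyme("HaeIII", "GGCC", 2),    # GG^CC
]


def overhang(enzyme):
    """Return 5, 3, or 0 (blunt) for an enzyme with a palindromic site."""
    bottom_cut = len(enzyme.site) - enzyme.cut
    if enzyme.cut < bottom_cut:
        return 5
    if enzyme.cut > bottom_cut:
        return 3
    return 0


def select_enzymes(enzymes, min_site=0, overhangs=None):
    """Keep enzymes with a long enough site and an accepted overhang type."""
    return [
        e.name
        for e in enzymes
        if len(e.site) >= min_site
        and (overhangs is None or overhang(e) in overhangs)
    ]
```

*** For example, select_enzymes(ENZYMES, min_site=6) keeps only the six-cutters, and select_enzymes(ENZYMES, overhangs={0}) keeps only the blunt end cutters.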

*** MAP offers a large number of other customization options that include:

  • treating a sequence as circular (-CIR)

  • allowing for mismatches between an enzyme's recognition site and the corresponding site in your test sequence (-MIS)

  • limiting the output to enzymes that cut just once (-ONCE), at least twice or any other minimum number of times (-MINCUTS=2), no more than twice or any other maximum number of times (-MAXCUTS=2), or not at all (-NONCUT)

  • excluding specific regions of your test sequence from the enzyme search (-EXCLUDE=1,200)

  • searching for places where a single base mutation can create a recognition site for a specific enzyme (-SILENT)

*** Read the program documentation for even more customization options.
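
*** Several of the options above (circular sequences, mismatches, and limits on the number of cuts) are easy to picture with a short Python sketch. This is not GCG's algorithm; find_sites and filter_by_cuts are hypothetical names, positions are reported as the 1-based start of the recognition site rather than the cut point, and only the top strand is searched, which is sufficient for palindromic sites.

```python
def find_sites(seq, site, circular=False, max_mismatch=0):
    """Return 1-based start positions where site matches seq.

    With circular=True, matches that span the origin are also reported.
    Assumes the site is no longer than the sequence.
    """
    seq, site = seq.upper(), site.upper()
    m = len(site)
    # For a circular molecule, append the first m-1 bases so that
    # windows wrapping around the origin can be examined.
    text = seq + seq[:m - 1] if circular else seq
    hits = []
    for i in range(len(text) - m + 1):
        window = text[i:i + m]
        mismatches = sum(1 for a, b in zip(window, site) if a != b)
        if mismatches <= max_mismatch:
            hits.append(i + 1)
    return hits


def filter_by_cuts(site_map, min_cuts=0, max_cuts=None):
    """Keep enzymes whose number of sites lies within the given limits."""
    return {
        name: positions
        for name, positions in site_map.items()
        if len(positions) >= min_cuts
        and (max_cuts is None or len(positions) <= max_cuts)
    }


# A made-up 26-base test sequence (not a real plasmid).
SEQ = "TTCAAGCTTGGATCCAAGGATCCGAA"
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT", "PstI": "CTGCAG"}
SITE_MAP = {name: find_sites(SEQ, site, circular=True) for name, site in SITES.items()}
```

*** In this sketch, filter_by_cuts(SITE_MAP, min_cuts=1, max_cuts=1) plays the role of -ONCE, and filter_by_cuts(SITE_MAP, max_cuts=0) plays the role of -NONCUT. The EcoRI site in the test sequence spans the origin, so it is found only when the sequence is treated as circular.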




FINDPATTERNS

*** FINDPATTERNS is a GCG program that allows you to search a sequence for short patterns such as restriction enzyme recognition sites, promoter binding sites, etc.

*** If you are looking for a restriction site that is not in the list available to MAP (i.e. not in the current version of REBASE) or for a variant on a restriction site that cannot be specified with the -MISMATCH or -SILENT options of MAP, then try FINDPATTERNS.

*** FINDPATTERNS can be run interactively so that it simply asks you to type in the pattern that you are searching for and the sequence(s) to search.

*** More complex patterns can be supplied as a "pattern data" text file (see the GCG documentation for FINDPATTERNS for the file format).

*** FINDPATTERNS accepts the -MIS command line option to allow mismatches, as well as other MAP options such as -CIR, -MINCUTS, -MAXCUTS, and -ONCE.
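
*** As a rough picture of what searching for a degenerate pattern involves, the Python sketch below expands IUPAC ambiguity codes into a regular expression. It does not reproduce the FINDPATTERNS pattern syntax (see the GCG documentation for that); find_pattern and pattern_to_regex are hypothetical names, and only the top strand is searched.

```python
import re

# IUPAC nucleotide ambiguity codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def pattern_to_regex(pattern):
    """Compile an IUPAC pattern; the lookahead allows overlapping matches."""
    body = "".join(IUPAC[base] for base in pattern.upper())
    return re.compile("(?=(" + body + "))")


def find_pattern(seq, pattern):
    """Return (1-based position, matched text) for every match in seq."""
    regex = pattern_to_regex(pattern)
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(seq.upper())]
```

*** For example, find_pattern(seq, "GTYRAC") reports every place where a sequence matches GTYRAC, with Y standing for C or T and R standing for A or G.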




MAPPLOT

*** MAPPLOT uses the same algorithm (and the same options) as MAP, but creates a graphical output designed for a plotter.

*** These output files can be saved in GCG's FIGURE format, transferred to a Mac by FTP, and then viewed and printed quite nicely with GCG's free Mac program "GCGFigure".

*** MAPPLOT can also create a text version of its output suitable for printing on an ordinary laser printer.

*** See the notes on GCG Graphics for instructions.




MAPSORT

*** MAPSORT simulates a restriction digest and allows you to predict the sizes of the digest products with any combination of enzymes.

*** It uses essentially the same set of options as MAP and MAPPLOT.
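
*** The arithmetic behind a simulated digest is simple, as the Python sketch below shows. It is only an illustration, not MAPSORT itself; digest_fragments is a hypothetical name, and each cut position is assumed to be the number of bases to the left of the cut.

```python
def digest_fragments(length, cut_positions, circular=False):
    """Return fragment sizes, largest first, for a complete digest.

    length        -- total length of the sequence in bases
    cut_positions -- cut points, each given as the number of bases
                     to the left of the cut (duplicates are ignored)
    circular      -- True for a plasmid, False for a linear molecule
    """
    cuts = sorted(set(cut_positions))
    if not cuts:
        return [length]  # uncut molecule
    # Fragments between consecutive cuts.
    fragments = [b - a for a, b in zip(cuts, cuts[1:])]
    if circular:
        # The last cut joins back to the first cut across the origin.
        fragments.append(length - cuts[-1] + cuts[0])
    else:
        # A linear molecule also has a fragment at each end.
        fragments.append(cuts[0])
        fragments.append(length - cuts[-1])
    # Drop zero-length pieces that arise from a cut exactly at an end.
    return sorted((f for f in fragments if f > 0), reverse=True)
```

*** A double digest is simulated by combining the cut lists of both enzymes before calling the function, for example digest_fragments(1000, [200, 700] + [450], circular=True).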




PLASMIDMAP

*** PLASMIDMAP is a GCG program that produces a "publication quality circular map" of a plasmid construct.

*** It uses input files generated by MAPSORT and a text file that contains data about blocks and ranges within a sequence to create output in the GCG FIGURE format.

*** The use of this program is too complex to explain here. This program has great power and can be used to produce very elegant and intricate figures, but it is not for the GCG novice (or anyone with little patience).




MacVector & OMIGA

*** Making and printing restriction maps in MacVector and OMIGA is so simple that I don't need to cover it here.

*** Many people find these programs to be superior to GCG for this function.

*** I do not recommend using these programs to archive data for plasmid constructs.

*** The file formats used are proprietary and there is no guarantee that these files will be readable a few years from now given the ever accelerating pace of change in the personal computer industry.

*** GCG uses universal text and graphics file formats for all of its program output, so you can be reasonably sure that these data will remain readable.





Strider

For simple restriction mapping projects, nothing beats the old reliable DNA Strider program, written way back in 1989 by Christian Marck. The current status of this program is unclear, but an older version is available in the Academic Computing online software archive.



Using Computers for Molecular Biology
Stuart M. Brown, Ph.D., RCR, NYU Medical Center
Comments to: browns02@mcrcr.med.nyu.edu