| Primary Databases | Protein Pattern and Structural Analysis | DNA Pattern Analysis | Other Bioinformatics Resources | Other Lists of Bioinformatics Resources |
NCBI is the number one resource for molecular biologists. It maintains GenBank and provides free BLAST and Entrez searches via e-mail, client software, or directly over the Web.
This is the premier Web engine for DNA and protein homology searches.
Entrez is a molecular sequence and document retrieval system that provides an integrated view of portions of MEDLINE and all publicly available nucleotide and protein databases. The Protein and Nucleotide entries in Entrez have been compiled from a variety of sources, including GenBank, EMBL, DDBJ, PIR, SWISS-PROT, PRF, and PDB. Entrez is extremely useful for obtaining cross-referenced documentation for a particular sequence once you know its database accession number.
The Genome Database (GDB) stores and curates human genomic mapping data submitted by researchers worldwide and provides this information electronically to the scientific community.
The Genome Sequence DataBase is dedicated to supporting scientific research and development by creating, maintaining and distributing a complete, timely, accurate and useful collection of DNA sequences and related information. The core sequence data at GSDB are incorporated within GenBank, but the GSDB is an on-line, client-server, relational database enabling complex SQL queries and much additional annotation.
Mirrors of GenBank/EMBL as well as local databases including the Codon Usage Database, Protein Mutant DB, C. elegans EST Database, and Bacillus subtilis Non-Redundant Database (NRSub). Provides sequence retrieval via "getentry", web-based homology searches with FASTA and BLAST, and multiple alignment using "MALIGN" and "CLUSTAL W".
OWL is a non-redundant superset of SWISS-PROT, PIR, GenPept, and NRL-3D. Entries are amalgamated from primary source databases by a process in which redundant and trivially different entries are eliminated. An up-to-date copy of OWL is maintained at the RCR (available from within GCG).
A WWW implementation of SRS, similar to the LOOKUP program available at RCR.
The Protein Data Bank is an archive of experimentally determined three-dimensional structures of biological macro-molecules, serving a global community of researchers, educators, and students.
A web-based search engine for PIR. Its extended field searching option lets you search any individual field in PIR and combine it with any other search field to limit the scope of your search.
PROSITE is Dr. Amos Bairoch's meticulously annotated database of biologically significant protein sites, patterns and profiles that help to identify to which known family of protein (if any) a new sequence belongs. This server allows only text searches of the database.
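To make the idea of a PROSITE pattern concrete, here is a minimal sketch in Python that translates the pattern notation into a regular expression and scans a sequence with it. It assumes only the common elements of the notation (`x` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, `(n)` or `(n,m)` for repeats, `<` and `>` for the sequence ends). The function name `prosite_to_regex` and the test sequence are hypothetical, and the sketch is not how the PROSITE server itself is implemented.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    pattern = pattern.rstrip(".")          # patterns are often written with a final period
    parts = []
    for element in pattern.split("-"):     # elements are separated by hyphens
        prefix = suffix = ""
        if element.startswith("<"):        # '<' anchors the pattern at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):          # '>' anchors the pattern at the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (3) or (2,4).
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)
        if core == "x":                    # 'x' stands for any residue
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]" # {ABC} means any residue except A, B, C
        # [ABC] and single residue letters are already valid regex syntax.
        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)

# The N-glycosylation site pattern, applied to a made-up sequence.
regex = prosite_to_regex("N-{P}-[ST]-{P}")
hits = [m.start() for m in re.finditer(regex, "MKNGSAPLNPSA")]
print(regex, hits)
```

In the made-up sequence, the asparagine at index 2 is followed by G, S, A and matches, while the one at index 8 is followed by a proline and does not.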
Blocks are short multiply aligned ungapped segments corresponding to the most highly conserved regions of proteins. The BLOCKS database is created automatically by taking the most highly conserved regions from groups of proteins in the PROSITE database and using them to search the SWISS-PROT database. These sequences are then aligned to form the BLOCKS database. An online search tool is available to compare user entered sequences against the database and also for text-based searches.
PRINTS is a compendium of protein fingerprints. A fingerprint is a group of conserved motifs used to characterize a protein family. Usually the motifs do not overlap, but are separated along a sequence. Fingerprints can encode protein folds and functionalities more flexibly and powerfully than can single motifs. The database thus provides a useful adjunct to PROSITE. This server provides both sequence similarity and text-based database searches. An interesting interactive multiple sequence alignment editor (known as CINEMA) is also available.
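The notion of a fingerprint as several separate motifs found in order along one sequence can be sketched as follows. This is a deliberate simplification: it treats each motif as a fixed-length regular expression and requires every motif to match, whereas real fingerprint searches score motifs rather than demanding exact matches. The function name `match_fingerprint`, the motifs, and the sequence are all hypothetical.

```python
import re

def match_fingerprint(sequence, motifs):
    """Return the start position of each motif if all occur in order
    without overlapping; return None if any motif is missing."""
    positions = []
    cursor = 0                              # search resumes after the previous motif
    for motif in motifs:
        hit = re.compile(motif).search(sequence, cursor)
        if hit is None:
            return None                     # one missing motif breaks the fingerprint
        positions.append(hit.start())
        cursor = hit.end()                  # forbid overlap with the next motif
    return positions

seq = "MAGWTKLLCHHEDRRPYV"                  # made-up sequence
print(match_fingerprint(seq, ["GW.K", "CHH", "R.Y"]))
print(match_fingerprint(seq, ["CHH", "GW.K"]))   # wrong order, so no match
```

Taking the leftmost hit for each motif is sufficient here because the motifs have fixed length; variable-length motifs would need a more careful search.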
The ProDom protein domain database consists of an automatic compilation of 9600 homologous domains detected in the SWISS-PROT database by the DOMAINER algorithm (Sonnhammer, E.L.L. & Kahn, D., 1994, Protein Sci. 3:482-492). The server provides sequence similarity searches of user sequences against the consensus sequences of the domains. Beautiful graphical representations are available of multiple alignments of all SWISS-PROT sequences that contain each domain.
Pfam is a large collection of multiple sequence alignments and hidden Markov models covering most common protein domains. You can search your favorite sequence against Pfam; access, view, and download individual alignments from Pfam; or download HMMs that you can use locally if you have installed the HMMER hidden Markov model software. Output pages are hyperlinked to other relevant databases, including SWISS-PROT, GenBank, PDB, PROSITE, and MEDLINE.
ProMod is a protein modeling tool that requires similarity to experimentally determined protein structures. It is based on knowledge-based protein modeling methods. The structure database used by Swiss-Model is derived from the Brookhaven Protein Data Bank (PDB).
Protein motif and structural prediction tools. Offers several tools linked to the PDB including MOOSE, Protein Kinase Database, and PDB Toolbox.
Offers web-based BLAST searching of protein domains and cross-references to the other major protein databases.
The UCLA-DOE Protein Fold-Recognition server is a new project aimed at helping with the computational analysis and prediction of structure from amino acid sequences. It is a comprehensive package providing users with computation time, storage and collection of data, and organization of the results for easy analysis.
The Restriction Enzyme Database is a collection of information about restriction enzymes, methylases, the microorganisms from which they have been isolated, recognition sequences, cleavage sites, methylation specificity, the commercial availability of the enzymes, and references - both published and unpublished observations.
Paste in your DNA sequence, choose your enzymes and get an instant restriction map.
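What such a restriction mapping tool does can be sketched in a few lines of Python. The sketch below assumes a linear DNA molecule and a small hand-entered table of three well-known enzymes, each of which cuts the top strand after the first base of its recognition site. The names `ENZYMES`, `restriction_map`, and `fragment_lengths`, and the example sequence, are hypothetical; a real tool would draw its enzyme data from a resource such as the Restriction Enzyme Database.

```python
# Recognition site and top-strand cut offset for three common enzymes.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "BamHI": ("GGATCC", 1),    # G^GATCC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
}

def restriction_map(dna, enzymes):
    """Return {enzyme name: [cut positions]} for a linear DNA sequence."""
    dna = dna.upper()
    cuts = {}
    for name in enzymes:
        site, offset = ENZYMES[name]
        positions = []
        start = dna.find(site)
        while start != -1:                  # step by one so overlapping sites are found
            positions.append(start + offset)
            start = dna.find(site, start + 1)
        cuts[name] = positions
    return cuts

def fragment_lengths(dna, cuts):
    """Return fragment sizes, left to right, after a complete digest."""
    points = sorted({p for ps in cuts.values() for p in ps})
    bounds = [0] + points + [len(dna)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

dna = "AAGAATTCTTGGATCCAA"                  # made-up 18-base sequence
cuts = restriction_map(dna, ["EcoRI", "BamHI", "HindIII"])
print(cuts)
print(fragment_lengths(dna, cuts))
```

Because all three recognition sites are palindromic, scanning the top strand alone finds every site; enzymes with non-palindromic sites would also require a scan of the reverse complement.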
This is the home of the TRANSFAC database (implemented as the local file TFDATA at RCR). It compiles data about gene regulatory DNA sequences and protein factors binding to them. This web site provides on-line programs that help to identify putative promoter or enhancer structures within your DNA sequences and to suggest their features. This site also provides a huge list of WWW links to sources of useful biology (and other) information on the Web.
SIGNAL SCAN finds homologies of published signal sequences to your sequence. Most of these signals are transcriptional elements.
An EXCELLENT online reference to bioinformatics. This is an online version of a chapter from a new book to be published by Cold Spring Harbor Press called GENOME ANALYSIS: A LABORATORY MANUAL.
The ExPASy WWW server is dedicated to molecular biology with an emphasis on data relevant to proteins. It allows you to browse through a number of databases produced in Geneva, such as SWISS-PROT, PROSITE, SWISS-2DPAGE, SWISS-3DIMAGE and SeqAnalRef. It also allows access to various sequence analysis tools.
BCM Search Launcher at the Baylor College of Medicine, Houston, Texas
The BCM Search Launcher is an on-going project to organize molecular biology-related search and analysis services available on the WWW by function, providing a single point of entry for related searches. WWW servers are grouped into the following categories: protein sequence/pattern searches, nucleic acid sequence searches, multiple sequence alignments, pairwise sequence alignments, gene features (motifs), sequence utilities, and protein secondary structure prediction.
They provide a web version of the GCG documentation (the same text found in GenHelp), and a list of other useful web sites.
The Dictionary of Cell Biology was first published in 1989, and has since been translated into several languages. It is intended to provide quick access to easily-understood and cross-referenced definitions of terms frequently encountered in reading the modern biology literature. This server contains the text of the Second edition, published in April 1995, together with enhancements, hypertext links and new entries which are destined for the third edition.
GCG is the home of the Wisconsin Sequence Analysis Package, the most comprehensive suite of DNA and protein sequence analysis tools available, and the core software offered by the RCR. The GCG web site offers the company newsletter, advertisements for GCG products, and some links to other biocomputing sites that offer useful information such as online documentation and tutorials for the GCG software.
This guide was written by Cary O'Donnell of the AFRC Computing Division, Harpenden Herts, AL5 2JE UK. It is widely considered to be the best available tutorial for GCG.
Provides a number of interesting services including the Arabidopsis, Rice, Corn, Pine, and Brassica napus cDNA Sequence Analysis Projects, the Virtual Genome Center with information about Candida albicans molecular biology, the Neuroscience Database Program, and a web-based Recombinant DNA Technology Course.
Pedro has collected an awesome set of WWW links that offer everything from on-line reading of your favorite journals to Web-based multiple sequence alignment tools. Virtually every biologist's work can benefit from the resources listed on this page.
This page organizes links to existing search engines in a coherent, stepwise fashion.
A commercial site (funded by vendors whose products are featured) that contains many useful links and embedded mini-search engines for specific databases.
A huge, well organized list of "biosciences" resources available on the Internet.
An excellent and no-nonsense collection of links with an emphasis on protein databases.
Here is a series of pointers to Internet resources of biological interest. These are tools used by our scientists on a daily basis. We hope they can be of help in your work.
Yahoo is the ultimate, comprehensive, hierarchically organized list of Internet resources. If you can't find it anywhere else, then look here.