
ENTREZ

***In many situations, the best way to find sequences is to use the Web with a tool called ENTREZ.

***ENTREZ is a service of NCBI (National Center for Biotechnology Information), which is part of the US National Library of Medicine.

***The ENTREZ database contains all of the nucleotide and protein sequences in GenBank (updated daily!) along with a sequence-associated subset of MEDLINE. But ENTREZ is much more than a database: it is both a powerful search engine and a pre-computed list of relationships between all data elements.

***In practice, this means that you can search for a text term in sequence annotations or in MEDLINE abstracts, and find all articles, DNA, and protein sequences that mention that term. Then from any article or sequence, you can move to "related articles" or "related sequences".
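
A text-term search of this kind can also be scripted. The sketch below is not part of the original tutorial: it assumes NCBI's E-utilities web interface (the ESearch endpoint) and only builds the query URL for a given term and database, without contacting the server. The function name `build_entrez_search_url` and the example term are illustrative choices.

```python
from urllib.parse import urlencode

# Assumption: NCBI's E-utilities ESearch endpoint (not described in this tutorial).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_entrez_search_url(term, db="nucleotide", retmax=20):
    """Return an ESearch URL that looks for `term` in database `db`.

    `retmax` caps the number of record identifiers the server would return.
    """
    params = {"db": db, "term": term, "retmax": retmax}
    # urlencode escapes spaces and punctuation so the term is safe in a URL.
    return ESEARCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # The same text term can be sent to the literature and sequence databases.
    for db in ("pubmed", "nucleotide", "protein"):
        print(build_entrez_search_url("chitinase", db=db))
```

Pasting one of the printed URLs into a browser would be one way to try the query by hand; the Web interface described here remains the simpler route for occasional searches.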

***Relationships between sequences are computed with BLAST.

***Relationships between articles are computed with "MeSH" terms (shared keywords).

***Relationships between DNA and protein sequences rely on accession numbers.

***Relationships between sequences and MEDLINE articles rely on both shared keywords and the mention of accession numbers in the articles.

***These pre-computed relationships might include genes in the same multi-gene family, articles written about genes that have the same function, or other proteins that function in the same biochemical pathway.

***This potential for "horizontal movement" through the database makes ENTREZ really exciting. It allows you to start with only a vague set of keywords or a sequence identified in the laboratory and rapidly access a set of relevant literature and a list of related database sequences.
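
To make the idea of pre-computed relationships and "horizontal movement" concrete, here is a toy model in Python. It is only a sketch: the record identifiers and the links between them are invented for illustration, and the real ENTREZ links are computed by the methods listed earlier (BLAST, shared keywords, accession numbers), not by this code.

```python
from collections import deque

# Invented example data: each record maps to the records it is linked to.
LINKS = {
    "article:A1": {"article:A2", "dna:D1"},
    "article:A2": {"article:A1"},
    "dna:D1": {"article:A1", "protein:P1", "dna:D2"},
    "dna:D2": {"dna:D1"},
    "protein:P1": {"dna:D1", "protein:P2"},
    "protein:P2": {"protein:P1"},
}


def neighbors_within(start, max_hops):
    """Return {record: hops} for records reachable from `start`
    in at most `max_hops` link steps (breadth-first traversal)."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        record = queue.popleft()
        if hops[record] == max_hops:
            continue  # do not follow links beyond the hop limit
        for linked in sorted(LINKS.get(record, ())):
            if linked not in hops:
                hops[linked] = hops[record] + 1
                queue.append(linked)
    del hops[start]  # the starting record is not its own neighbor
    return hops


if __name__ == "__main__":
    # Start from one article and move "horizontally" up to two links away.
    for record, distance in sorted(neighbors_within("article:A1", 2).items()):
        print(distance, record)
```

Because the links are stored ahead of time, following them is a simple lookup rather than a new search, which is why moving from an article to its related sequences feels immediate.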



ENTREZ is best accessed via the WWW at:
http://www3.ncbi.nlm.nih.gov/Entrez/.


***There is also a stand-alone client application called NENTREZ that can be used without a WWW browser (but it still requires internet access).
***This NENTREZ program can also be used in conjunction with Netscape (as a plug-in) to create a cool 3-D sequence structure browser.
***Sequences identified with an ENTREZ search must be copied into a text file on your desktop computer, and then transferred to your RCR account for further work with GCG.
***The entire ENTREZ database was distributed on CD-ROM from 1992 until August 1996, but this was discontinued because of the huge size of the database (which filled six CDs by 1996) and the impossible task of keeping the CDs current with the rapidly growing database.



Using Computers for Molecular Biology
Stuart M. Brown, Ph.D., RCR, NYU Medical Center
Comments to: browns02@mcrcr.med.nyu.edu