
NetFETCH

*** GCG version 10 introduced a new program called NetFETCH that retrieves sequences directly from NCBI's NetENTREZ web server. This ensures compatibility with the sequence names and numbers found in BLAST and ENTREZ searches.

*** NetFETCH can retrieve sequences by name or accession number. It can also retrieve multiple sequences at once if you place a comma between the sequence names or accession numbers.
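
If the names or accession numbers come from a list kept somewhere else, the comma-separated argument can be assembled programmatically before it is typed or pasted at the GCG prompt. The following is only a minimal sketch in Python, not part of GCG: the helper name `netfetch_query` is hypothetical, `EXAMPLE2` is a made-up placeholder rather than a real accession number, and the rule that no spaces may appear around the commas is an assumption.

```python
def netfetch_query(identifiers):
    """Join sequence names or accession numbers with commas.

    Assumption: NetFETCH expects the identifiers separated by commas only,
    with no spaces around them.
    """
    # Drop surrounding whitespace and skip empty entries.
    cleaned = [item.strip() for item in identifiers if item.strip()]
    if not cleaned:
        raise ValueError("at least one name or accession number is required")
    for item in cleaned:
        # An embedded comma or space would change how the list is split.
        if "," in item or any(ch.isspace() for ch in item):
            raise ValueError(f"unexpected comma or whitespace in {item!r}")
    return ",".join(cleaned)


# af026976 is the accession used in this section's example;
# EXAMPLE2 is a hypothetical placeholder, not a real accession number.
print("netfetch " + netfetch_query(["af026976", "EXAMPLE2"]))
```

The printed line is the command to enter at the GCG prompt; the sketch does not run NetFETCH itself.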

*** NetFETCH will retrieve an entire list of sequences found in a NetBLAST search. Simply type the name of the NetBLAST output file as input for NetFETCH:
> NetFETCH frag.blastp
*** The output of NetFETCH is a complete listing of the GenBank accessions, including all comments and annotations, in a format known as .RSF. This .RSF file can be loaded into SeqLab or used as input for any GCG program that handles multiple sequences, such as PILEUP.

*** An .RSF file that contains multiple sequences must be handled like a file in the .MSF format, by following the filename with a {*} symbol, as shown in this example:

> PILEUP frag.rsf{*}
*** An .RSF file that contains a single sequence is more easily dealt with if you first REFORMAT it into a standard GCG sequence file:
	> netfetch af026976 
	NetFetch retrieves sequences from NCBI listed in a NetBLAST
	output file. You can also use it to retrieve sequences individually by sequence	
	name or accession number.  The output of NetFetch is an RSF file.

 	What should I call the RSF output file (* af026976.rsf *) ?

 NETFETCH complete with:

     Output: af026976.rsf
     Server: www.ncbi.nlm.nih.gov
  Requested: 1
   Returned: 1
	> reformat af026976.rsf{*} 

	Reformat rewrites sequence file(s), scoring matrix file(s), or
	enzyme data file(s) so that they can be read by GCG programs.

    	af026976.seq  length: 3113 bp

	> ls
	af026976.rsf  af026976.seq
*** Watch out for the syntax of the REFORMAT command. The {*} must follow the filename directly, with no space. If you insert a space, GCG reformats the file into gibberish, turning the annotation into part of the sequence. It also deletes your original .RSF file, so you have to start over again with NETFETCH.
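
One way to guard against this mistake is to build the file specification programmatically and to keep a copy of the .RSF file before running REFORMAT. The following is only a minimal sketch in Python, not part of GCG: the helper names `rsf_spec`, `has_attached_wildcard`, and `backup_rsf`, and the `.bak` suffix, are hypothetical choices made for illustration.

```python
import shutil
from pathlib import Path

SUFFIX = "{*}"  # the symbol that must follow the filename with no space


def rsf_spec(filename):
    """Return the filename with {*} attached directly, with no space."""
    name = filename.strip()
    if not name or any(ch.isspace() for ch in name):
        raise ValueError(f"filename must not be empty or contain spaces: {filename!r}")
    return name + SUFFIX


def has_attached_wildcard(spec):
    """True only if the spec ends in {*} and contains no whitespace at all."""
    return (
        len(spec) > len(SUFFIX)
        and spec.endswith(SUFFIX)
        and not any(ch.isspace() for ch in spec)
    )


def backup_rsf(path):
    """Copy the .RSF file next to itself with a .bak suffix and return the copy's path."""
    src = Path(path)
    dst = src.with_name(src.name + ".bak")
    shutil.copy2(src, dst)
    return dst


# Build the argument for the REFORMAT example in this section.
print("reformat " + rsf_spec("af026976.rsf"))
```

Calling `backup_rsf("af026976.rsf")` before REFORMAT would leave a copy to fall back on, so a mistyped command would not force a second NETFETCH run.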



Using Computers for Molecular Biology
Stuart M. Brown, Ph.D., RCR, NYU Medical Center
Comments to: browns02@mcrcr.med.nyu.edu