Research Computing Newsletter


Research Computing News

Volume 7, Number 2
September 1999





The Medical Center Installs a Network Firewall

Towards the end of May, it was found that a limited number of computers in the Medical Center had been broken into by an automated attack using a newly discovered vulnerability in Sun's Solaris operating system. Because the nature of the attack was such that many machines at the MC could have been vulnerable, a protective firewall was established to isolate our network from the outside world.

This firewall is restrictive, and researchers in the MC have had difficulty living with its limitations. Unfortunately, no major improvement in the situation is likely in the near future. Within a month we expect that a new firewall, based on much better technology, will be installed and that the use of some basic services, such as FTP and the BLAST server, will be restored.

Real relief will come when the first phase of a network re-engineering project is completed. This phase will create a zoned network in which users are placed in zones with different levels of access to, and protection from, the Internet. People placed in the "open zone" will have nearly the same environment they had before the firewall was installed. There will, however, be significant restrictions on the machines that qualify, based on the data and services they use, the network infrastructure they are attached to, and the level of support their owners have for them. Setting up an open zone that is sufficiently secure will take time and commitment from those who would benefit from it.

The break-in was a nasty reminder of the dangers that the Internet harbors. The clean-up, which is still underway, is an even nastier taste of the very high costs that such an incident can incur. In view of this, the MC is committed to moving the network to a level of security that will provide appropriate protection for all the activities for which it is responsible.

To read the FAQ about the firewall and documentation on working within it, please look at:

http://www.med.nyu.edu/NYUMConly/netfaq.html



PILEUP web interface

The RCR has developed an interactive web interface for the GCG multiple alignment program known as PILEUP at this URL:

http://mcrcr0.med.nyu.edu/rcr/pileup.html

PILEUP is one of the RCR's most used programs, and one of the most difficult to use due to the cumbersome command line interface of GCG. PILEUP is particularly annoying, even for experienced GCG users, because it requires the user to first build a text file that is a list of names and/or accession numbers of sequences to be aligned. The web interface to PILEUP has greatly simplified this process by automatically building the list file.
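For readers curious what the list file looks like, here is a minimal Python sketch of building one by hand. It assumes the common GCG convention that everything above a line containing two periods is treated as a comment and that each following line names one sequence; the exact format expected by the RCR installation is not given here. The function name and the sequence names in the usage note are hypothetical.

```python
def build_list_file(sequence_specs, comment="Sequences to align with PILEUP"):
    """Return the text of a list file naming one sequence per line.

    Assumption: a line of two periods ("..") separates the free-text
    comment from the sequence entries, as in the usual GCG list-file layout.
    """
    # Drop blank entries and surrounding whitespace from pasted names.
    entries = [spec.strip() for spec in sequence_specs if spec.strip()]
    lines = [comment, ".."] + entries
    return "\n".join(lines) + "\n"
```

For example, `build_list_file(["sw:hba_human", "sw:hbb_human"])` returns a four-line text: the comment, the `..` divider, and the two (hypothetical) sequence names. The web interface spares the user this step by generating the file automatically.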

The web interface to PILEUP, programmed expertly by Suzy Gottesman, allows the user to align a set of sequences by either copying and pasting sequences one by one directly into a text entry area in their web browser or by logging in to their RCR account and selecting sequence files from their directories.


PILEUP web interface screen shot


The ability to use the web to access files stored in personal accounts on the RCR's Alpha server (MCRCR0) is a major innovation. This feature will be a starting point for developing web interfaces to many other GCG programs as we continue our development efforts.

The PILEUP web interface has several other useful features. When building a list of sequences from an RCR account, users can select a file by name and view that sequence in a new browser window with the "View a File" button. Users can also locate files in sub-directories using the "Change Directory" and "Up One Directory" buttons.

Once a set of sequences has been chosen for alignment with PILEUP, the "Run Pileup" button starts the program. The multiple alignment is displayed directly in the browser window. After inspecting the results, the user can re-run the program with modified gap creation or gap extension settings using the "Run Pileup Again" button. The "Go Back to Building List File" button allows the user to add sequences to, or remove them from, the alignment.

Please make use of this web program and send suggestions on how it can be improved to Stuart Brown:
browns02@med.nyu.edu





Tirza Doniger joins the RCR

Tirza Doniger has recently joined the staff of the Research Computing Resource as Systems Manager. She received her Bachelor's Degree in Computer Science from Queens College in June of 1999. Her responsibilities include managing the Alpha server, as well as developing software to support Molecular Biology research. Tirza will handle the creation of new RCR accounts, forgotten/expired passwords, routine backups, and the recovery of lost files from backup tapes. Contact her at donigt01@mcrcr0.med.nyu.edu or by phone at 263-7136.


Bioinformatics Training Resource

Bioinformatics is a computer-intensive field, so it is quite natural to find that experts in this discipline tend to post a lot of useful information on the Web. In fact, most professors teaching bioinformatics courses, book authors, and creators of important algorithms have extensive web sites filled with tutorials and educational information. The challenge is to locate the resources that you need for your personal education without spending months of your time surfing the web in search of these gems.

There is a clear need for an organized central resource that collects links to bioinformatics training material. The Bioinformatics Training Resource (BTR), a new sub-section of the RCR website, is an attempt to fill this need:

http://mcrcr0.med.nyu.edu/rcr/btr/index.html

The BTR is an organized, annotated collection of links to online tutorials, online courses, essays, book chapters, course syllabi, glossaries, bibliographies of key papers, etc. In short, it offers everything that interested scientists need in order to train themselves in the emerging discipline of bioinformatics.

There are a few drawbacks in attempting to learn from the experts. Some of the material listed in the BTR is of a highly technical nature, most presumes a thorough understanding of molecular biology, and some even contains mathematics! Therefore, we have included some links to basic tutorials in molecular biology and related subjects to help you get up to speed. There is also a section called "Getting Started in Bioinformatics" that contains material suitable for high school students and those unfamiliar with biology and/or computer science. In addition, there are links to several online glossaries where biological and computing terms are thoroughly explained.

Please feel free to submit additional resources, comments/reviews of the resources, or general suggestions to improve the BTR website. Please send all comments to Stuart Brown:
browns02@med.nyu.edu

- Stuart Brown






Y2K Website for NYU School of Medicine

The School of Medicine's Y2K remediation effort continues and its effects have been felt throughout the institution. Even though you and your department have already been busy collecting data and fixing problems, please be sure you are staying completely up-to-date. Note that we have posted detailed help for upgrading both computers and software. There are training sessions on testing computers and equipment, and there are lists of frequently asked questions and answers. All of this information is available on the Y2K web site:

http://www.med.nyu.edu/Year2000

Take time to look through the material prepared by the MC, NYU centrally and the Mount Sinai School: things you discover may save you time and aggravation in the coming months. Every little bit helps!





PIR Database is Expanded

The PIR (Protein Information Resource) database now contains a complete, non-redundant set of all known protein sequences. The PIR was established in 1984 by the National Biomedical Research Foundation (NBRF) at Georgetown University Medical Center; it was formerly known as the "Atlas of Protein Sequence and Structure," produced by Margaret O. Dayhoff from 1965 to 1978.

PIR has long been used as the default protein sequence database for FASTA searches in the RCR's GCG program. PIR is a good starting point for protein database searching, slightly more comprehensive than SwissProt, but much smaller and better annotated than GenPept (which contains translations of all GenBank sequences including many "hypothetical sequences" of unknown function).

Now, in collaboration with MIPS (Munich Information Center for Protein Sequences) at the Max Planck Institute for Biochemistry in Martinsried, Germany, the NBRF has added a new section to PIR called PATCHX that contains a non-redundant set of all "other" protein sequences that are not included in PIR. According to the NBRF: "When the PIR-International Protein Sequence Database is supplemented by PATCHX, together they provide the most complete collection of protein sequence data currently available in the public domain."

The PATCHX database is built from translated sequences from EMBL and GenBank, and protein sequences from SwissProt, NRL_3D and Kabat. In addition, all sequences that are completely contained within others (subsequences) have been removed. Also, sequences with very high similarity to sequences in PIR-International are excluded from PATCHX.
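The subsequence-removal step described above can be illustrated with a short Python sketch. This is not the procedure actually used to build PATCHX, which is not documented here; it is a simple illustration that assumes sequences are plain strings and that "completely contained" means an exact substring match. The function name is hypothetical, and the separate high-similarity filter against PIR-International is not covered.

```python
def remove_contained(sequences):
    """Drop exact duplicates and any sequence found whole inside another.

    Illustrative only: compares every sequence against all longer ones
    kept so far, which is quadratic and too slow for a real database.
    """
    kept = []
    # Visit longest sequences first, so any sequence that could contain
    # the current one has already been examined. Ties are broken
    # alphabetically to make the output order deterministic.
    for seq in sorted(set(sequences), key=lambda s: (-len(s), s)):
        if not any(seq in longer for longer in kept):
            kept.append(seq)
    return kept
```

For instance, given the made-up fragments `["MKTAYIAK", "TAYI", "MKTAYIAK", "GGSL"]`, the function returns `["MKTAYIAK", "GGSL"]`: the duplicate is collapsed and `TAYI`, which occurs inside `MKTAYIAK`, is dropped.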

The current release of the database, PIR release 61 (June 1999), has 145,677 sequence entries (including 14,791 in the NRL_3D section of 3D protein sequences from the PDB), and PATCHX contains an additional 149,837 sequences.

RCR users do not need to use any special commands to include the PATCHX sequences in their searches. Any search using the PIR database will automatically include PATCHX. The comprehensive yet non-redundant nature of PIR should make this the database of choice for most sequence similarity searches. The SwissProt database is still useful when one wishes to limit a search to just well annotated sequences, and GenPept is useful when one is interested in searching all possible sequences (such as translations of ESTs that have unknown functions).

PIR should now completely replace the non-redundant OWL database. Researchers who wish to retain OWL on the RCR's computer system should contact Stuart Brown (browns02@med.nyu.edu); otherwise we will be discontinuing the OWL database.





Only You Can Stop Viruses

Although security experts keep hounding users and corporations to use antivirus software, and most PCs and servers now have it installed, the rate of virus infections is still rising, according to an annual survey of 300 large corporations conducted by ICSA Inc.

The 1999 survey shows that the difference between effective and non-effective use of anti-virus products translates into a 20-fold reduction in virus risk. Unfortunately, according to this survey, at least 40% of companies do not run anti-virus software on most desktops in full-time, background, automatic mode, even though all available evidence shows that this is the single most effective method of use. It is basic common sense that the minor irritation of waiting a few moments while diskettes are scanned for viruses is well worth it when the alternative is the risk of serious damage to your computer.

Make sure that you have anti-virus software and update it each month. It is essential! And it is NYUMC policy!

The likelihood of experiencing a computer virus has approximately doubled in each of the past four years. The average rate of infection was 88 virus incidents per 1000 PCs per MONTH in February 1999, despite the fact that most PCs are protected by some form of antivirus software. More than two-thirds (69%) of corporate IT managers reported virus disasters (25 or more PCs or servers infected at the same time) in the three months preceding the interviews. More than half of the respondents had encountered viruses sent via e-mail in their virus disasters. Companies tend to have continuous small problems with viruses (almost two incidents per week for a 1000 PC company), along with less frequent virus disasters, which occur roughly once per year.

It's not enough for companies and users to install antivirus software on servers and desktops; it is essential to update the software regularly. Companies must also implement security policies and educate users, for example by warning them not to open documents if they don't know the sender. Viruses have become very dynamic, spreading through downloads and attachments.

As an adjunct to the efforts of each individual, the Medical Center has set up a process that will filter incoming mail attachments for viruses. This has been quite successful so far and we have caught several bad viruses inbound to vulnerable machines. BUT mail filtering can never provide complete coverage, since it is only applied to the central mail servers and mail can reach you by paths that bypass these machines or the filter. It is essential that you don't relax your vigilance in managing your personal anti-virus software, which provides protection for all the functions on your machine, not just mail.

The addition of mail filtering for viruses should help, but only as a part of a comprehensive effort to protect computers in which your own anti-virus software continues to play the central role.