UNIX Filenames explained
First, remember that UNIX is case sensitive! So a file named Myjunk.txt is different from MyJunk.txt, which is also different from myjunk.txt. This is a big change for VMS users, so get used to it!
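As a rough illustration, the Python sketch below creates the three names from the example in a temporary directory and counts how many separate files result. The helper name count_distinct_files is made up for this example, and the sketch assumes a typical case-sensitive UNIX filesystem; on a case-insensitive filesystem the three names would collapse into one file.

```python
import os
import tempfile

names = ["Myjunk.txt", "MyJunk.txt", "myjunk.txt"]

def count_distinct_files(filenames):
    """Create each name in a fresh temporary directory and count the files that result."""
    with tempfile.TemporaryDirectory() as workdir:
        for name in filenames:
            with open(os.path.join(workdir, name), "w") as handle:
                handle.write(name + "\n")
        return len(os.listdir(workdir))

# On a case-sensitive filesystem each spelling is its own file.
print(count_distinct_files(names))
```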
UNIX filenames should contain only letters, numbers, and the _ (underscore) and . (dot) characters. All other characters should be avoided. The / (slash) character is especially important to avoid, since it is used to designate subdirectories. Also, the ; character, which is used in all VMS filenames, can create difficulties in UNIX (the shell treats it as a command separator), so avoid it like the plague!
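The recommendation above can be turned into a quick check. This is only a sketch: the function name is_safe_name and the sample names are invented for illustration, and the pattern encodes the advice given here rather than any rule enforced by UNIX itself.

```python
import re

# Letters, digits, underscore, and dot only (the characters recommended above).
SAFE_NAME = re.compile(r"[A-Za-z0-9_.]+")

def is_safe_name(name):
    """Return True if name uses only the recommended characters."""
    return SAFE_NAME.fullmatch(name) is not None

# Spaces, semicolons, and slashes all make a name fail the check.
for candidate in ["myjunk.txt", "my junk.txt", "results;2", "data/run1.seq"]:
    print(candidate, is_safe_name(candidate))
```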
Most UNIX filenames start with a lower case letter and end with a dot followed by one, two, or three letters. However, this is just a common convention and is not required.
It is also possible to have additional dots in the filename. (This was not allowed in VMS.)
The part of the name that follows the dot is often used to designate the type of file:
- files that end in .txt are text files
- files that end in .c are source code in the "C" language
- files that end in .html are HTML files for the Web
But this is just a convention and not a rule enforced by the operating system.
This is a good and sensible convention and one that you should follow.
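Because the extension is only a convention, a program that wants to use it has to look at the name itself. The sketch below uses Python's os.path.splitext, which splits a name at its last dot, so names with additional dots still work. The lookup table and the function name describe are hypothetical and cover only the three examples listed above.

```python
import os.path

# Hypothetical lookup table built from the examples in the text.
KNOWN_TYPES = {
    ".txt": "text file",
    ".c": "C source code",
    ".html": "HTML file for the Web",
}

def describe(filename):
    """Guess a file's type from whatever follows the last dot."""
    _, extension = os.path.splitext(filename)
    return KNOWN_TYPES.get(extension, "unknown type")

for name in ["notes.txt", "my.lab.notes.txt", "align.c", "index.html", "README"]:
    print(name, "->", describe(name))
```

Note that the lookup is case sensitive, just like the filenames: notes.TXT would not match the .txt entry.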
It is also quite handy to use extensions to name related files for a single project, or types of files. I like to use .seq for DNA sequences and .pep for protein sequences. GCG programs tend to put their own extensions onto their output files - this is very handy - later you will know that files ending in .fasta are the output from FASTA searches.
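One payoff of consistent extensions is that related files are easy to sort out later. A small sketch, with invented file names and a made-up helper group_by_extension:

```python
from collections import defaultdict
import os.path

def group_by_extension(filenames):
    """Collect file names into lists keyed by their extension."""
    groups = defaultdict(list)
    for name in filenames:
        extension = os.path.splitext(name)[1]
        groups[extension].append(name)
    return dict(groups)

# Hypothetical project directory listing.
project = ["gene1.seq", "gene1.pep", "gene2.seq", "gene1.fasta"]
for extension, members in group_by_extension(project).items():
    print(extension, members)
```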
UNIX does not allow two files with the same name to exist in the same directory. Whenever a file is about to be created or copied into a directory where another file has that exact same name, the new file will overwrite (and delete) the older file. Depending on how your account is set up, UNIX may alert you when this is about to happen (for example, when cp and mv are run with the -i option), but it is easy to ignore the warning, or to use a program that ignores this warning for you.
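A program can protect against this kind of accidental overwrite itself. The sketch below relies on Python's "x" (exclusive creation) file mode, which refuses to open a file that already exists; the function name write_new_file and the sample file are assumptions made for illustration.

```python
import os
import tempfile

def write_new_file(path, text):
    """Write text to path only if nothing exists there yet."""
    try:
        with open(path, "x") as handle:  # "x" = exclusive creation
            handle.write(text)
        return True
    except FileExistsError:
        # The older file is left untouched.
        return False

with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "gene1.seq")
    print(write_new_file(target, "ACGT\n"))  # first write succeeds
    print(write_new_file(target, "TTTT\n"))  # second write is refused
```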
Most GCG programs have a default name for the output file. If you accept this default, then you will generally be overwriting some previous file that had the same name, even though the contents of those files may be very different. Always choose a new name for the output file and try to be as informative and specific as possible; you will be thankful later.
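Choosing an informative name is still up to you, but a script can at least make sure the name is not already taken. This sketch is not part of GCG; the helper unused_name and the example names are hypothetical. It appends a number to the name only when a file of that name already exists.

```python
import os
import tempfile

def unused_name(directory, stem, extension):
    """Return stem+extension, or stem_1, stem_2, ... if that name is taken."""
    candidate = stem + extension
    counter = 1
    while os.path.exists(os.path.join(directory, candidate)):
        candidate = f"{stem}_{counter}{extension}"
        counter += 1
    return candidate

with tempfile.TemporaryDirectory() as workdir:
    first = unused_name(workdir, "gene1_vs_genbank", ".fasta")
    open(os.path.join(workdir, first), "w").close()  # pretend a search wrote this file
    second = unused_name(workdir, "gene1_vs_genbank", ".fasta")
    print(first, second)
```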
Using Computers for Molecular Biology
Stuart M. Brown, Ph.D., RCR, NYU Medical Center
Comments to: browns02@mcrcr.med.nyu.edu