Unix file and directory names follow the exact same naming conventions and rules.
| Unix file names are case sensitive | |
| Unix file names can be up to 255 characters long on most modern Unix file systems | |
| Unix file names may contain almost any character on the keyboard. However, certain characters create problems, and files whose names use them can apparently become inaccessible and undeletable. | |
| Characters you should not use in filenames: | ; , ! @ # $ ( ) < > / \ " ' ` ~ { } [ ] = + & ^ <space> <tab> | |
| Character delimiters you should use to make names easier to read: _ - . : (but note that the ":" has special meaning in GCG) | |
| Wild cards on Unix systems work very similarly to DOS wild cards, with some differences, and use the same characters: * and ?. For that reason these characters also should not be used in file names. * stands for 0 or more characters and can be used anywhere in a file name specification. ? stands for exactly one character and can also be used anywhere in a file name. Unix understands wild cards; GCG does NOT use wild cards in the same way. |
| File name conventions. A Unix file name does not determine what the file does or contains; the name is only for the convenience of the user, not the system. The conventions listed here are mostly GCG conventions, and following them will make your life substantially easier. |
| *.seq | The file contains an RNA or DNA sequence in GCG format |
| *.pep | The file contains an Amino Acid sequence (a peptide) |
| *.msf | The file contains 2 or more aligned sequences and was created by pileup. |
| *.rsf | The file contains 1 or more sequences with extra information placed there by GCG version 9 or later. |
| *.figure | The file is a text-version of a graphics file. It must be printed or viewed by GCG from within the graphical interface to see the graphics. |
| *.progname |
Where progname is the name of a GCG program (such as pretty, gap, and others), the file contains the output from that program. |
| *.list | The file is a list of sequences. It may name sequences in some directory or sequences in the GCG databases. There is no restriction on the kind of sequence named, so a list file may contain the names of DNA, RNA, or protein sequences, and may also contain the names of other multiple sequence files such as msf and rsf files. |
| xxx_68.yyy |
GCG programs run within the graphical interface always create output files with an underscore followed by a number before the first (and only) period in the file name. |
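
The naming rules and wild card behavior described above can be tried out with a short Python sketch. This is only an illustration, not part of GCG or Unix: the function names (`unsafe_characters`, `is_safe_name`, `match_names`) and the sample file names are hypothetical, and the set of discouraged characters is simply the list given above plus the two wild card characters. Python's standard `fnmatch` module is used as a stand-in for shell wild card matching; it approximates the shell but is not identical (for example, it also treats `[` and `]` specially, and `*` matches a leading period).

```python
import fnmatch

# Characters the text advises against, plus the wild card characters * and ?.
# Assumption: this set is copied from the list above; it is not an official Unix rule.
UNSAFE_CHARS = set("|;,!@#$()<>/\\\"'`~{}[]=+&^ \t*?")


def unsafe_characters(name):
    """Return the sorted list of discouraged characters found in name."""
    return sorted(set(name) & UNSAFE_CHARS)


def is_safe_name(name):
    """True if name is non-empty and contains none of the discouraged characters."""
    return bool(name) and not unsafe_characters(name)


def match_names(pattern, names):
    """Return the names matching a wild card pattern (* and ?), case-sensitively."""
    return [n for n in names if fnmatch.fnmatchcase(n, pattern)]


if __name__ == "__main__":
    # Hypothetical file names, chosen only to exercise the rules.
    names = ["globin.seq", "Globin.seq", "globin.pep", "globin_68.gap", "my file.seq"]
    for n in names:
        print(n, is_safe_name(n), unsafe_characters(n))
    print(match_names("*.seq", names))
    print(match_names("globin.???", names))
```

Because matching is case sensitive, the pattern `globin.???` selects `globin.seq` and `globin.pep` but not `Globin.seq`, which mirrors the point that Unix treats upper and lower case as different characters.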