CCP4 Tutorial: Session 1 - Introduction

1a) Setting up Project and Directory Aliases

The Main Window of CCP4i looks like:

[Figure: Main Window of CCP4 Interface]

It is composed of windows, menus and buttons. The most important of these are:

The gold-coloured bar at the top displays a help message, which is also displayed in a "bubble" when the mouse rests on a particular part of the interface. The HELP button on the right opens the documentation pages. Help for a specific entry widget can also be obtained by right-clicking on that widget.
The left-hand side of the main window is concerned with selection of jobs to run. The choose module pull-down menu is used to select the module. For each module, there is a list of tasks and folders. Folders are used to group related tasks within a module, and can be opened or closed by clicking on the title bar. Tasks can be selected by clicking on one. Modules, folders and tasks are described further in the next section.
In the centre of the main window is the Job List, which lists the jobs run via CCP4i. The status of each job (STARTING, RUNNING, FINISHED, FAILED) is shown. Jobs can be selected and deselected by clicking on the relevant line in the job list. Clicking the right-hand mouse button in the job list window presents a menu of actions that can be performed on the current selection.
On the right-hand side at the top there is the Directories&ProjectDir button. This opens a window for setting directories, and choosing the current project, see tutorial 1a.
The View Any File button can be used to select and view any file using the FileViewer Utility, for example to view an MTZ reflection file.
Then there are a number of Database Options. To use these, you must first select a job from the job list.
Next there are a number of Configurations Options. For example, you can specify non-CCP4 programs to be used.
The Mail CCP4 button allows you to send a message (comments, questions, problems) to the Interface developer. The message will be mailed immediately when SEND is pressed.
The Exit button exits the Interface.

1b) Introduction to the MTZ format

Modules, folders and tasks

CCP4i will run most of the frequently used programs from the CCP4 suite, but it is organised around the idea of tasks rather than programs. Usually one task corresponds to one program, but sometimes more than one program is required to perform a task, or one program is used in several tasks.

The tasks are grouped into modules according to the stage of the crystallographic process in which they are used (e.g. Density Improvement and Refinement are two separate modules). There are also utility modules containing tools that apply to the three main types of data (i.e. Map & Mask Utilities, Reflection Data Utilities and Coordinate Utilities), as well as modules for graphics, viewing and Clipper. Finally, there is the Program List, which contains an alphabetical list of all the CCP4 programs represented in CCP4i.

Tasks may be further grouped into folders within a module. These are used to group together sets of tasks which may perform similar functions, or which are otherwise related to each other.

MTZ file format

CCP4 reflection data is held in MTZ files. MTZ files are binary, i.e. they cannot be viewed with the more command or with an ASCII viewer. However, they can be easily listed at the command line with:

> mtzdmp foo.mtz

See CCP4 documentation for options to mtzdmp. Within CCP4i, MTZ files can be viewed by clicking on a View button next to the file name or by selecting the file from View Any File or View Files from Job.

MTZ files consist of two parts:

Reflection data is held as a sequence of records; each record holds structure factor amplitudes, phases, etc. for one set of hkl indices. For merged MTZ files, each set of hkl indices occurs only once in the file. For unmerged MTZ files, there may be more than one occurrence of a set of hkl indices.
There is also a file header which holds global information, such as spacegroup, cell dimensions, etc.
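
The difference between merged and unmerged data can be illustrated with a minimal Python sketch. It does not read a real MTZ file: the records are plain tuples of (h, k, l, value), the function name is_merged is hypothetical, and the repeated observation is a made-up value.

```python
from collections import Counter


def is_merged(records):
    """Return True if every (h, k, l) index triple occurs only once."""
    counts = Counter((h, k, l) for h, k, l, *_ in records)
    return all(n == 1 for n in counts.values())


# Each record: h, k, l, then one data value (illustrative only).
merged = [(0, 0, 2, 626.0), (0, 0, 4, 9111.0), (0, 1, 1, 1200.0)]

# A second observation of 0 0 2 makes the list "unmerged" (made-up value).
unmerged = merged + [(0, 0, 2, 631.5)]

print(is_merged(merged))    # True
print(is_merged(unmerged))  # False
```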

MTZ columns

Records of reflection data in the MTZ format may hold any number of pieces of data. Equivalent pieces of data in different records are referred to as columns:


                   column
                     |
                     |

    0   0   2      626.00    112.00      3.00
    0   0   4     9111.00    168.00     22.00
    0   0   6      513.00    146.00     20.00     --- record
    0   0   8     2610.00     52.00     10.00
    0   0  10         ?         ?       11.00
    0   1   1     1200.00     38.00     13.00
    0   1   2     2244.00     55.00     21.00
    0   1   3     2163.00     36.00      6.00
    0   1   4     6057.00     82.00     13.00
    0   1   5     3698.00     46.00     16.00

The columns are given names in the MTZ file header so that they can be identified:


  * Column Labels :
 
    H K L FP SIGFP FreeRflag

When running the programs directly, these names are used in the LABIN keyword:


  LABIN F=FP SIGF=SIGFP FREE=FreeRflag

The names before the = sign are the names that appear in the program documentation. The names after the = sign are the names that appear in the file.
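
To make the two sides of each assignment concrete, here is a small Python sketch that splits a LABIN line into a mapping from program label to file label. The function parse_labin is hypothetical and handles only the simple one-line form shown above; it does not attempt to reproduce the real CCP4 keyword parser.

```python
def parse_labin(line):
    """Split a simple LABIN line into {program_label: file_label}."""
    tokens = line.split()
    if not tokens or tokens[0].upper() != "LABIN":
        raise ValueError("not a LABIN line")
    assignments = {}
    for token in tokens[1:]:
        # Left of '=' is the program's label, right of '=' is the file's label.
        program_label, sep, file_label = token.partition("=")
        if not sep or not program_label or not file_label:
            raise ValueError(f"malformed assignment: {token!r}")
        assignments[program_label] = file_label
    return assignments


labin = parse_labin("LABIN F=FP SIGF=SIGFP FREE=FreeRflag")
print(labin)  # {'F': 'FP', 'SIGF': 'SIGFP', 'FREE': 'FreeRflag'}
```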

In CCP4i, these column assignments are made via pull-down menus, after the MTZ filename is entered.

The header contains more information which is useful for organising your data. For instance, each column has a defined type. For the above example, the types are:


  * Column Types :

    H H H F  Q     I

Column Types are used to provide an extra check that the column the user assigns to a requested program label is of the correct type. For more information, see COLUMN TYPES.
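
The kind of check that column types make possible can be sketched in Python as follows. The file labels and types are taken from the example header above; the table of types expected for each program label (EXPECTED_TYPES) and the function check_assignments are assumptions made for illustration, not part of any CCP4 program.

```python
# Column labels and types as listed in the example header above.
FILE_COLUMN_TYPES = dict(
    zip(["H", "K", "L", "FP", "SIGFP", "FreeRflag"],
        ["H", "H", "H", "F", "Q", "I"])
)

# Hypothetical: the column type a program expects for each of its labels.
EXPECTED_TYPES = {"F": "F", "SIGF": "Q", "FREE": "I"}


def check_assignments(assignments, file_types, expected_types):
    """Return a list of problems found in program-label=file-label pairs."""
    problems = []
    for program_label, file_label in assignments.items():
        if file_label not in file_types:
            problems.append(f"{program_label}={file_label}: no such column")
        elif file_types[file_label] != expected_types[program_label]:
            problems.append(
                f"{program_label}={file_label}: column type "
                f"{file_types[file_label]}, expected "
                f"{expected_types[program_label]}"
            )
    return problems


good = {"F": "FP", "SIGF": "SIGFP", "FREE": "FreeRflag"}
bad = {"F": "SIGFP"}  # a standard deviation assigned where an amplitude is expected

print(check_assignments(good, FILE_COLUMN_TYPES, EXPECTED_TYPES))  # []
print(check_assignments(bad, FILE_COLUMN_TYPES, EXPECTED_TYPES))
```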

The main file header also contains "dataset" properties. The columns of data are grouped into distinct "datasets". For example, a measurement and its standard deviation must be part of the same dataset. All information about a derivative makes up one dataset, distinct from the equivalent columns for the native, which may also be present in the file. A dataset is identified by a "project name", which specifies a particular structure determination, a "crystal name", which is essentially a single crystal form, and a "dataset name". Normally, all datasets in a file will have the same "project name" and different "dataset names". Each dataset has its own cell dimensions and wavelength. Dataset information comes into its own for Data Harvesting. For more information on project, crystal and dataset names, see MTZ FORMAT.
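
As a rough picture of how this grouping fits together, the following Python sketch models a dataset as a small record. It is not the MTZ header layout: the class Dataset, all names, the column labels FPH and SIGFPH, and the cell and wavelength values are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    project: str       # a particular structure determination
    crystal: str       # essentially a single crystal form
    name: str          # the dataset name
    cell: tuple        # a, b, c, alpha, beta, gamma (made-up values below)
    wavelength: float  # made-up values below
    columns: list = field(default_factory=list)


# Native and derivative data held in the same file: same project,
# different dataset names, each with its own cell and wavelength.
native = Dataset("myproject", "native_xtal", "native",
                 (78.1, 78.1, 37.2, 90.0, 90.0, 90.0), 1.5418,
                 ["FP", "SIGFP"])
derivative = Dataset("myproject", "deriv_xtal", "derivative",
                     (78.3, 78.3, 37.1, 90.0, 90.0, 90.0), 1.5418,
                     ["FPH", "SIGFPH"])

datasets = [native, derivative]
print({d.project for d in datasets})  # {'myproject'}
print([d.name for d in datasets])     # ['native', 'derivative']
```

A measurement and its standard deviation sit in the same columns list, which mirrors the rule that they must belong to the same dataset.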

1e) Appendix

COLUMN TYPES

H: index h,k,l
F: structure amplitude, F
J: intensity
D: anomalous difference
G: member of Friedel pair, F+ or F-
K: member of Friedel pair, I+ or I-
Q: standard deviation of J,F,D or other
L: standard deviation of F+ or F-
M: standard deviation of I+ or I-
P: phase angle in degrees
W: weight (of some sort)
A: phase probability coefficients (Hendrickson/Lattman)
B: BATCH number
Y: M/ISYM, packed partial/reject flag and symmetry number
I: any other integer
R: any other real

It is essential to have correct column types for phases and anomalous differences, in order to identify:

phases, which require changing if the reflection is moved to a symmetry equivalent;
anomalous differences, which require changing sign if the reflection is changed to its Friedel mate.
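
The sign change for anomalous differences can be shown with a short Python sketch. It assumes the anomalous difference is defined as F(+) minus F(-), so that swapping a reflection for its Friedel mate (-h, -k, -l) negates the value. The function to_friedel_mate and the numbers are hypothetical, and the phase change under a general symmetry operation is not covered here.

```python
def to_friedel_mate(hkl, anomalous_difference):
    """Move a reflection to its Friedel mate and flip the sign of D.

    Assumes D = F(+) - F(-), so D changes sign when the roles of
    F(+) and F(-) are exchanged.
    """
    h, k, l = hkl
    return (-h, -k, -l), -anomalous_difference


hkl, d = (1, 2, 3), 12.5  # made-up reflection and type D value
mate_hkl, mate_d = to_friedel_mate(hkl, d)
print(mate_hkl, mate_d)  # (-1, -2, -3) -12.5
```

Applying the function twice returns the original indices and value, as expected for a pair.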

In addition, two special column types are used to signal that you are preparing data for translation functions of various types. They are:

U: partial FC
V: partial PHIC
