CCP4i: Graphical User Interface
Documentation for Programmers

CCP4i and MTZ Datasets: Programmer's Cookbook

Introduction

The original design of CCP4i predated the introduction of the MTZ "crystal", "dataset" and "project" header information, and as a result support for these data within CCP4i has often been rather patchy.

Over time, some efforts have been made to allow developers who are responsible for existing and new task interfaces within CCP4i to exploit this data. However, the ways of doing this are not always obvious, so this document outlines how to make use of the data within CCP4i.

The format follows that of the "cookbook"-type manuals, which offer relatively generic "recipes" that aim to solve a particular problem that the programmer might encounter. For each recipe, the problem and solution are stated, followed by a discussion of the solution.

This document assumes that you already have some familiarity with the concepts of crystals, datasets and projects within MTZ files. If not, or if you need to brush up on the details, then please refer to the MTZ file format documentation.

The Recipes

Contents

  1. How to find the core CCP4i commands for accessing MTZ data
  2. How to obtain lists of the crystals, datasets and columns in an MTZ file
  3. How to obtain the type of an MTZ column
  4. How to find out which dataset an MTZ column belongs to
  5. How to access general MTZ header information
  6. How to access dataset-specific information
  7. How to get the resolution limits for a particular MTZ column
  8. How to update dataset-specific cell parameters within a task interface

1. How to find the core CCP4i commands for accessing MTZ data

Problem

You want to find information about which core CCP4i commands are available for accessing MTZ data, and how to use them.

Solution

See the CCP4i core documentation.

Discussion

The core documentation is generated from doc-comments in the source code; the source code for the commands is located in the $CCP4/ccp4i/src/CCP4_utils.tcl file.


2. How to obtain lists of the crystals, datasets and columns in an MTZ file

Problem

You need to obtain lists of the crystals, datasets and columns in an MTZ file.

Solution

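For each of these there is a specific command, including GetMtzXtals (crystals in the file), GetMtzDatasetsInXtal (datasets in a crystal), GetMtzColsInDataset (columns in a dataset) and GetMtzAllCols (all columns in the file).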

Discussion

The following code fragment gives an example of traversing the hierarchy of an MTZ file using some of these commands:

proc descend_example { mtzfile } {
    foreach xtal [GetMtzXtals $mtzfile] {
       # Loop over crystals
       puts "Crystal: $xtal"
       foreach dataset [GetMtzDatasetsInXtal $mtzfile $xtal] {
          # Loop over datasets 
          puts "\tDataset: $dataset"
          foreach col [GetMtzColsInDataset $mtzfile $xtal $dataset] {
             # Loop over columns
             puts "\t\t$col"
          }
       }
    }
    return
}

GetMtzAllCols provides a simple way to acquire a list of all the columns in the file if you are not initially interested in the crystal and dataset associations; these can always be acquired afterwards using recipe 4: How to find out which dataset an MTZ column belongs to.

More sophisticated alternatives to GetMtzAllCols include GetMtzColumnList, GetMtzColumnByType and GetMtzGroupByType, each of which allows selection criteria to be applied to the columns before they are returned.
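Returning to GetMtzAllCols, the following is a minimal sketch of listing every column and then looking up its parent crystal and dataset afterwards. It assumes that GetMtzAllCols takes the MTZ file name as its only argument and returns a Tcl list of column labels, which is not confirmed by this document; columns_with_parents is a hypothetical name. The fragment is intended to be sourced within CCP4i, where the MTZ commands are defined.

```tcl
proc columns_with_parents { mtzfile } {
    set result {}
    # Assumption: GetMtzAllCols takes the file name and returns a list of labels
    foreach col [GetMtzAllCols $mtzfile] {
        # Look up the parent crystal and dataset afterwards (recipe 4)
        if { [GetMtzDatasetFromLabel $mtzfile $col xtal dataset] } {
            lappend result [list $col $xtal $dataset]
        } else {
            # No dataset association was found for this column
            lappend result [list $col {} {}]
        }
    }
    return $result
}
```

Each element of the returned list holds a column label followed by its crystal and dataset names, which are left empty when no association is found.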


3. How to obtain the type of an MTZ column

Problem

You need to find out what type an MTZ column is.

Solution

Use the GetMtzColType command to get the type of an MTZ column:

GetMtzColType file label

file is the name of the MTZ file and label is the MTZ column label of interest.

Discussion

MTZ column types are typically single character codes that indicate that the data is of a particular type. For example, type F indicates a structure factor amplitude, while Q indicates a sigma.

The full list of types can be found in the MTZ file format documentation.

The typing information can be used to group sets of columns in a sensible way. The GroupMtzCols command will reformat an existing list of columns so that pairs of related columns (e.g. a structure factor amplitude and its sigma) or groups of four related columns (e.g. F(+) and F(-) and their sigmas) are grouped together in sublists.
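As a simple illustration of using the typing information, the sketch below collects the columns of a given type within one dataset. It assumes that GetMtzColType returns the type code as the result of the command, which this document does not state explicitly; columns_of_type is a hypothetical name.

```tcl
proc columns_of_type { mtzfile xtal dataset type } {
    set matches {}
    foreach col [GetMtzColsInDataset $mtzfile $xtal $dataset] {
        # Assumption: GetMtzColType returns the type code as its result
        if { [string equal [GetMtzColType $mtzfile $col] $type] } {
            lappend matches $col
        }
    }
    return $matches
}
```

For example, calling columns_of_type with type F would return the structure factor amplitude columns in the dataset.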


4. How to find out which dataset an MTZ column belongs to

Problem

You need to find out which dataset an MTZ column belongs to.

Solution

Use the GetMtzDatasetFromLabel command to get the parent dataset for an MTZ column label:

GetMtzDatasetFromLabel file label xtalVar datasetVar

file is the name of the MTZ file and label is the column label of interest. xtalVar and datasetVar are the names of variables in which the data is returned.

Discussion

Datasets are uniquely specified by a crystal name-dataset name pair, which is returned by this command. For example:

proc parent_dataset_example { mtzfile label } {
   # Get the names of the parent dataset-crystal pair
   if { [GetMtzDatasetFromLabel $mtzfile $label xtal dataset] } {
     puts "Label $label in file $mtzfile"
     puts "Crystal: $xtal Dataset: $dataset"
   }
   return
}

An example of using GetMtzDatasetFromLabel is also shown in recipe 8: How to update dataset-specific cell parameters within a task interface.


5. How to access general MTZ header information

Problem

You need to access the general data from an MTZ file header within CCP4i, for example the spacegroup or number of batches.

Solution

Use the GetMtzParam command with the appropriate keyword:

GetMtzParam file param dataVar

file is the full path and filename of the source MTZ file; param is the name of the keyword that is associated with the data that you want; dataVar is the name of the variable that the data will be returned in.

Discussion

The following keywords (specified via the param argument) are available to fetch the data:

   NBATCHES           : the number of batches in the file
   NDATASETS          : the number of datasets in the file
   NCOL               : the number of columns in the file

   CELL               : the cell parameters
   CELL_1             : global cell length "a"
   CELL_2             : global cell length "b"
   CELL_3             : global cell length "c"
   CELL_4             : global cell angle "alpha"
   CELL_5             : global cell angle "beta"
   CELL_6             : global cell angle "gamma"

   RESOLUTION_RANGE   : the resolution limits across the whole file
   RESOLUTION_MIN     : the minimum resolution across the whole file
   RESOLUTION_MAX     : the maximum resolution across the whole file

   SPACE_GROUP_NAME   : the name of the spacegroup
   SPACE_GROUP_NUMBER : the corresponding spacegroup number
   LATTICE            : the lattice type

   SCALES_COLUMN      : whether the file contains a "SCALES" column

An example usage might be to get the spacegroup number from the file header:

proc spacegroup_example { mtzfile } {
   if { [GetMtzParam $mtzfile SPACE_GROUP_NUMBER spgnum] } {
      return $spgnum
   } else {
      return 0
   }
}

GetMtzParam is best suited for returning "global" properties of the file; for dataset-specific properties, see recipe 6: How to access dataset-specific information.
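Several keywords can be fetched in one pass. The following is a minimal sketch that gathers a few of the global values into a flat keyword/value list; mtz_header_summary is a hypothetical name, and the choice of keywords is only an example.

```tcl
proc mtz_header_summary { mtzfile } {
    set summary {}
    foreach param { SPACE_GROUP_NAME SPACE_GROUP_NUMBER NCOL NDATASETS } {
        # GetMtzParam stores the value in the named variable; as in the
        # example above, its result is tested to check that the lookup worked
        if { [GetMtzParam $mtzfile $param value] } {
            lappend summary $param $value
        }
    }
    return $summary
}
```

The result can be loaded into a Tcl array with "array set", and any keyword that could not be fetched is simply absent from it.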


6. How to access dataset-specific information

Problem

You need to access the dataset-specific cell parameters or wavelength within a CCP4i task interface.

Solution

Use the GetMtzParamFromDataset command to get the value of a specific parameter associated with a dataset:

GetMtzParamFromDataset file xtal dataset param dataVar

file is the full path and filename of the source MTZ file; xtal and dataset are the names of the crystal and dataset of interest; param is the name of the keyword that is associated with the data that you want; dataVar is the name of the variable that the data will be returned in.

Discussion

GetMtzParamFromDataset is a more specific version of the GetMtzParam command. The following keywords (specified via the param argument) are available to fetch the data:

   DCELL_1: the cell length 'a' for the dataset.
   DCELL_2: the cell length 'b'.
   DCELL_3: the cell length 'c'.
   DCELL_4: the cell angle 'alpha'.
   DCELL_5: the cell angle 'beta'.
   DCELL_6: the cell angle 'gamma'.
   DWAVES : the wavelength associated with the dataset.

An example of using GetMtzParamFromDataset is shown in recipe 8: How to update dataset-specific cell parameters within a task interface. That recipe also shows how GetMtzParamFromDataset can be combined with GetMtzDatasetFromLabel (see recipe 4: How to find out which dataset an MTZ column belongs to) to get data associated with a particular column.

One thing to note is that although in the MTZ libraries the cell is a property of a crystal, in CCP4i it is a property of the crystal-dataset pairing. This is an implementation detail and not a conceptual difference - all datasets associated with the same crystal will have the same cell within CCP4i.
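GetMtzParamFromDataset can also be combined with the traversal commands from recipe 2. The sketch below collects the wavelength of every dataset in a file; dataset_wavelengths is a hypothetical name, and datasets for which the lookup fails are skipped, which is a design choice made for this example.

```tcl
proc dataset_wavelengths { mtzfile } {
    set result {}
    foreach xtal [GetMtzXtals $mtzfile] {
        foreach dataset [GetMtzDatasetsInXtal $mtzfile $xtal] {
            # Skip datasets for which no wavelength can be retrieved
            if { [GetMtzParamFromDataset $mtzfile $xtal $dataset \
                      DWAVES dwave] } {
                lappend result [list $xtal $dataset $dwave]
            }
        }
    }
    return $result
}
```

Each element of the returned list holds the crystal name, the dataset name and the wavelength.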


7. How to get the resolution limits for a particular MTZ column

Problem

You want to get the resolution limits (high and low resolution) for a specific column.

Solution

Use the GetMtzColumnResolution command:

GetMtzColumnResolution file label maxresoVar minresoVar

file is the name of the MTZ file, label is the column label, and maxresoVar and minresoVar are the names of the variables in which the maximum and minimum resolution limits, respectively, will be returned.

Discussion

The RESOLUTION_MIN and RESOLUTION_MAX keywords in GetMtzParam return the "global" resolution limits (see recipe 5: How to access general MTZ header information). However, the resolution limits associated with individual columns in the file can be significantly different.

For most other properties, the values for a particular column can be accessed by first acquiring the associated dataset for the column (recipe 4: How to find out which dataset an MTZ column belongs to) and then looking up the value using the GetMtzParamFromDataset command (recipe 6: How to access dataset-specific information).
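A minimal sketch of calling GetMtzColumnResolution is given below. It assumes that, like the other commands shown in this document, the command returns a true value on success; this is not confirmed here. column_resolution_example is a hypothetical name.

```tcl
proc column_resolution_example { mtzfile label } {
    # Assumption: like the other Get commands, the result is true on success
    if { [GetMtzColumnResolution $mtzfile $label maxreso minreso] } {
        return [list $maxreso $minreso]
    }
    # The limits could not be determined for this column
    return {}
}
```

The caller can test for an empty list to decide whether to fall back on the global limits from GetMtzParam.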


8. How to update dataset-specific cell parameters within a task interface

Problem

You need to update the dataset-specific cell parameters within a CCP4i task interface whenever a user selects a new MTZ file or MTZ column label.

Solution

This recipe involves threading together a number of different pieces:

  1. In the taskname_task_window procedure, use the -command argument of the CreateLabinLine command to invoke a callback function whenever the user selects a label. For example:

       CreateLabinLine line \
          "F and sigF" HKLIN "F" F [list F] -sigma "sigF" SIGF {} \
          -command "set_xtal_cell $arrayname"
    
  2. In the same procedure, make sure that the callback function is also invoked by the -command option of the CreateInputFileLine command. For example:

        CreateInputFileLine line \
             "Enter name of input reflection data file" \
             "Native" HKLIN DIR_HKLIN \
             -command "set_xtal_cell $arrayname"
    
  3. Define the callback function (in this case called set_xtal_cell) to collect the correct cell and perform any task-specific operations. The GetMtzDatasetFromLabel command is required to fetch the dataset that the MTZ column belongs to, and the GetMtzParamFromDataset command to acquire a particular piece of dataset-specific information.

    For example:

       proc set_xtal_cell { arrayname } {
          # Get cell information based on HKLIN and F label
          upvar \#0 $arrayname array
    
          # Acquire actual values of task parameters
          set filen [GetFullFileName0 $arrayname HKLIN]
          set label [GetValue $arrayname "F"]
    
          # Extract xtal and dataset names, given filename and a column name
          if { ![GetMtzDatasetFromLabel $filen $label xtal dataset] } {
              return 0
          }
    
          # Get crystal cell for the specific crystal/dataset
          for { set i 1 } { $i <= 6 } { incr i } {
              if { ![GetMtzParamFromDataset $filen $xtal $dataset \
                     DCELL_$i dcell($i)] } {
                return 0
              }
          }
    
          # The required cell parameters are now in the local array "dcell"
          # so you can do what you want with them below here
          ...
          return
       }
    

Discussion

This recipe brings together solutions from the earlier recipes in order to look up the dataset for a selected column and then use that to acquire the appropriate cell information. The solution only provides the framework for acquiring the cell parameters; the remaining work is deciding what to do with the cell once you have it, which means writing the specifics of your set_xtal_cell equivalent.

This recipe could equally be applied to acquiring the appropriate wavelength associated with an MTZ column, by using the following code to set the value of the local variable dwave:

          if { ![GetMtzParamFromDataset $filen $xtal $dataset \
                 DWAVES dwave] } {
            return 0
          }

Note that in some older or poorly assembled MTZ files the dataset information may be missing. Make sure that your code can deal with this situation, for example by reverting to the global cell if the dataset association isn't found:

      # Extract xtal and dataset names, given filename and a column name
      if { ![GetMtzDatasetFromLabel $filen $label xtal dataset] } {
          # Use the global cell instead
          for { set i 1 } { $i <= 6 } { incr i } {
             if { ![GetMtzParam $filen CELL_$i dcell($i)] } {
                return 0
             }
          }
      }
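The two branches can be combined into a single helper. The following is a minimal sketch, not part of CCP4i itself: get_cell_for_label is a hypothetical name, and it returns the six cell parameters as a list, or an empty list if they cannot be read.

```tcl
proc get_cell_for_label { mtzfile label } {
    set cell {}
    if { [GetMtzDatasetFromLabel $mtzfile $label xtal dataset] } {
        # Dataset found: use the dataset-specific cell
        for { set i 1 } { $i <= 6 } { incr i } {
            if { ![GetMtzParamFromDataset $mtzfile $xtal $dataset \
                       DCELL_$i value] } {
                return {}
            }
            lappend cell $value
        }
    } else {
        # No dataset association: fall back to the global cell
        for { set i 1 } { $i <= 6 } { incr i } {
            if { ![GetMtzParam $mtzfile CELL_$i value] } {
                return {}
            }
            lappend cell $value
        }
    }
    return $cell
}
```

A callback such as set_xtal_cell could call this helper and return early when the result is empty.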