CCP4 Interface: Experimental Phasing Module


Background information is available on: Solution Files (HA files)

Tasks in this module:

Data Preparation

Combine Datasets - CAD

Scale and Analyse Datasets - SCALEIT and FHSCAL: Scale and Analyse Datasets - Task Window Layout

Prepare Data for HA Search - Revise, Ecalc, MTZ2various

Automated Search & Phasing: SHELX C/D/E - HA Search and Phasing; Crank - automated structure solution package; autoSHARP

Heavy Atom Location

Generate Patterson Map

Excluding Large Intensity Differences

Generate Patterson Map - Task Window Layout

ACORN - ab initio Phasing

Professs - NCS from HA

RANTAN - Direct Methods

SHELXS - Heavy Atom Search

Real Space Patterson Search - RSPS

Phasing & Refinement

BP3 - multivariate refinement and phasing

Phaser for Experimental Phasing

Run MLPHARE

Data Harvesting

Maps

ACORN - ab initio Phasing

Oasis - SAD/SIR phasing

Visualisation: Mapslicer for map sections

Utilities: Convert HA solution files to other formats; Phase analysis using Phasematch

Specialist Help is available on: ScaleChoose - choosing the right scaling program for your datasets

The layout of each task window, i.e. the number of folders present, and whether these folders are open or closed by default, depends on the choices made in the Protocol folder of the task (see Introduction). Although certain folders are closed by default, there are specific reasons why you should or may want to look at them. These reasons are described in the Task Window Layout sections below.

Solution Files

Heavy Atom (.ha) Files

Heavy atom (HA) files are short files which keep a record of the proposed heavy atom sites in a structure. They are analogous to the MR files of the Molecular Replacement module. The format of the file is similar to the ATOM input line for the MLPHARE heavy atom refinement program. There is one line per atom site and the line is free format beginning with the word ATOM:

ATOM atom_name x y z occupancy anomalous_occupancy BFAC B-factor

The interface to MLPHARE can use an HA file as input and HA files are output by:

PEAKMAX program: using the OUTPUT FRAC option
RANTAN task: the files are actually generated by the PEAKMAX program which searches the maps generated from RANTAN's output phases
ACORN task: the files are actually generated by the PEAKMAX program which searches the maps generated from ACORN's output phases
Generate Patterson Map task: when the PEAKSEARCH option is on
Real Space Patterson Search (RSPS) task: when the site analysis option is used. The script for the task reads the RSPS program log file and extracts the sites information to HA files
MLPHARE task: after each refinement run the current refined coordinates are extracted from the log file to an HA file

HA files are generated with a default file name of the form project_jobid_n.ha, where n = 1, 2, 3, and so on. If you select an HA file from the menu under the View Files from Job button, it is displayed in an HA file viewer, which is similar to the MR file viewer and has some simple functionality for editing the file. Picking a line in the file puts a # character at the beginning of that line, and the line is then ignored on input to MLPHARE. A second pick removes the # character. There is a Change All button at the bottom of the viewer which adds or removes the # from all ATOM lines. There is also an Edit Columns button which presents options to set the atom name, occupancy, anomalous occupancy and B-factor for all the atoms in the file.
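
The ATOM line format and the # convention described above can be illustrated with a small reader. This is only a sketch, not the code used by CCP4i: it assumes whitespace-separated fields in the order shown above, and the names HASite and parse_ha_lines are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HASite:
    name: str
    x: float
    y: float
    z: float
    occupancy: float
    anomalous_occupancy: float
    b_factor: Optional[float]  # None if no BFAC keyword was found
    active: bool               # False if the line starts with '#'


def parse_ha_lines(lines: List[str]) -> List[HASite]:
    """Read ATOM lines; lines starting with '#' are kept but flagged inactive."""
    sites = []
    for raw in lines:
        text = raw.strip()
        active = not text.startswith("#")
        tokens = text.lstrip("#").split()
        # Assumed layout: ATOM name x y z occ anom_occ [BFAC value]
        if len(tokens) < 7 or tokens[0].upper() != "ATOM":
            continue
        x, y, z, occ, aocc = (float(t) for t in tokens[2:7])
        b_factor = None
        keywords = [t.upper() for t in tokens]
        if "BFAC" in keywords[7:]:
            idx = keywords.index("BFAC", 7)
            if idx + 1 < len(tokens):
                b_factor = float(tokens[idx + 1])
        sites.append(HASite(tokens[1], x, y, z, occ, aocc, b_factor, active))
    return sites


example = [
    "ATOM HG 0.10 0.20 0.30 0.8 0.5 BFAC 25.0",
    "#ATOM SE 0.40 0.50 0.60 1.0 0.0 BFAC 30.0",
]
for site in parse_ha_lines(example):
    print(site.name, site.active, site.b_factor)
```

Sites flagged as inactive correspond to the lines that MLPHARE would ignore on input.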

Combine Datasets - CAD

This task interfaces to the CAD program which can be used to:

Delete columns from an MTZ file
Merge data from two or more MTZ files
Reset the resolution range in the MTZ file(s)
Change the sort order or HKL limits or space group

To input more than one MTZ file, click on the Add input MTZ file button. By default all the data in the input MTZ file is put into the output file, but you can change the Input option from 'all columns' to 'selected columns' and then select the columns using the Add column button. If you want to keep the majority of the columns in the file, click on the List All Columns button and then delete the columns you do not require: select a column by clicking on one of its fields with the right mouse button, then use the Delete selected item option under the Edit list menu button. See also Extending Frames and Toggle Frames.

CAD cannot deal with more than 29 columns.

Do not include columns H, K and L in the input column selection. They are transferred to the output automatically, and listing them explicitly only upsets the program.

Two special data types are used to signal that you are preparing data for translation functions of various types. They are:

U: partial FC
V: partial PHIC

There must be only one FCpart/PHICpart pair per input file, and the pair must be the last items specified for LABIN. CAD generates equivalent reflections using only the rotational part of the primitive symmetry operators (i.e. if the spacegroup is P212121, these reflections are analysed as though the spacegroup were P222). This is allowed for in the TFFC and RSEARCH programs.

CAD - Task Window Layout

Features to look out for in the CAD Task are:

Folder title	Importance	Comment
Files Folder	Add input MTZ file	To include more than one MTZ file
Define MTZ Output	override space group, cell dimensions, sort order, hkl limits etc.	can also be done with SFTOOLS

See program documentation: CAD, SFTOOLS

Scale and Analyse Datasets - SCALEIT and FHSCAL

For the scaling of derivative to native datasets, two CCP4 programs are available: SCALEIT and FHSCAL. The tutorial on isomorphous replacement by I. Tickle describes the strengths and weaknesses of those programs. Note that there is no unique solution to the problem of scaling together two different datasets. Various problems can arise from:

Scale Datasets with Anomalous Dispersion Data

The Scale Datasets task will run SCALEIT to scale together all the DPHn (the anomalous difference for the nth wavelength).

It will optionally do a cross-comparison of the anomalous data sets - this involves rerunning SCALEIT with the input:

LABIN FP = FPHn SIGFP = SIGFPHn FPH1 = FPHm SIGFPH1 = SIGFPHm DPH1 = DPHm SIGDPH1 = SIGDPHm

for all possible pairwise combinations of wavelengths n and m. From these runs, the cross-comparison Rfactor and normal probability for the acentric data are extracted.
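
The pairwise reruns can be pictured as a loop over wavelength pairs that writes one LABIN line per pair. The sketch below is illustrative only: the column labels (FPH1, SIGFPH1, DPH1 and so on) simply follow the placeholder pattern of the LABIN line above, the function name is hypothetical, and it assumes each unordered pair (n, m) is compared once, which may not match what the task script actually does.

```python
from itertools import combinations
from typing import List


def cross_comparison_labin(n_wavelengths: int) -> List[str]:
    """Build one SCALEIT LABIN line per unordered pair of wavelengths."""
    lines = []
    for n, m in combinations(range(1, n_wavelengths + 1), 2):
        lines.append(
            f"LABIN FP=FPH{n} SIGFP=SIGFPH{n} "
            f"FPH1=FPH{m} SIGFPH1=SIGFPH{m} "
            f"DPH1=DPH{m} SIGDPH1=SIGDPH{m}"
        )
    return lines


for line in cross_comparison_labin(3):
    print(line)
```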

It is also optional to perform analysis of anomalous differences by rerunning SCALEIT with the input:

LABIN FP = FPH(+)n SIGFP = SIGFPH(+)n FPH1 = FPH(-)n SIGFPH1 = SIGFPH(-)n

From this analysis, the normal probabilities for the acentric and centric data and the Rfactor are extracted. The input MTZ file must contain the FPH(+)n and FPH(-)n. If you do not have data in this form, you should run the mtzMADmod program which converts DPHn to the appropriate form. This program is not interfaced. A better solution is to use the latest version of the TRUNCATE program which retains the FPH(+)n and FPH(-)n on output.

The results of both these analyses are tabulated in a summary file called project_jobid_scaleit.summary.

Scale Datasets - Task Window Layout

In the Protocol folder of the Scale Datasets task, you can choose:

analysis only - use SCALEIT without refinement
scale refinement using SCALEIT - use SCALEIT refinement
scale refinement using FHSCAL (Kraut's method) - use FHSCAL refinement
FHSCAL scale refinement & SCALEIT analysis - in effect, a combination of options 1 and 3
apply input scale factors - use SCALEIT with externally determined scale factors - not usually used

Features to look out for in the Scale Datasets Task are:

Protocol option	Folder title	Importance	Comment
1	Analysis	Graphs of differences between datasets	Analysis against resolution always performed.
2	Refinement Parameters	Apply Wilson scaling	Final Wilson scaling (affects scale factor only) after least-squares scaling (scale and temperature factors). See also Wilson.
3	Fhscal Scaling Parameters		Perform Kraut scaling with FHSCAL. In extreme cases, namely if the high resolution limit of the native dataset is lower than that of (one of) the derivatives, certain reflections may not get output. See also Caveat in FHSCAL program documentation.
4	Analysis	Analysis of FHSCAL results	SCALEIT ANALYSE is performed after scaling using FHSCAL (see protocol options 1 and 3).
5	Input Scaling Factors		Externally determined scales applied and analysis performed. No refinement. See also SCALE.

See program documentation: SCALEIT, FHSCAL.

Prepare Data for HA Search - Revise, Ecalc, MTZ2various

You will need to run this task for the following cases:

Input Data	Phasing Method
MAD	RANTAN or ACORN, SHELXS, RSPS, Anomalous Difference Patterson Maps
SAD	RANTAN or ACORN, SHELXS
SIR	RANTAN or ACORN, SHELXS

In the Prepare Data for HA Search task window you should only need to identify the type of your data and which phasing program you intend to run, and the interface will make the necessary conversions described below.

MAD data is rescaled by the REVISE program to give an estimate of the normalised anomalous scattering magnitude (given the column label FM by RANTAN and ACORN but sometimes referred to as FA in the literature). The input data can be in the form of F(+) and F(-) for each wavelength or be anomalous differences Dano for each wavelength. The output FM can then be used in similar fashion to a single anomalous difference (Dano) or isomorphous difference (Diso). The theory behind this is described in the REVISE program documentation.

Data conversion

Direct methods programs such as SHELXS, RANTAN and ACORN usually work with data in the form of normalised structure amplitudes rather than the structure factors which are normally used in macromolecular crystallography, so structure factor data must be converted before use in these programs. The SHELXS program has an internal procedure to do this conversion, but data intended for RANTAN and ACORN must go through the ECALC program, which calculates normalised structure amplitudes (usually given the column label E).
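
The idea of normalisation can be sketched as follows: within each resolution shell, amplitudes are divided by the root mean square amplitude of that shell, so that the mean of E squared is 1. This is a deliberately simplified illustration and not ECALC's algorithm; it ignores symmetry enhancement factors and uses equal-population shells, and all names are hypothetical.

```python
import math
from typing import List, Tuple


def normalise_amplitudes(reflections: List[Tuple[float, float]],
                         n_shells: int = 10) -> List[float]:
    """reflections: (d_spacing_in_A, F) pairs. Returns E values in input order."""
    if not reflections:
        return []
    # Order reflections from low to high resolution (increasing 1/d^2).
    order = sorted(range(len(reflections)),
                   key=lambda i: 1.0 / reflections[i][0] ** 2)
    n_shells = max(1, min(n_shells, len(order)))
    shell_size = math.ceil(len(order) / n_shells)
    e_values = [0.0] * len(reflections)
    for start in range(0, len(order), shell_size):
        shell = order[start:start + shell_size]
        mean_f2 = sum(reflections[i][1] ** 2 for i in shell) / len(shell)
        scale = math.sqrt(mean_f2) if mean_f2 > 0 else 1.0
        for i in shell:
            e_values[i] = reflections[i][1] / scale
    return e_values


data = [(4.0, 100.0), (3.9, 50.0), (2.0, 10.0), (1.9, 5.0)]
print(normalise_amplitudes(data, n_shells=2))
```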

RANTAN, ACORN and all other CCP4 programs work with experimental data in MTZ file format but SHELXS requires the data in an ASCII format described in the SHELX documentation. The Prepare Data for HA Search task will use MTZ2VARIOUS to convert an MTZ file to SHELXS format.

See program documentation: REVISE, MTZ2VARIOUS, ECALC.

SHELX C/D/E - HA Search and Phasing

The SHELX C/D/E task uses George Sheldrick's SHELX suite of programs, specifically SHELXC, SHELXD and SHELXE, which can be obtained from THE SHELX HOMEPAGE. It also borrows heavily from ideas and design in the HKL2MAP interface developed by Thomas Pape and Thomas R. Schneider.

If you use the SHELX programs in your structure determination then please be sure to acknowledge their use:

SHELXD References:

Usón & Sheldrick (1999) Curr. Opin. Struct. Biol. 9 643-648
Sheldrick, Hauptman, Weeks, Miller & Usón (2001) International Tables for Crystallography Vol. F, eds. Arnold & Rossmann, pp. 333-351
Schneider & Sheldrick (2002) Acta Cryst. D58 1772-1779

SHELXE References:

Sheldrick (2002) Z. Kristallogr. 217 644-650

Overview

The SHELX C/D/E task offers a way of running the SHELX suite of programs as an automated "pipeline". It also allows the individual programs to be run on their own. The pipeline uses SHELXC to analyse the data and prepare input for the later stages of the procedure, SHELXD to search for heavy atom sites, and SHELXE to perform phasing and density modification and optionally to distinguish between the two possible enantiomorphs.

Note that if any of the required SHELX programs are not installed on your path then the interface cannot be launched from the main CCP4i window.

SHELXC Step: Data Analysis and Preparation

SHELXC prepares the data for the heavy atom search and refinement steps by calculating the F_A values and phase shifts α from experimentally measured reflection data. The user must provide information on the type of experiment and the format in which the reflection data will be provided.

SHELXC can deal with the following types of experiments:

MAD Experiment
Data collected from at least two wavelengths, at least one of which must be either the peak or the inflection point data.
A native dataset can also be provided, however if no native data is available then SHELXC analyses the MAD datasets to generate "native" data - in which case you should switch on the option specifying that the native structure contains heavy atoms in the Phasing and Density Modification folder.
SAD Experiment
Data collected from a single wavelength is required.
SIR and SIRAS
Requires a native dataset and a dataset from the heavy atom derivative.

The SHELX C/D/E task allows the reflection data to be provided in a number of different formats:

Scalepack files with one file per dataset
Reflection files formatted as Scalepack can also be generated from the SCALA task.
A single MTZ file containing the data as intensities in the form of I(+) and I(-) pairs
In this case the interface generates Scalepack-formatted files for each dataset using the MTZ2VARIOUS program.
A single MTZ file containing the data as structure factor amplitudes in the form of F(+) and F(-) pairs
This is the least recommended form of input as the estimation of the intensities from the structure factor amplitudes involves approximations which may be detrimental to the final data.

The SHELXC step also requires the following information:

Spacegroup
Cell parameters

This will be extracted from the MTZ header if using MTZ input (and the Scalepack header if present), otherwise it must be entered manually. Note that SHELX doesn't recognise spaces in spacegroup names (which is different from the CCP4 programs), however CCP4i will attempt to convert spacegroup names to the correct format.
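
The simplest part of this conversion, removing the spaces from a CCP4-style spacegroup name, can be sketched as below. The function name is hypothetical, and the real conversion performed by CCP4i may handle further cases that this sketch does not.

```python
def shelx_spacegroup_name(ccp4_name: str) -> str:
    """Remove all whitespace from a spacegroup name, e.g. 'P 21 21 21'."""
    return "".join(ccp4_name.split())


print(shelx_spacegroup_name("P 21 21 21"))
```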

Output from SHELXC step

The SHELXC step outputs the following files:

Calculated native dataset in SHELX .hkl format
Calculated FA file in SHELX .hkl format
SHELXD input parameter file (".ins" file)

The .ins parameter file is automatically given a name which consists of the CCP4i project name and the job number (e.g. PROJECT_115_shelxc_fa.ins). These files are used as input to the SHELXD and SHELXE steps. If SHELXD is being run immediately after the SHELXC step then these files are passed on automatically to the heavy atom location step.

The SHELX C/D/E task generates a number of graphs from the output of SHELXC, which can be viewed using loggraph. Certain graphs are only generated for specific experiments:

Table	Graph	Experiment(s)	Comments
Analysis of <I/sig>, completeness ...	<I/sig> vs Resolution	All
	% Completeness vs Resolution	All
	<d'/sig> vs Resolution	SIRAS, SIR
	<d''/sig> vs Resolution	MAD, SAD, SIRAS	For SAD the high resolution cutoff can be estimated from where ΔF/σF falls below about 1.2.
Anomalous CC analysis	Anomalous CC versus Resolution	MAD	For MAD the high resolution cutoff can be estimated from finding where the correlation coefficient (CC) between the anomalous differences for wavelengths with the highest anomalous signal falls below about 30%.
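
The SAD rule of thumb in the table can be applied to tabulated values along these lines. This is a sketch with made-up numbers: the input is assumed to be a list of (high resolution limit, mean d''/sig) pairs ordered from low to high resolution, and the function name is hypothetical.

```python
from typing import List, Optional, Tuple


def estimate_sad_cutoff(shells: List[Tuple[float, float]],
                        threshold: float = 1.2) -> Optional[float]:
    """Return the limit (in A) of the last shell before the signal drops
    below the threshold, or None if even the first shell is below it."""
    cutoff = None
    for d_min, signal in shells:
        if signal < threshold:
            break
        cutoff = d_min
    return cutoff


# Illustrative values only, not real data.
table = [(8.0, 3.1), (6.0, 2.5), (4.0, 1.8), (3.5, 1.3), (3.0, 1.1), (2.5, 0.9)]
print(estimate_sad_cutoff(table))
```

The value found this way is only a starting point; the graph itself should still be inspected.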

SHELXD Step: Heavy Atom Location

The SHELXD step performs heavy atom location and takes the output from a SHELXC run as its input. If SHELXD follows on directly from SHELXC then the interface passes these on automatically, otherwise the user needs to specify them explicitly:

Calculated FA file in SHELX .hkl format
SHELXD input parameter file (".ins" file)

The interface allows the user to set a number of critical parameters which are required for the heavy atom location step:

Parameter	Comments
Number and type of heavy atoms to search for	The estimated number of sites should be within around 20% of the true number.
Number of attempts at locating the heavy atoms
Resolution cutoffs applied to the input data	The high resolution cutoff is a critical parameter for the heavy atom location, and it is recommended that the value calculated by SHELXC is used. This is written into the .ins file from SHELXC and picked up automatically by the task.
Minimum distance between heavy atoms	The most common user error is to keep the default value of 3.5Å even when the true distances between heavy atoms are less than this value.
Allow for heavy atoms lying on special positions	Normally these sites are rejected by SHELXD

A note on running the SHELXD step without SHELXC

The interface allows the user to run the SHELXD step independently of SHELXC, by taking the output from a SHELXC run from a previous job. In this case the user can alter some of the values in the .ins file, as detailed in the table above. Note however that some of the values generated by SHELXC are dependent upon the above parameters (for example the value used by the UNIT command), and that therefore differences in output may occur if the SHELXD step is run without rerunning SHELXC first to update the generated parameters too.

Output from SHELXD step

The SHELXD step outputs the following files:

PDB file containing the best heavy atom solutions
Result file (.res) which is used as input to the phasing and density modification step (SHELXE)
Listing file (.lst)

The user can provide a name for the output heavy atom PDB file. The .res and .lst files are automatically given names which consist of the CCP4i project name and the job number, following the same pattern as the .ins file from the SHELXC step (e.g. PROJECT_115_shelxc_fa.ins).

If SHELXD doesn't write a .res file (because no CC value reached the target) then the task will terminate. Failure to generate a .res file usually means:

More tries are needed,
The resolution cutoff should be changed,
The sites are all on special positions but the option has not been set to allow this, or
There is no signal.

The SHELXD step produces a number of graphs:

Table	Graph	Comments
Occupancy for each site	Occupancy for each site	There should be a sharp drop in occupancy after the last true site. If the occupancy of the last site is more than 0.2 it is worth rerunning the heavy atom location and increasing the number of sites to search for.
CC All/Weak for each try	CC All/Weak per try
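
The two checks suggested for the occupancy graph can be sketched as follows: find the largest drop between consecutive sites, and test whether the weakest site still has an occupancy above 0.2. The occupancies are made-up example values and the function names are hypothetical; the graph should still be inspected by eye.

```python
from typing import List


def count_sites_before_drop(occupancies: List[float]) -> int:
    """Number of sites preceding the largest drop in sorted occupancy."""
    occ = sorted(occupancies, reverse=True)
    if len(occ) < 2:
        return len(occ)
    best_index, best_drop = len(occ), 0.0
    for i in range(len(occ) - 1):
        drop = occ[i] - occ[i + 1]
        if drop > best_drop:
            best_drop, best_index = drop, i + 1
    return best_index


def should_search_more_sites(occupancies: List[float], limit: float = 0.2) -> bool:
    """True if the weakest site is still above the limit."""
    return bool(occupancies) and min(occupancies) > limit


occ = [1.0, 0.92, 0.85, 0.80, 0.31, 0.25, 0.22]
print(count_sites_before_drop(occ), should_search_more_sites(occ))
```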

SHELXE Step: Phasing and Density Modification

The SHELXE step performs phasing and density modification using the results of the SHELXD and SHELXC steps, and produces calculated structure factors and phases which can be used to generate an initial electron density map.

If SHELXE follows on directly from SHELXC and SHELXD then the interface passes the required files on automatically, otherwise the user needs to specify them explicitly:

Running SHELXE standalone:
- HKL file with native data from SHELXC
- HKL file with FA data from SHELXC
- Result file (.res) from a previous SHELXD run
If running SHELXE directly after SHELXD:
- HKL file with native data from SHELXC

The following input parameters are available for the SHELXE step:

Parameter	Comments
Solvent Content	This is the most critical parameter for SHELXE and it is worth varying the value in steps of 0.05 in order to maximise the difference in contrast between the two enantiomorphs (see below). The solvent content can be estimated using the Matthews task, which in turn requires an estimate of the molecular weight of the protein. This may be experimentally measured, or else can be estimated from the number of residues or the sequence of the protein.
Phase using original, inverted or both enantiomorphs	The default option is to phase both enantiomorphs by running SHELXE twice in a single run of the task. The results can be compared to determine which is the correct hand (see below).
Number of cycles of phasing and refinement	It may be necessary to use several hundred cycles if the starting phase information is weak but the resolution is very high. For low resolution data using more than 20 cycles is normally counter-productive.
Heavy atoms are present in the "native" structure	Use this option e.g. for MAD data where there is no true native data.
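
For orientation, a solvent content estimate of the kind mentioned above can be sketched with the standard Matthews relations: Vm = V / (Z * MW) and solvent fraction of approximately 1 - 1.23 / Vm. This is not the code of the Matthews task; the function names are hypothetical, and the figure of about 110 Da per residue used to guess the molecular weight is a common rule of thumb, stated here as an assumption.

```python
import math
from typing import Tuple


def cell_volume(a: float, b: float, c: float,
                alpha: float, beta: float, gamma: float) -> float:
    """Unit cell volume in cubic Angstroms; angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


def solvent_content(volume: float, n_symops: int, mol_weight: float,
                    n_mol: int = 1) -> Tuple[float, float]:
    """Return (Matthews coefficient Vm, estimated solvent fraction)."""
    vm = volume / (n_symops * n_mol * mol_weight)
    return vm, 1.0 - 1.23 / vm


# Hypothetical example: 200 residues at ~110 Da each, 4 symmetry operators.
mw = 200 * 110.0
vm, solvent = solvent_content(cell_volume(50, 60, 70, 90, 90, 90), 4, mw)
print(round(vm, 2), round(solvent, 2))
```

Trying the value obtained this way and then varying it in steps of 0.05, as recommended above, gives a reasonable range to explore.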

Output from SHELXE step

The SHELXE step generates the following output:

A listing file .lst (automatically named by the task)
Protein phases in either MTZ format, or in XtalView .phs format.

For both XtalView and MTZ format output there is one file for each requested enantiomorph. Note that for some spacegroups SHELXE will convert to the inverted symmetry when phasing the inverted enantiomorph (e.g. from P4₁ to P4₃). For MTZ output the interface takes account of this inversion automatically.

The phases from SHELXE can be used to generate maps, for example using the FFT task in CCP4i, which can then be viewed.

The SHELXE step produces a number of graphs:

Table	Graph	Comments
Contrast and Connectivity	Contrast versus Cycle	A big difference in the contrast of the two heavy atom enantiomorphs usually indicates a good SHELXE solution.
Contrast and Connectivity	Connectivity versus Cycle
Estimated CC(map)	Estimated CC(map) vs Resolution	A big difference in the correlation coefficient of the two heavy atom enantiomorphs usually indicates a good SHELXE solution.

Crank - automated structure solution package

Crank is an automated package for structure solution via experimental phasing. Currently it covers heavy atom location, heavy atom refinement and phasing, and density modification. Crank provides the automation, while existing programs do the individual calculations. Note that the CCP4 version is a cut-down version appropriate to CCP4 programs - the full Leiden version allows the use of alternative programs for the various steps.

In the protocol folder, select the procedure you wish to use, followed by the programs to use. You then need to specify an MTZ file containing the relevant datasets. Finally, there are a small number of required parameters to supply: the type and number of the heavy atoms, the number of protein and nucleotide residues in the asymmetric unit, and estimates of the B-factor and solvent content. These estimates can be calculated by the interface (click on "Calculate B and Solv. Content"), which uses the program Wilson to estimate the overall B-factor. The closed folders at the bottom of the interface allow access to the program-specific parameters for each step of the procedure.

See program documentation.

autoSHARP

SHARP/autoSHARP is software for the experimental phasing of macromolecular crystal structures. This interface enables autoSHARP jobs to be run from within CCP4i.

Note that SHARP/autoSHARP is not part of the CCP4 suite and must be acquired separately from Global Phasing.

See www.globalphasing.com.

Generate Patterson Map

The Generate Patterson Map Task performs the following:

Run SCALEIT to find an optimal cutoff for excluding reflections with suspiciously large differences
Run FFT PATTERSON in default sectioning mode to get first direction of map sections
Run MAPMASK to resection output map, to produce all necessary Harker sections
Run PEAKMAX to search maps for peaks and write these to the "Peak coord" file and to an HA file (see above)
Plot Harker sections with NPO

Optionally:

The user can give the coordinates of points to be plotted on the Patterson map
The user can give the coordinates of putative heavy atom sites and the VECTORS program is run to determine the predicted cross-vectors which are then plotted on the Patterson map

Excluding Large Intensity Differences

Erroneously large intensity differences can affect a Patterson map disproportionately because the parameter used, the intensity, is the square of the structure factor amplitude, and the square of a large number is a very large number. The effect seen in the Patterson map is ridges.

It is therefore usually a good idea to exclude the reflections with very large differences: FPH-FP from the difference Patterson and FPH(+)-FPH(-) from the anomalous difference Patterson. By default the Interface will run the SCALEIT program to analyse the data and use the value of 4.1*RMS(FPH-FP), which is a reasonable first estimate of a suitable cutoff. It may be worthwhile to try different cutoff values and look at the resultant Patterson map - the value used can be set at the top of the Exclude Reflections folder. Excluding 'good' reflections tends to degrade the map, so the cutoff should not be set so low that genuine differences are rejected. For very good data it may be unnecessary to exclude any data. The SCALEIT log file also has a table of Isomorphous and (if appropriate) Anomalous differences which shows the number of reflections with given differences as a function of resolution shell.
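
The default cutoff can be illustrated numerically. The sketch below computes 4.1*RMS(FPH-FP) over a list of (FPH, FP) pairs and drops the reflections whose absolute difference exceeds it. It assumes a single overall RMS and is not how SCALEIT itself is implemented; the function names and data are hypothetical.

```python
import math
from typing import List, Tuple


def rms(values: List[float]) -> float:
    return math.sqrt(sum(v * v for v in values) / len(values))


def exclude_large_differences(pairs: List[Tuple[float, float]],
                              factor: float = 4.1):
    """Return (cutoff, kept_pairs), keeping pairs with |FPH - FP| <= cutoff."""
    diffs = [fph - fp for fph, fp in pairs]
    cutoff = factor * rms(diffs)
    kept = [p for p, d in zip(pairs, diffs) if abs(d) <= cutoff]
    return cutoff, kept


# Twenty ordinary differences of +/-1 and one suspicious difference of 50.
pairs = [(100.0 + (1 if i % 2 else -1), 100.0) for i in range(20)]
pairs.append((150.0, 100.0))
cutoff, kept = exclude_large_differences(pairs)
print(round(cutoff, 1), len(kept))
```

Changing the factor argument mimics trying different cutoff values in the Exclude Reflections folder.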

Generate Patterson Map - Task Window Layout

Features to look out for in the Generate Patterson Map Task are:

Protocol option	Folder title	Importance	Comment
difference Patterson	Exclude Reflections	Exclude reflections with erroneously large (intensity) differences between F1 and F2 (i.e. FPH and FP)	see Excluding Large Intensity Differences
anomalous difference Patterson	Exclude Reflections	Exclude reflections with erroneously large (intensity) differences between F1 and F2 (i.e. FPH+ and FPH-)	see Excluding Large Intensity Differences

See program documentation: SCALEIT, FFT, MAPMASK, PEAKMAX, NPO, VECTORS, HAVECS.

ACORN - ab initio Phasing at Atomic Resolution

ACORN is an ab initio procedure to solve a protein structure when atomic resolution data is available. In case of a structure containing heavy atoms, its procedures can be used for determination of anomalous scatterers from anomalous data where the resolution can be as low as 3Å to 4Å.

MAD data for ACORN must be preprocessed by the REVISE program (see above), which generates estimates of FM, the normalised anomalous scattering magnitude. The input to REVISE is the FP and FPH(+)n and FPH(-)n for dataset n. These data should have been scaled by the SCALEIT program. REVISE also needs to know the wavelength, f' and f'' for each wavelength.

Acorn - Task Window Layout

Features to look out for in the Acorn Task are:

Protocol option	Folder title	Importance	Comment
search and phase with starting coordinates	ACORN-MR Parameters	Choose between a limited search with a POSItioned fragment, or a full ROTation Function and TRANslation function search
determine small molecule structure	General Acorn Parameters	Choose appropriate grid sampling	Grid sampling defaults to 1/3 of the high resolution limit which, in case of small molecule structures, is commonly around 1Å
search for heavy atom(s) at lower resolution		Separate window opens to 'Prepare Data for Experimental Phasing Programs'
search for heavy atom(s) at lower resolution	Selecting Data	Choose appropriate resolution limits

See program documentation: ACORN, ECALC, REVISE, SCALEIT.

Professs - NCS from HA

PROFESSS is a tool to help in the identification of NCS related atoms from a list of heavy atom positions. At the moment, PROFESSS only works with 'traditional' PDB files. HA files as produced by ACORN or RANTAN (for instance) cannot be fed into PROFESSS - the HA file needs to be converted through the Convert Coordinate Formats task in the Coordinate Utilities module.

Professs - reading the output

The program first lists the triangles of atoms which it has found, then it analyses each pair of triangles as a possible NCS match. For each possible operator, a list of all matching atoms is given. For each pair of atoms, a 'loop factor' is listed. If the NCS operator is an N-fold rotation, the atom will be part of a 'loop' of N atoms (unless one is missing). This, along with an appropriate 3rd polar angle, can confirm the existence of a proper NCS operator.

Atoms are described by the atom serial number from the input PDB, along with 4 numbers listed in square brackets. The first of these is the number of the crystallographic symmetry operator, and the other three are the unit cell translations applied after the symmetry operator.

Professs - beware

When calculating the distance between a pair of atoms, all symmetry equivalents are considered, but only the cell repeat giving the least distance is considered. In a very few cases of low order crystallographic symmetry this may cause atoms to be missed.

RANTAN - Direct Methods

The RANTAN Direct Methods program can be applied to solving MAD data or isomorphous replacement data. The Interface will set the key input parameters appropriately for the type of data.

For isomorphous data, RANTAN works optimally with the input in the form of normalised amplitudes rather than structure factors so the Interface will usually run the ECALC program to convert SFs to normalised amplitudes. The Interface will alternatively allow input of either precalculated normalised amplitudes or normalised amplitudes and initial phases.

MAD data for RANTAN will be preprocessed by the REVISE program (see above), which generates estimates of FM, the normalised anomalous scattering magnitude. The input to REVISE is the FP and FPH(+)n and FPH(-)n for dataset n. These data should have been scaled by the SCALEIT program. REVISE also needs to know the wavelength, f' and f'' for each wavelength.

See program documentation: RANTAN, ECALC, REVISE, SCALEIT.

SHELXS - Heavy Atom Search

The SHELX program can be obtained from THE SHELX HOMEPAGE. The CCP4i interface is for SHELXS-97. To ensure that CCP4i scripts can find the SHELX program, the full path name of the program needs to be entered in the Configure Interface window which is accessed from a button in the System Administration menu on the right hand side of the Main Window.

For more information on the SHELX programs, see THE SHELX HOMEPAGE, which has references to various FAQs, including Frequently asked questions (macromolecules) and Thomas Schneider's FAQs.

Real Space Patterson Search - RSPS

RSPS is a grid search program that provides search options (to solve heavy atom derivatives) as well as interactive options for examining potential solutions (as a fit of potential sites to the difference Patterson map). All options operate in real and vector space. Searches can be performed to locate either heavy atom positions, or, under certain conditions, to locate the position of molecules with internal (NCS) symmetry. The goal of RSPS is not to generate a complete solution to the heavy atom difference Patterson, but rather to find enough sites to allow initial phases to be calculated for difference Fourier analysis.

Searches are carried out by assigning trial positions on a grid covering the asymmetric unit of the crystal, and then computing a score for each trial position, based on the Patterson densities at the positions corresponding to the predicted vectors for each position. From the symmetry operators (crystallographic and/or non-crystallographic) all unique transformations that map a point in real (crystal) space to a point in vector (Patterson) space are generated. In other words, these transformations map a point in real space to the Patterson vectors associated with that point.
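
The mapping from a trial position to its predicted Patterson vectors can be sketched as follows. Each symmetry operator is written as a rotation matrix plus a translation in fractional coordinates, and the vectors are the differences between all pairs of symmetry images. The averaging used as a score and the function names are assumptions made for illustration only; they are not taken from RSPS.

```python
from typing import Callable, List, Sequence, Tuple

Vec = Tuple[float, float, float]
SymOp = Tuple[Sequence[Sequence[float]], Sequence[float]]


def apply_symop(op: SymOp, xyz: Vec) -> Vec:
    rot, trans = op
    return tuple(sum(rot[i][j] * xyz[j] for j in range(3)) + trans[i]
                 for i in range(3))


def self_vectors(xyz: Vec, symops: List[SymOp]) -> List[Vec]:
    """Vectors between all pairs of symmetry images, wrapped into [0, 1)."""
    images = [apply_symop(op, xyz) for op in symops]
    vectors = []
    for i, a in enumerate(images):
        for j, b in enumerate(images):
            if i != j:
                vectors.append(tuple((a[k] - b[k]) % 1.0 for k in range(3)))
    return vectors


def score_position(xyz: Vec, symops: List[SymOp],
                   patterson: Callable[[Vec], float]) -> float:
    """Mean Patterson density at the predicted vectors (assumed scoring)."""
    vecs = self_vectors(xyz, symops)
    if not vecs:
        return 0.0
    return sum(patterson(v) for v in vecs) / len(vecs)


# Spacegroup P21: (x, y, z) and (-x, y+1/2, -z).
P21 = [
    ([[1, 0, 0], [0, 1, 0], [0, 0, 1]], [0.0, 0.0, 0.0]),
    ([[-1, 0, 0], [0, 1, 0], [0, 0, -1]], [0.0, 0.5, 0.0]),
]
print(self_vectors((0.1, 0.2, 0.3), P21))
```

For P21 the vectors come out as (2x, 1/2, 2z) and its inverse, i.e. they lie on the Harker section v = 1/2. A grid search would call score_position for every trial point in the asymmetric unit.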

BP3 - multivariate refinement and phasing

This task is a simple interface to the BP3 program for heavy atom refinement and phasing, using multivariate likelihood techniques. It requires datasets for native and derivatives supplied in the file HKLIN, and initial coordinates for the heavy atoms. Use the Add Crystal button to add details of all the crystals you have. For each crystal, you can add heavy atom coordinates, and also datasets recorded for the crystal.

If a heavy atom site occurs in more than one crystal, then select "Same site in more than one crystal", and Add Site.

Rather than inputting all the details manually, parameters can be entered as an XML file, e.g. as obtained from the output of the heavy atom location program Crunch2.

See program documentation.

Phaser - Experimental Phasing

Phaser is a program for phasing macromolecular crystal structures with maximum likelihood methods. This interface gives access to Phaser's functions for experimental phasing by single-wavelength anomalous diffraction (SAD), which optionally can exploit information from a partial (molecular replacement) model.

See program documentation.

Run MLPHARE

MLPHARE can refine heavy atom parameters against either isomorphous or anomalous data. Check the 'Use anomalous difference data' box at the top of the MLPHARE interface if appropriate. The initial default interface only provides for describing one derivative or wavelength; click on the Add Another Derivative button under the 'MTZ in' section to open space for additional data.

The minimal input then required is some initial heavy atom definitions in the folder Describe Derivatives & Refinement. For each derivative enter a name, and the name of the HA file containing the data for that derivative. Alternatively, enter the atoms explicitly by changing the Use data 'from file' menu option to Use data 'entered below' and then typing in the information. The Cut and Paste tool may be useful. For anomalous data you will need to enter the same HA file for each wavelength.

It is possible to edit the HA files 'on line' by clicking the View button on the file selection line. The HA file viewer has some simple editing tools but more complex changes may need to be done in an editor.

The output MTZ file contains columns PHIB_mlphare1, FOM_mlphare1, etc. If you use this file as input to another MLPHARE run, set a new, unique column name extension, for instance by changing the parameter 'Output label identifier' from mlphare1 to mlphare2. Each run of MLPHARE within the Interface also outputs one HA file for each derivative. These HA files can be used as input to the next MLPHARE run.
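When several MLPHARE runs are chained, it can help to derive the next identifier mechanically so that column names never collide. The helper below is a minimal sketch of that bookkeeping; the function names are hypothetical, and only the PHIB and FOM prefixes named above are assumed.

```python
import re

def next_label_identifier(identifier):
    """Return the identifier with its trailing run number increased by one."""
    match = re.fullmatch(r"(.*?)(\d+)", identifier)
    if match is None:
        # Assumption: an identifier without a number counts as run 1.
        return identifier + "2"
    stem, number = match.groups()
    return f"{stem}{int(number) + 1}"

def output_columns(identifier, prefixes=("PHIB", "FOM")):
    """Column labels expected for a given 'Output label identifier'."""
    return [f"{prefix}_{identifier}" for prefix in prefixes]
```

For example, next_label_identifier("mlphare1") gives "mlphare2", which is the value to enter as the 'Output label identifier' for the second run.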

The SCALEIT documentation states: "MLPHARE has a built in weighting scheme which means that it doesn't do much harm to include less good data in phasing. After all the poor hkl should get low FOMs, and then DM can use the few reflections with reasonable phases to help in the phase extension procedure."

The MLPHARE program documentation has several helpful hints, e.g.: "NB: If an occupancy becomes near to 0.0 the coordinate shifts will possibly be meaningless", and a whole section of Notes on usage.

Suggested input numbers for Estimated Lack of Closure:

The program documentation suggests no input at all for the very first run.
The Interface has default 0.0 for all the numbers, even in the very first run.
Some people 'always' use a certain number (10% of F?!) in the very first run.

Data Harvesting

MLPHARE is one of the Data Harvesting programs. See Data Harvesting in CCP4i for implications for the Interface.

Maps

The MLPHARE interface has the option to output double difference maps which can be used to search for further heavy atoms. In this case the PEAKMAX program will also be run to list the peaks to a PDB file and to an HA file with the name project_jobid_label_peaks.ha where label is the MTZ column label of the derivative FPH. If you wish to do any other analysis on the map, it can be input to the 'Generate Patterson Map' task when the 'Run FFT ...' option at the top of the task window has been toggled off.

It is easiest to create maps by running the FFT task inside the Run MLPHARE task. Do this by toggling on the option to 'Generate double difference maps files ...'.

In some cases it may be necessary to (re)create maps independently from the MLPHARE task. It is not possible to do this through the Create Task-Specific Maps task in the Map & Mask Utilities module. It can be done through the Run FFT - Create Map task in the same module, but you should attempt this only if you know exactly what you are doing.

See program documentation: MLPHARE, PEAKMAX, FFT.

ACORN - ab initio Phasing at Atomic Resolution

See the documentation on using the ACORN interface elsewhere in this document.

Oasis - SAD/SIR phasing

OASIS is a computer program for breaking phase ambiguity in One-wavelength Anomalous Scattering or Single Isomorphous Replacement (Substitution) protein data. The phase problem is reduced to a sign problem once the anomalous-scatterer or the replacing-heavy-atom sites are located. OASIS applies a direct method procedure to break the phase ambiguity intrinsic to OAS or SIR data.
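The origin of the two-fold ambiguity can be illustrated for the SIR case with the textbook triangle relation |F_PH|^2 = |F_P|^2 + |F_H|^2 + 2 |F_P| |F_H| cos(phi_P - phi_H). Once the heavy atom contribution F_H is known, the native phase is fixed up to the sign of its offset from phi_H. The sketch below only shows where that choice of sign comes from; it does not reproduce the direct-methods procedure OASIS uses to make the choice, and the function name and example values are assumptions.

```python
import cmath
import math

def sir_phase_candidates(fp_amp, fph_amp, fh_amp, fh_phase_deg):
    """Return the two native phases (degrees) allowed by one SIR measurement."""
    cos_delta = (fph_amp ** 2 - fp_amp ** 2 - fh_amp ** 2) / (2.0 * fp_amp * fh_amp)
    # Clamp to [-1, 1] to guard against rounding or measurement error.
    cos_delta = max(-1.0, min(1.0, cos_delta))
    delta = math.degrees(math.acos(cos_delta))
    # The sign of delta is the unknown that remains.
    return ((fh_phase_deg + delta) % 360.0, (fh_phase_deg - delta) % 360.0)

# Synthetic check: build F_PH from a known native phase of 40 degrees.
f_p = cmath.rect(100.0, math.radians(40.0))
f_h = cmath.rect(20.0, math.radians(110.0))
candidates = sir_phase_candidates(abs(f_p), abs(f_p + f_h), abs(f_h), 110.0)
```

Here the two candidates come out near 180 and 40 degrees: one is the true native phase and the other is the false alternative.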

Run Mapslicer - 2D plots of map sections

Mapslicer is an interactive viewer which displays 2D contoured sections through CCP4 map files, most usefully for seeing peaks in Patterson maps.

See program documentation.

Convert HA solution files to other formats

The program COORDCONV converts coordinate files from various formats, including HA files, into other formats such as PDB.

See program documentation: COORDCONV

Phase analysis using Phasematch

The Clipper Phasematch program can compare two sets of phases and provide appropriate analyses.
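As a rough idea of what comparing two phase sets involves, the sketch below computes a (optionally weighted) mean absolute phase difference, taking care of the 360 degree wrap-around. It is not the Phasematch algorithm, which may perform further analyses; the function names and the weighting scheme are assumptions.

```python
def phase_difference(phi_a, phi_b):
    """Smallest absolute difference between two phases, in degrees (0 to 180)."""
    d = (phi_a - phi_b) % 360.0
    return min(d, 360.0 - d)

def mean_phase_error(phases_a, phases_b, weights=None):
    """Weighted mean absolute phase difference over matching reflections."""
    if weights is None:
        weights = [1.0] * len(phases_a)
    total = sum(w * phase_difference(a, b)
                for a, b, w in zip(phases_a, phases_b, weights))
    return total / sum(weights)
```

Passing figures of merit as the weights would down-weight poorly determined reflections; with no weights every reflection counts equally.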

See also MIRTutorial(Bath) (the HTML equivalent of $CDOC/Iso_repl_itickle_tut.bath.ps),
Isomorphous Replacement (Birkbeck),
LLNL - Bernhard Rupp's Crystallographic Web Applets (containing an applet which calculates expected anomalous dispersion ratios),
Chooch (a program for calculating Anomalous Scattering Factors from X-ray fluorescence data).
