CCP4 Tutorial: Session 3 - Heavy Atom Search and Phasing by MIR

See also the accompanying document giving background information.

In the following instructions, when you need to type something, or click on something, it will be shown in red. Output from the programs or text from the interface is given in green.

Outline of the method

Scaling and analysing datasets
Finding first heavy atom sites by Patterson Search
Find heavy atoms by direct methods or automated Patterson Search
Heavy atom refinement for first derivative
Find heavy atoms for other derivatives using difference Fouriers phased from first derivative
Final Refinement and Phasing
Correcting the hand
Improving the density

The Data Files

Files in directory DATA:

rnase25.mtz	An MTZ file containing experimental data (including anomalous), extending to 2.5Å, for the native protein and three derivatives (mercury, platinum and iodine), as used for experimental phasing by MIR and experimental phasing by MAD
rnase25_scaleit1.mtz	Reflection file as output by SCALEIT

Files in directory RESULTS:

scaleit-rnase.log	.log of scaling RNase native and derivative data
shelx-rnase.log	.log of heavy atom search with SHELX (through CCP4i)

3a) Scaling and analysing datasets

The Problem

There is a file $CEXAM/tutorial/data/rnase25.mtz which contains native data, plus three derivatives; Hg, Pt and I with their anomalous signals. First, we scale each derivative to the native dataset, so that all data is on the same scale. At the same time, we analyse the heavy atom data to estimate the strength of the signals.

Exercise

Select the Experimental Phasing module, and open the Scale and Analyse Datasets task window.
On the first line, enter a suitable job title such as:

Job title Scaling RNase SA Derivatives (mir tutorial step 1).
On the second line, select

Do scale refinement using Scaleit.

On the next line, select

Include anomalous difference data for each derivative

On the next line, select

Perform cross-comparison of derivative data sets

and de-select

and analyse anomalous differences

using the radiobuttons.
Select the input MTZ file

MTZ in DATA rnase25.mtz

Now select the columns from the MTZ file. The first line has the native FNAT and SIGFNAT. Then select columns for the 3 derivatives, using the button Add Derivative Data to add more columns. You should end up with:

FP	FNAT	SigmaFP	SIGFNAT
FPH1	FHG2	SigFPH1	SIGFHG2
DPH1	DANOHG2	SigDPH1	SIGDANOHG2
FPH2	FPTNCD25	SigFPH2	SIGFPTNCD25
DPH2	DANOPTNCD25	SigDPH2	SIGDANOPTNCD25
FPH3	FIOD25	SigFPH3	SIGFIOD25
DPH3	DANOIOD25	SigDPH3	SIGDANOIOD25

Check that the output MTZ file is given as

MTZ out TEST rnase25_scaleit1.mtz

You should not need to change anything else. Select Run -> Run Now.
When the job has finished, return to the main window, highlight the job in the Job List, and select View Files from Job -> View Log Graphs. This task outputs a large number of graphs for analysing the data, and we will just look at some of them.
We can gauge the strength of the isomorphous differences by looking at the graphs:

Centric Normal probability v resolution and

Acentric Normal probability v resolution ...

for each native/derivative pair (the graph titles list the columns concerned, e.g. FP = FNAT, FPH = FHG2 SIGFHG2 DANOHG2 SIGDANOHG2, and likewise for FPTNCD25 etc.). For each graph, look at the line Gradient_on_reflection_prob.lt.0.9. Use the crosswires to estimate a rough value; e.g. for the native against the Hg derivative, the value is about 2.5 for centric data and 2.05 for acentric data.

The values can be summarised as (these values are contained in the file View Files from Job -> ...scaleit.summary):

 Table: Normal Probability for acentric data 
 
 Normal Prob.   |FNAT        FHG2        FPTNCD25    FIOD25 
 ---------------------------------------------------------------- 
 FNAT           |            2.051       9.458       10.873 
 FHG2           |2.051                   3.124       3.804 
 FPTNCD25       |9.458       3.124                   11.484 
 FIOD25         |10.873      3.804       11.484 
 
 Table: Normal Probability for Centric data 
 
 Normal Prob.   |FNAT        FHG2        FPTNCD25    FIOD25 
 ---------------------------------------------------------------- 
 FNAT           |            2.521       9.517       8.939 
 FHG2           |2.521                   3.482       4.042 
 FPTNCD25       |9.517       3.482                   9.946 
 FIOD25         |8.939       4.042       9.946

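This shows that the isomorphous difference (i.e. difference between native and derivative) is smallest for the Hg derivative and much larger for the Pt and iodine derivatives. Iodine gives the largest value for the acentric data (10.873), while Pt gives the largest value for the centric data (9.517).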

3b) Finding the first heavy atom sites by Patterson Search

The Problem

Before carrying out any experimental phasing it is necessary to know the atomic coordinates of the heavy atoms. We do this using isomorphous or anomalous differences. The isomorphous difference is a component of FH, and the anomalous difference is a component of 2F"H.

For the Hg derivative there is 1 Hg atom to be positioned.

For the Pt derivative there are 4 Pt atoms to be positioned.

For the I derivative there are 2 I atoms to be positioned.

One site can be found quite easily by a Patterson search. For four sites, a Patterson search is quite complicated. However, it is always good practice to calculate Pattersons using both the isomorphous and the anomalous differences for each derivative. Each pair should show a similar pattern of peaks.

Exercise

Select the Experimental Phasing module, and open the Generate Patterson Map task window.
On the first line, enter a suitable job title such as

Job title RNase Hg isomorphous Patterson 10 to 3.5A (mir tutorial step 100).
On the next line, select

Run FFT to generate difference Patterson

then select with the radio button

Plot default Harker map sections with coordinates of peaks in map
Select the input MTZ file

MTZ in TEST rnase25_scaleit1.mtz.

(If you do not have this file from the previous session, take the file from the DATA directory.)

Now select the columns from the MTZ file:

F1 FHG2 SIG1 SIGFHG2

F2 FNAT SIG2 SIGFNAT

Check that the output MAP file is given a sensible name

Map TEST rnase25-HGDISOpatterson1.map
It is VERY important to exclude outliers which are often due to measurement errors.

In the folder Exclude Reflections:

The cutoff for Exclude Reflections with difference between F1 and F2 > ? will be estimated from a scaleit analysis if you switch on the radio button; alternatively, enter your own value.

It is sensible to always exclude reflections with F less than n * sigmaF where n is 3 (for all data concerned).

You also need to select suitable resolution limits. Use the plots of Analysis of data vs. resolution from the scaleit run (View Files from Job -> View Log Graphs) to choose sensible limits; here enter

Resolution less than 10 Å or greater than 3.5 Å.
You should not need to change anything else, so select Run -> Run Now.
The Harker sections will be plotted - click on View Files from Job -> jobid...plt. It is a good idea to compare these plots for the isomorphous and anomalous Pattersons. They should have a similar pattern of peaks.
Now also generate an anomalous Patterson Map using DANOHG2, using the same MTZ file. Enter a meaningful job title.

Job title RNase Hg anomalous Patterson 10 to 3.5A (mir tutorial step 107)
On the next line, select

Run FFT to generate Patterson using anom diff (D) data

MTZ in TEST rnase25_scaleit1.mtz

Now select the columns from the MTZ file:

AnomDif DANOHG2 SigmaD SIGDANOHG2
Check that the output MAP file is given as

Map TEST rnase25-HGDANOpatterson1.map
You should not need to change anything else, so select Run -> Run Now.
The Harker sections will be plotted - click on View Files from Job -> jobid...plt. It is a good idea to compare these plots for the isomorphous and anomalous Pattersons. They should have a similar pattern of peaks. Have a look at RNase Harkers for pictures, comparison and discussion. The outcome of this is a heavy atom at x~±0.1±1/2, y~±0.1±1/2, z~±0.2±1/2.
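
To see why a single site produces one peak on each Harker section, the following sketch computes the vectors between a site and its symmetry mates. It assumes space group P212121 (the space group named later in this tutorial) and takes x=0.1, y=0.1, z=0.2 as one possible choice of signs for the Hg position. The function names are hypothetical and are not part of CCP4.

```python
# General positions of P212121, written per axis as (sign, translation).
P212121_OPS = [
    ((1, 0.0), (1, 0.0), (1, 0.0)),     # x, y, z
    ((-1, 0.5), (-1, 0.0), (1, 0.5)),   # 1/2-x, -y, 1/2+z
    ((-1, 0.0), (1, 0.5), (-1, 0.5)),   # -x, 1/2+y, 1/2-z
    ((1, 0.5), (-1, 0.5), (-1, 0.0)),   # 1/2+x, 1/2-y, -z
]


def apply_op(op, site):
    """Apply one symmetry operator and wrap the result into [0, 1)."""
    return tuple((sign * coord + shift) % 1.0
                 for (sign, shift), coord in zip(op, site))


def harker_vectors(site):
    """Vectors (u, v, w) from each symmetry mate to the site, wrapped into [0, 1)."""
    vectors = []
    for op in P212121_OPS[1:]:
        mate = apply_op(op, site)
        vectors.append(tuple(round((s - m) % 1.0, 4)
                             for s, m in zip(site, mate)))
    return vectors


# Approximate Hg position (assumed choice of signs, for illustration only).
for u, v, w in harker_vectors((0.1, 0.1, 0.2)):
    print(f"u={u:.3f}  v={v:.3f}  w={w:.3f}")
```

Each of the three vectors has one component equal to 1/2, which is why the peaks are looked for on the sections u=1/2, v=1/2 and w=1/2.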


3c) Find heavy atoms by direct methods or automated Patterson Search

You are now going to use a direct methods approach for locating the Hg sites. This is more useful (and more successful) when there are many sites. In this section, you will prepare the isomorphous data for use in the direct methods program SHELX.

Exercise

Select the Experimental Phasing module, and open the Prepare Data for HA Search task window.
On the first line, enter a suitable job title such as

Job title Run SHELX for Hg isomorphous data.
On the next line, select

Input SIR data and prepare data for SHELXD
Select the input MTZ file

MTZ in TEST rnase25_scaleit1.mtz

Now select the columns from the MTZ file:

FP FNAT SigFP SIGFNAT

FPH FHG2 SigFPH SIGFHG2
Check that the output SHELX hkl file is given as

Output HKL TEST rnase25_scaleit1HGDISO.hkl
Again it is VERY important to exclude outliers due to measurement errors. Use the same criteria as you chose for the Patterson. In the folder Exclude data when converting to Shelx format, use the radio buttons and fill in the following:

Resolution less than 10.0 Angstrom or greater than 3.5 Angstrom

FP less than n * sigmaF where n is 3.0

FPH less than n * sigmaFH where n is 3.0

Difference between F1 and F2 greater than 272.07

The value of 272.07 can be found in the log file of the difference Patterson (see above); search log file for "reflections excluded" and adapt as you see fit.
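
The combined effect of these four exclusion criteria can be sketched as a simple filter. This is only an illustration of the logic, not the code CCP4i runs: the function names, the dictionary layout and the cell dimensions are hypothetical, the resolution formula assumes an orthogonal cell, and the resolution test is read as "keep data between 10 and 3.5 Å".

```python
import math


def resolution(h, k, l, a, b, c):
    """d-spacing in Angstrom for a cell with all angles equal to 90 degrees."""
    return 1.0 / math.sqrt((h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2)


def keep_reflection(refl, cell, d_low=10.0, d_high=3.5,
                    n_sigma=3.0, max_diff=272.07):
    """Return True if the reflection passes all four exclusion tests."""
    d = resolution(*refl["hkl"], *cell)
    if d > d_low or d < d_high:                  # outside 10 - 3.5 A
        return False
    if refl["fp"] < n_sigma * refl["sigfp"]:     # weak native amplitude
        return False
    if refl["fph"] < n_sigma * refl["sigfph"]:   # weak derivative amplitude
        return False
    if abs(refl["fph"] - refl["fp"]) > max_diff:  # outlier difference
        return False
    return True


# Hypothetical cell and reflection, for illustration only.
cell = (40.0, 50.0, 60.0)
example = {"hkl": (4, 5, 6), "fp": 500.0, "sigfp": 10.0,
           "fph": 600.0, "sigfph": 10.0}
print(keep_reflection(example, cell))
```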
Select Run -> Run Now.
When the data preparation has finished, open the ShelxS - Heavy Atom Search task window.
On the first line, enter a suitable job title such as

Job title Shelx heavy atom search RNase locating Hg (mir tutorial step 208)
On the next line, select

Try to find heavy atoms by direct methods

And the next line should read

Input format is Shelx hkl file data is intensities
Select the input HKL and MTZ files

HKL in TEST rnase25_scaleit1HGDISO.hkl

MTZ in TEST rnase25_scaleit1.mtz
Most of the Cell Parameters folder should be filled in automatically. It now only needs the desired number of heavy atoms to find, and its/their type:

Search for 1 atoms of HG
No other parameters need to change, so select Run -> Run Now.
When the job has finished, return to the main window, highlight the job in the Job List, and select View Files from Job -> View Log File. Near the end of the log file, find:
```
Heavy-atom assignments: 
 
          x       y       z    s.o.f.   Height 
 
 HG1   0.8945  0.0947  0.8000  1.0000   418.3
```
Comparing this with the outcome of the Patterson map calculations above:

x_HG1 = -x_Patt (+ whole cell shift)
y_HG1 = y_Patt
z_HG1 = -z_Patt (+ whole cell shift)
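
These relations can be checked numerically. The sketch below assumes the Patterson estimate is taken as (0.1, 0.1, 0.2), i.e. one choice of the signs quoted in section 3b; the helper names and the tolerance of 0.02 are illustrative assumptions.

```python
def wrap(value):
    """Map a fractional coordinate into the range [0, 1)."""
    return value % 1.0


def same_site(a, b, tol=0.02):
    """True if two fractional positions agree within tol, ignoring whole-cell shifts."""
    for p, q in zip(a, b):
        delta = abs(wrap(p) - wrap(q))
        if min(delta, 1.0 - delta) > tol:
            return False
    return True


patterson = (0.1, 0.1, 0.2)        # assumed choice of signs from the Patterson
shelx = (0.8945, 0.0947, 0.8000)   # HG1 from the SHELX log above
flipped = (-patterson[0], patterson[1], -patterson[2])
print(same_site(flipped, shelx))
```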
Experience has shown that, for situations where only one or a few heavy atoms are sought, Shelx 'PATT' (CCP4i Protocol option 'Patterson search' rather than 'direct methods') may perform more reliably. For the mercury search in RNase, however, this does not seem to be the case. Try it, if you like. No other options need to change, although Try 4 superposition vectors (Shelx Patterson Search Parameters folder) sometimes helps.


3d) Heavy atom refinement for first derivative

The Problem

We now have an initial solution for 1 Hg site. Heavy atom refinement and phasing is done using the program MLPHARE. We will calculate cross-peak and difference map(s) to start looking for more heavy atoms.

Exercise

Stage 1. Refine Hg solution and search difference maps for more peaks.

Select the Experimental Phasing module, and open the Run Mlphare task window.
On the first line, enter a suitable job title such as

Job title Refining Hg reso 10-3.5A XYZ occ centric (mir tutorial step 300).
In the Protocol folder, select:

Use centric data only (you may have to switch off the 'Use anomalous difference data' option first).

There are enough centric observations to refine the Hg parameters.

Select

Output cross-peaks map(s)

Generate difference maps and do peak search for more heavy atoms

Select the input MTZ file:

MTZ in TEST rnase25_scaleit1.mtz

Now select the columns from the MTZ file.

FP	FNAT	SigmaFP	SIGFNAT
FPH1	FHG2	SigFPH1	SIGFHG2
Choose other derivatives for cross-peaks maps (Use 'Add Another Cross-Peak Derivative' for this)
FPH1	FPTNCD25	SigFPH1	SIGFPTNCD25
FPH2	FIOD25	SigFPH2	SIGFIOD25

Check that the output MTZ file has a sensible name, especially if you intend to save intermediate versions of it.
Enter a memorable MTZ output column label identifier, such as:

Output label identifier mlp-hg1
In the folder Key Parameters, select and enter:

Resolution limit from 10.0 to 3.5
In the folder Describe Derivatives & Refinement, select and enter:

Phase with&refine derivative One HG

Use isomorphous data to refine XYZ and refine occupancy

Use heavy atom site data entered below

HG 0.8945 0.0947 0.8 20.0 1.0
Select Run -> Run Now.

When the job has finished, return to the main window, highlight the job in the Job List, and select View Files from Job. In the list of output files, note the following:

TEST_jobnumber_1.ha	refined Hg parameters
TEST_jobnumber_FHG2.map	difference map calculated with coefficients FHG2-FNAT and phases from the mlphare run refining Hg parameters
TEST_jobnumber_FHG2_peaks.pdb	peaks found in difference map, can be viewed with graphics program
TEST_jobnumber_FHG2.ha	peaks from the difference map, in fractional coordinates, to be re-used by CCP4i
TEST_jobnumber_FPTNCD25.map	difference map ('cross-peaks map') calculated with coefficients FPTNCD25-FNAT and phases from Hg
TEST_jobnumber_FPTNCD25_peaks.pdb	as for HG derivative
TEST_jobnumber_FPTNCD25.ha	as for HG derivative
TEST_jobnumber_FIOD25...	as for PT derivative

Select View Files from Job -> TEST_jobnumber_1.ha. The occupancy of the Hg site has refined to around 0.4. This is respectable.
Select View Files from Job -> View Log Graphs. Graphs are given for the last refinement cycle and the final phasing cycle. Look in particular at:

Lack of closure analysis .... / Phasing power ...., Lack of closure analysis .... / Cullis Rfactor ....

For good data and a good derivative, the phasing power should be greater than 1, and the Cullis Rfactor should be significantly less than one. From the first graph, it can be seen that for one Hg site and just the centric data, the phasing power is not great. However, the structure was solved by this method, so on to the next stage.
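
As a rough guide to what these two statistics measure, the sketch below uses commonly quoted forms: phasing power as the r.m.s. calculated heavy atom amplitude divided by the r.m.s. lack of closure, and the Cullis Rfactor as the summed lack of closure divided by the summed isomorphous difference. The exact definitions and weighting used by MLPHARE may differ, and the function name and numbers are purely illustrative.

```python
import math


def phasing_stats(fp, fph_obs, fph_calc, fh_calc):
    """Return (phasing power, Cullis R) from lists of amplitudes.

    Lack of closure is taken here as |FPH|obs - |FPH|calc (an assumed,
    simplified definition).
    """
    closure = [obs - calc for obs, calc in zip(fph_obs, fph_calc)]
    rms_fh = math.sqrt(sum(f * f for f in fh_calc) / len(fh_calc))
    rms_closure = math.sqrt(sum(e * e for e in closure) / len(closure))
    iso_diff = sum(abs(obs - nat) for obs, nat in zip(fph_obs, fp))
    phasing_power = rms_fh / rms_closure
    cullis_r = sum(abs(e) for e in closure) / iso_diff
    return phasing_power, cullis_r


# Made-up amplitudes for two reflections, for illustration only.
power, cullis = phasing_stats(fp=[100.0, 200.0],
                              fph_obs=[120.0, 180.0],
                              fph_calc=[110.0, 190.0],
                              fh_calc=[20.0, 20.0])
print(f"phasing power = {power:.2f}, Cullis R = {cullis:.2f}")
```

With these definitions, a heavy atom model that explains the observed differences well gives a small lack of closure, hence a phasing power above 1 and a Cullis Rfactor below 1.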

Stage 2. Refine all plausible-looking Hg sites (coordinates and occupancy) and search for more peaks in all derivatives.

In the Run Mlphare task window, adapt the job title:

Job title Refining 4 poss Hg reso 10-3.5A XYZ occ centric (mir tutorial step 320)
In the Files folder, adapt the MTZ output filename if you wish. Re-using the same is not a problem in most cases. Adapt the column label for the output:

Output label identifier mlp-hg4
In the folder Describe Derivatives & Refinement, select and enter:

Phase with&refine derivative Four HG

Use heavy atom site data from HA file

HA in TEST TEST_jobnumber_FHG2.ha

Then click the View button to check and edit the ...FHG2.ha file.

In the .ha File Viewer, click Change all. This results in hashes (#) at the beginning of each line. CCP4i ignores any ATOM lines in .ha files which start with a hash, so remove the hash at the beginning of lines we want to keep through a click on the first four atom lines. Then click the Edit Columns button and enter:

Set atom names HG

Set occupancies 0.2

and click OK

Back in the .ha File Viewer, click Save&Exit.
Select Run -> Run Now. If you have left the name of the output MTZ file as it was, you will have to delete the old version of it.
When the job has finished, View Log Graphs. The phasing power for the centrics is now above 1 for some of the resolution ranges, and the Cullis Rfactor is coming down a little.

3e) Find heavy atoms for other derivatives

Including heavy atoms for other derivatives in the refinement can be a process of trial-and-error. It may be necessary to take a step back and remove sites to try others. Proposed stages are:

Phase on the Pt derivative peaks found from the Hg phasing, and refine their coordinates and occupancies.
Phase with and refine (coordinates and occupancies) all convincing Hg and Pt sites.
Include I derivative sites found from the phasing of the best derivative or of both other derivatives.

Some tips:

Remove any incorrect sites. Their occupancy will refine to a small or negative value. These may/will be obvious after refining against the centric data only; there are enough centric observations to refine the Hg parameters, and the occupancies. If you have no centric data, select the option Use every XXX-th reflection for refinement to refine against a subset of the reflections.
Look for new atom sites using Fourier difference (or double difference) maps. These can be calculated using the very poor phase estimates calculated from the Hg sites after the preliminary centric refinement. The output MTZ file has phases for all the data to the requested resolution limit. A difference Fourier will contain the present sites plus potential new sites; a double difference Fourier will only show potential new sites (and may therefore be more difficult to interpret). The peak height for a new site will be positive, but probably only 20-30% of the height of sites included in the refinement. Beware: these maps are VERY noisy. Then add any new sites, and repeat the refinement and difference Fourier calculations.
MLPHARE provides useful graphs to monitor your progress. The most useful are those labelled: Lack of Closure analysis v resln and Anomalous lack of closure v resln. The Cullis Rfactor and the Anomalous Cullis Rfactor should be less than 1, and any "improvement" to the sub-structure should reduce them. The Phasing power should increase as the solution improves.
When you are satisfied that you have the best Hg solution, use the phases derived from the Hg sites to calculate difference Fouriers for the Pt and I derivatives; select likely Pt and I sites and refine them against their centric differences, and finally combine all three derivatives to generate the final phases.
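
The first tip can be applied as a simple screen after each refinement pass. The sketch below separates refined sites into those worth keeping and those to reconsider. The threshold of 0.1, the function name and the dictionary layout are assumptions for illustration; the sketch does not read the real .ha file format, and apart from 0.4 and 0.049 (values quoted in this tutorial) the occupancies are invented.

```python
def split_sites(sites, min_occupancy=0.1):
    """Separate refined sites into (keep, suspect) by refined occupancy."""
    keep, suspect = [], []
    for site in sites:
        if site["occ"] >= min_occupancy:
            keep.append(site)
        else:
            suspect.append(site)   # small or negative occupancy
    return keep, suspect


# Illustrative values only; 0.049 is the kind of occupancy to distrust.
refined = [
    {"name": "ATOM1", "occ": 0.40},
    {"name": "ATOM2", "occ": 0.21},
    {"name": "ATOM3", "occ": -0.03},
    {"name": "ATOM4", "occ": 0.049},
]
keep, suspect = split_sites(refined)
print("keep:   ", [s["name"] for s in keep])
print("suspect:", [s["name"] for s in suspect])
```

A fixed cutoff is only a starting point: a suspect site should also be checked against the peak list from the difference map before it is discarded.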

To do this, you need to run MLPHARE several times. Steps using centric data (or a subset of the full data set) will be very fast. The heavy atom parameters are held in a .ha file, which is updated after each pass. The output MTZ file will be used as input for the difference Fouriers.

Exercise

Stage 1a. Phase with and refine potential Pt sites. The first peak found for Pt is actually in exactly the same position as the original Hg solution. At this stage it cannot be determined whether this is an artefact of the Mlphare refinement, or a real solution for Pt. It is therefore best to exclude it from phasing and refinement.

In the Run Mlphare task window, adapt the job title:

Job title Refining 2 poss Pt reso 10-3.5A XYZ occ centric (mir tutorial step 400)
In the Files folder, select the input MTZ file column labels according to the following:

FP FNAT SigmaFP SIGFNAT

FPH1 FPTNCD25 SigFPH1 SIGFPTNCD25

Choose other derivatives for cross-peaks maps

FPH1 FHG2 SigFPH1 SIGFHG2

FPH2 FIOD25 SigFPH2 SIGFIOD25

Adapt the MTZ output filename if you wish. Re-using the same is not a problem in most cases. Adapt the column label for the output:

Output label identifier mlp-pt2
In the folder Describe Derivatives & Refinement, select and enter:

Phase with&refine derivative Two PT from HG

Use heavy atom site data from HA file

HA in TEST TEST_jobnumber_FPTNCD25.ha

Then click the View button to be able to edit the ...FPTNCD25.ha file. If all is well, ATOM1 will have coordinates of x~0.4, y~0.4, z~0.2. This is also the position of the first Hg solution, so to be safe, this should not be used as a Pt site in the first instance.
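
That x~0.4, y~0.4, z~0.2 is the same position as the Hg site found by SHELX (0.8945, 0.0947, 0.8000) follows from the space group symmetry. The sketch below lists the symmetry equivalents of the SHELX site, assuming the standard general positions of P212121; the names are illustrative and not part of CCP4.

```python
# General positions of P212121 as functions of fractional (x, y, z).
P212121_OPS = [
    lambda x, y, z: (x, y, z),
    lambda x, y, z: (0.5 - x, -y, 0.5 + z),
    lambda x, y, z: (-x, 0.5 + y, 0.5 - z),
    lambda x, y, z: (0.5 + x, 0.5 - y, -z),
]


def equivalents(site):
    """All symmetry equivalents of a site, wrapped into the unit cell."""
    return [tuple(round(c % 1.0, 4) for c in op(*site))
            for op in P212121_OPS]


hg_site = (0.8945, 0.0947, 0.8000)   # HG1 from the SHELX log
for x, y, z in equivalents(hg_site):
    print(f"{x:.4f} {y:.4f} {z:.4f}")
```

The fourth operator (1/2+x, 1/2-y, -z) maps the SHELX site onto approximately (0.39, 0.41, 0.20), which matches the ATOM1 position.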

In the .ha File Viewer, click Change all. Then click on the lines beginning ATOM2 and ATOM3 (i.e. the two highest scoring ones but not the one in the Hg position). These can now be used to phase and refine. Click the Edit Columns button and enter:

Set atom names PT

Set occupancies 0.2

and click OK

Back in the .ha File Viewer, click Save&Exit.
Select Run -> Run Now. If you have left the name of the output MTZ file as it was, you will have to delete the old version of it.
When the job has finished, View Log Graphs. The phasing power and Cullis Rfactor for the two Pt sites are better than those for the four Hg sites. This is very promising. Also view the file TEST_jobnumber_FPTNCD25.ha. The peak for the 'Hg site', which was not included in phasing and refinement, comes up high in the list and is therefore almost certainly a genuine Pt site, too. It can be included in the next stage.

Stage 1b. Phase with and refine all plausible-looking Pt sites.

In the Run Mlphare task window, adapt the job title:

Job title Refining 4 poss Pt reso 10-3.5A XYZ occ centric (mir tutorial step 410)
In the Files folder, adapt the MTZ output filename if you wish. Re-using the same is not a problem in most cases. Adapt the column label for the output:

Output label identifier mlp-pt4
In the folder Describe Derivatives & Refinement, select and enter:

Phase with&refine derivative Four PT

Use heavy atom site data from HA file

HA in TEST TEST_jobnumber_FPTNCD25.ha

Then click the View button to be able to edit the ...FPTNCD25.ha file.

In the .ha File Viewer, click Change all and then the first four ATOM lines (which should include the 'Hg site'). Click the Edit Columns button and enter:

Set atom names PT

Set occupancies 0.3

and click OK

Back in the .ha File Viewer, click Save&Exit.
Select Run -> Run Now. If you have left the name of the output MTZ file as it was, you will have to delete the old version of it.
When the job has finished, View Log Graphs. The phasing power and Cullis Rfactor for the Pt derivative have improved considerably. From the ...FPTNCD25.ha file it can be seen that there is a fifth probable Pt site. It is worth considering whether to include that in the next stage.

Stage 2. Phase with and refine Pt and Hg sites.

In the Run Mlphare task window, adapt the job title:

Job title Refining 4Pt and 3Hg reso 10-3.5A XYZ occ centric (mir tutorial step 420)

In the Files folder, select the input MTZ file column labels according to the following:

FP	FNAT	SigmaFP	SIGFNAT
FPH1	FPTNCD25	SigFPH1	SIGFPTNCD25
FPH2	FHG2	SigFPH2	SIGFHG2	(Use 'Add another derivative')
Choose other derivatives for cross-peaks maps (Use 'Edit list -> Delete selected item', and RIGHT mouse button on any widget in the HG2 line)
FPH1	FIOD25	SigFPH1	SIGFIOD25

Adapt the MTZ output filename if you wish. Re-using the same is not a problem in most cases. Adapt the column label for the output:

Output label identifier mlp-pt4-hg3

In the folder Describe Derivatives & Refinement, subfolder Derivative number 1, select and enter:

Phase with&refine derivative Four PT

Use heavy atom site data from HA file

HA in TEST TEST_jobnumber_1.ha from the previous job, where 4 Pt sites were refined.

In subfolder Derivative number 2, select and enter:

Phase with&refine derivative Three HG

Use heavy atom site data from HA file

HA in TEST TEST_jobnumber_1.ha from the most recent job in which Hg sites were refined.

For the HG .ha file, click the View button. Click on the lines concerned with ATOM4 (ATOM4 and the ATREF line below it) to exclude that site, because its occupancy refined to an unbelievably low value (0.049). Then click Save&Exit.
Select Run -> Run Now. If you have left the name of the output MTZ file as it was, you will have to delete the old version of it.
When the job has finished, View Log Graphs. The phasing powers and Cullis Rfactors have not improved (if anything, they have deteriorated) compared to the phasing and refinement runs for the derivatives separately. Checking the refined Hg parameters against those calculated from the difference map, something seems to have gone awry. The first three peaks of the difference map do not match the three Hg sites from the refinement. Time to re-run with the three highest peaks from the difference map.

Stage 2b. Phase with and refine 4 previously refined Pt sites and 3 different Hg sites from the difference map.

In the Run Mlphare task window, adapt the job title:

Job title Refining 4Pt and 3differentHg reso 10-3.5A XYZ occ centric (mir tutorial step 430)
Adapt the MTZ output filename if you wish. Re-using the same is not a problem in most cases. Adapt the column label for the output:

Output label identifier mlp-pt4-hgdiff3
In the folder Describe Derivatives & Refinement, subfolder Derivative number 1, select and enter:

Phase with&refine derivative Four PT

Use heavy atom site data from HA file

HA in TEST TEST_jobnumber_1.ha from the previous job, where 4 Pt sites were refined.

In subfolder Derivative number 2, select and enter:

Phase with&refine derivative Three different HG

Use heavy atom site data from HA file

HA in TEST TEST_jobnumber_FHG2.ha from the previous job, from the difference map calculations.

For the HG .ha file, click the View button. In the .ha File Viewer, click Change all and then the first three ATOM lines. Click the Edit Columns button and enter:

Set atom names HG

Set occupancies 0.2

and click OK

Back in the .ha File Viewer, click Save&Exit.
Select Run -> Run Now. If you have left the name of the output MTZ file as it was, you will have to delete the old version of it.
When the job has finished, View Log Graphs. The phasing power and Cullis Rfactor for the Pt derivative have improved marginally. However, the statistics for the Hg derivative have improved markedly. Time to include the iodine derivative.

Stage 3a. Include the iodine derivative. Take peaks from the difference map calculated with phases from the best derivative (Pt, i.e. from the end of stage 1) only, or from both derivatives together (i.e. from the end of stage 2).

In the Run Mlphare task window, adapt the job title:

Job title Refining 4Pt 3Hg 3I reso 10-3.5A XYZ occ centric (mir tutorial step 440)
Un-select the option to output cross-peaks map(s).
In the Files folder, select the input MTZ file column labels according to the following:

FP FNAT SigmaFP SIGFNAT

FPH1 FPTNCD25 SigFPH1 SIGFPTNCD25

FPH2 FHG2 SigFPH2 SIGFHG2

FPH3 FIOD25 SigFPH3 SIGFIOD25 (Use 'Add another derivative')

Adapt the MTZ output filename if you wish. Re-using the same is not a problem in most cases. Adapt the column label for the output:

Output label identifier mlp-pt4-hg3-i3
In the folder Describe Derivatives & Refinement, subfolder Derivative number 1, select and enter:

Phase with&refine derivative Four PT

Use heavy atom site data from HA file

HA in TEST TEST_jobnumber_1.ha from the previous job, where 4 Pt sites were refined.

In subfolder Derivative number 2, select and enter:

Phase with&refine derivative Three HG

Use heavy atom site data from HA file

HA in TEST TEST_jobnumber_2.ha from the previous job, where 3 Hg sites were refined.

In subfolder Derivative number 3, select and enter:

Phase with&refine derivative Three I

Use heavy atom site data from HA file

HA in TEST TEST_jobnumber_FIOD25.ha from the previous job, from the difference map calculations.

For the I .ha file, click the View button. In the .ha File Viewer, click Change all and then the first three ATOM lines. Click the Edit Columns button and enter:

Set atom names I

Set occupancies 0.2

and click OK

Back in the .ha File Viewer, click Save&Exit.
Select Run -> Run Now. If you have left the name of the output MTZ file as it was, you will have to delete the old version of it.
When the job has finished, View Log Graphs. The statistics for Pt and Hg are still improving slightly, and the iodine derivative looks promising. Investigate possible improvements from inclusion of more sites for any of the derivatives.

Stage 3b. [Optional] The heavy atom structure is now probably complete enough for final refinement and phasing. The statistics may, however, improve with the inclusion of more sites for any or all of the derivatives. Investigate. Be aware of symmetry-related sites and 'shoulders'.

Try including 5 Pt, 4 Hg, 4 I from their respective difference maps. Edit the occupancies to a value slightly lower than the one they refined to in the previous run. If the occupancy refines back to the previous (higher) value, this gives an additional check on the validity of the site. Suggested values: 0.4 for PT, 0.3 for HG and I.
When the job has finished, View Log Graphs and compare with the graphs from the previous run (4 Pt, 3 Hg, 3 I). Also, check the refined values for the occupancies of new and previously included heavy atom sites, and whether the refined sites are at the top of the peak list from the difference map calculations. The fourth Hg site does not refine to a satisfactory value, and the statistics for the Hg derivative do not improve. The fifth Pt and fourth I, however, improve statistics and refine to a satisfactory occupancy, and the first four Pt and first three I sites behave as hoped.
Try taking out the worst Hg site again. Use the refined sites from step 450 as input, without re-setting occupancies.
When the job has finished, View Log Graphs, refined sites and difference map peak lists as before. The refinement is now stable, and no new sites suggest themselves for inclusion.

3f) Final Refinement and Phasing

The Problem

Phasing and refining (coordinates, real occupancy and anomalous occupancy parameters), against all data, including anomalous.

Exercise

In the Run Mlphare task window, enter a suitable job title such as:

Job title Refinement against all data (mir tutorial step 500).
In the first folder:

First de-select Use centric data only

Then select Use anomalous difference data

Generate difference maps and do peak search for more heavy atoms

In the Files folder, select the input MTZ file:

MTZ in TEST rnase25_scaleit1.mtz

Now select the columns from the MTZ file.

FP	FNAT	SigmaFP	SIGFNAT
FPH1	FPTNCD25	SigFPH1	SIGFPTNCD25
DPH1	DANOPTNCD25	SigDPH1	SIGDANOPTNCD25
FPH2	FHG2	SigFPH2	SIGFHG2
DPH2	DANOHG2	SigDPH2	SIGDANOHG2
FPH3	FIOD25	SigFPH3	SIGFIOD25
DPH3	DANOIOD25	SigDPH3	SIGDANOIOD25

Check that the output MTZ file is given as

Output MTZ TEST rnase25_mlphare2.mtz

Enter a memorable MTZ output column label identifier, such as:

Output label identifier mlp-all-anom
In the folder Key parameters, enter resolution limits to include all the data between 50 and 2.5Å:

Resolution limit from 50.0 to 2.5.
In the folder Describe Derivatives & Refinement, subfolder Derivative Number 1, select and enter:

Phase with&refine derivative Five PT anom

Use isomorphous data to refine XYZ and refine occupancy Refine anom occ

Use heavy atom site data from HA file

HA in TEST TEST_jobnumber_1.ha from the previous job, where 5 Pt sites were refined.

In subfolder Derivative number 2, select and enter:

Phase with&refine derivative Three HG anom

Use heavy atom site data from HA file

HA in TEST TEST_jobnumber_2.ha from the previous job, where 3 Hg sites were refined.

In subfolder Derivative number 3, select and enter:

Phase with&refine derivative Four I anom

Use heavy atom site data from HA file

HA in TEST TEST_jobnumber_3.ha from the previous job, where 4 I sites were refined.

For all three .ha files, click the View button, then the Edit Columns button and enter:

Set anomalous occupancies 0.1

and click OK

Back in the .ha File Viewer, click Save&Exit.
Select Run -> Run Now.
When the job has finished, return to the main window, highlight the job in the Job List, and select View Files from Job -> View Log Graphs. The statistics look good for all three derivatives. Then look at Anomalous lack of closure analysis .... / Ano Cullis Rfactor ..... For good data the anomalous Cullis Rfactor should be significantly less than one. However, none of the three derivatives has particularly good data (and the Hg derivative is the worst of the three). This explains why the anomalous Patterson maps are (almost) uninterpretable.

Also, look at the refined sites (in the ..._1.ha, ..._2.ha and ..._3.ha files). All (or nearly all) anomalous occupancies have refined to a negative value. This means that the refinement has been performed on the wrong hand.
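As a rough illustration of what the anomalous Cullis R factor measures, the sketch below computes the ratio of the summed anomalous lack of closure to the summed observed anomalous difference. The function name and all input numbers are hypothetical. MLPHARE reports the statistic per resolution bin and its exact calculation may differ, so treat this as a simplified picture rather than the program's own method.

```python
def anomalous_cullis_r(dano_obs, dano_calc):
    """Summed |lack of closure| divided by summed |observed anomalous difference|."""
    if len(dano_obs) != len(dano_calc):
        raise ValueError("observed and calculated lists must have the same length")
    lack_of_closure = sum(abs(o - c) for o, c in zip(dano_obs, dano_calc))
    total_signal = sum(abs(o) for o in dano_obs)
    if total_signal == 0:
        raise ValueError("no anomalous signal in the observed differences")
    return lack_of_closure / total_signal


# Made-up anomalous differences, for illustration only.
observed = [4.0, -3.0, 2.0, -5.0]
reasonable_model = [3.0, -2.5, 1.5, -4.0]
# Assumption: flipping the sign of the calculated differences mimics a wrong-hand model.
sign_flipped_model = [-c for c in reasonable_model]

print(anomalous_cullis_r(observed, reasonable_model))    # well below 1
print(anomalous_cullis_r(observed, sign_flipped_model))  # above 1
```

In this toy case the sign-flipped model gives a ratio above one, which is the same symptom as the negative anomalous occupancies seen in the refined .ha files.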

3g) Correcting the hand

The Problem

The procedure for locating the Hg sites cannot distinguish between a particular set of sites and the same set of sites transformed through a point of inversion, i.e. it cannot distinguish the hand of the solution. Therefore, the previous phasing run should be repeated using the opposite hand.
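The inversion described above amounts to negating every fractional coordinate. The following minimal sketch shows the operation on a list of sites; the function name and the coordinates are hypothetical and are not taken from the tutorial's .ha files. It assumes a space group such as P212121 that is unchanged by inversion. For an enantiomorphic pair (for example P41 and P43) the space group would have to be changed as well.

```python
def invert_sites(sites):
    """Invert fractional coordinates through the origin: (x, y, z) -> (-x, -y, -z)."""
    return [(-x, -y, -z) for x, y, z in sites]


# Hypothetical fractional coordinates, for illustration only.
example_sites = [(0.10, 0.25, 0.40), (0.33, 0.05, 0.72)]

for site in invert_sites(example_sites):
    print(site)
```

Applying the function twice returns the original set of sites, which is why the two hands cannot be told apart from the site positions alone.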

Exercise

In the Run Mlphare task window, adapt the job title:

Job title Refinement against all data correcting hand (mir tutorial step 600).
Make sure to generate difference maps, which will be used later to shift the heavy atom coordinates to a 'friendly' asymmetric unit.
Use the MTZ file and column selection from the previous run (step 500).
Choose a suitable name for the output MTZ file, reflecting the different hand, such as:

MTZ out TEST rnase25_mlphare2h.mtz

It should be no problem to re-use the output column label identifier.
For the derivative descriptions, start from the same set as for step 500.

For the Pt .ha file, click the View button. In the .ha File Viewer, click the Reverse hand button. A small 'Change Hand' window will appear. Our spacegroup is P212121, so there is no need to type the spacegroup name. Just click OK. Back in the File Viewer window, all coordinates have changed to negative values. Click Save As.. and change the output filename to reflect the change of hand (e.g. TEST_jobnumber_1h.ha). Back in the File Viewer, click Quit. Back with the description for Derivative number 1 (Five Pt anom), click the Browse button and select the .ha file just saved (TEST_jobnumber_1h.ha).

Repeat this for the Hg and I .ha files (i.e. create ..._2h.ha and ..._3h.ha respectively, and select those for Derivative number 2 and 3).
Select Run -> Run Now.
When the job has finished, View Log Graphs. The Anomalous Cullis Rfactor has improved, at least for the Pt and I derivatives. All but one (one of the iodine sites) of the anomalous occupancies have refined to a value higher than the starting value of 0.1.

Comparing TEST_jobnumber_1.ha with TEST_jobnumber_FPTNCD25.ha, it should be easy to see how to transfer the heavy atom sites to a 'friendly' asymmetric unit.
[Optional] It is easy to transfer the heavy atom sites to an asymmetric unit with all positive coordinates (which fits with the refined coordinates as they can be found in the Protein Data Bank).
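One simple way to see why negative coordinates can be replaced by positive ones is that adding whole unit-cell translations to a fractional coordinate gives an equivalent position. The sketch below wraps each coordinate into the range 0 to 1. The function name and the coordinates are hypothetical. The tutorial itself takes the sites from the difference map peaks, and a 'friendly' asymmetric unit may also involve symmetry operators or origin shifts that this sketch does not apply.

```python
def wrap_to_unit_cell(sites):
    """Shift each fractional coordinate by whole cell translations into [0, 1)."""
    return [tuple(c % 1.0 for c in site) for site in sites]


# Hypothetical hand-reversed sites with negative fractional coordinates.
negative_sites = [(-0.10, -0.25, -0.40), (-0.33, -0.05, -0.72)]

for site in wrap_to_unit_cell(negative_sites):
    print(site)
```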

In the Run Mlphare task window, adapt the job title:

Job title Transferring ha coordinates to friendly asymmetric unit (mir tutorial step 606)
Leave all MTZ input and output data as before.
In the folder Describe Derivatives & Refinement, subfolder Derivative number 1, select and enter:

Phase with&refine derivative Five PT positive

Use heavy atom site data from HA file

HA in TEST TEST_jobnumber_FPTNCD25.ha from the previous job, from the difference map calculations.

In subfolder Derivative number 2, select and enter:

Phase with&refine derivative Three HG positive

Use heavy atom site data from HA file

HA in TEST TEST_jobnumber_FHG2.ha from the previous job, from the difference map calculations.

In subfolder Derivative number 3, select and enter:

Phase with&refine derivative Four I positive

Use heavy atom site data from HA file

HA in TEST TEST_jobnumber_FIOD25.ha from the previous job, from the difference map calculations.

Edit all three .ha files such that only the desired sites are included for refinement, and give appropriate names and occupancies. Suggested values:

atom	sites	occupancy	anomalous occupancy
PT	first 5	0.4	0.3
HG	first 3	0.3	0.2
I	first 4	0.4	0.3

This still includes the low-occupancy iodine site.
Select Run -> Run Now.
When the job has finished, check everything is in order. The low-occupancy iodine site seems to be just that.

3h) Improving the density

Before going on to model building and refinement, the phases can be improved through density modification/improvement. For this, the program 'dm' is used.

Exercise

Select the Density Improvement module, and open the Run DM task window.
On the first line, enter a suitable job title such as:

Job title DM on MIR phases (mir tutorial step 700).
Select Input Hendrickson-Lattman coefficients
Select the input MTZ file:

MTZ in TEST rnase25_mlphare2h.mtz

Now select the columns from the MTZ file.

FP	FNAT	SIGFP	SIGFNAT
PHIO	PHIB_mlp-all-anom	Weight	FOM_mlp-all-anom
HLA	HLA_mlp-all-anom	HLB	HLB_mlp-all-anom
HLC	HLC_mlp-all-anom	HLD	HLD_mlp-all-anom
In the Required Parameters folder, enter the solvent content as

Fraction solvent content 0.46.
Everything else can be left as default, so Run -> Run Now.
When the job has finished, look at the log file and loggraph to check statistics and solvent boundaries. To really appreciate the results, it would be best to calculate maps (one with the phases from just MLPHARE, one with phases from DM) and compare.
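The solvent fraction that DM needs is commonly estimated from the Matthews coefficient V_M (cell volume divided by the total molecular weight in the cell) using the approximation solvent fraction = 1 - 1.23 / V_M. The sketch below shows that arithmetic. The function names and the input numbers are hypothetical round values and are not the RNase cell or molecular weight used in this tutorial.

```python
def matthews_coefficient(cell_volume_a3, molecules_per_cell, mol_weight_da):
    """V_M in cubic Angstrom per Dalton."""
    if molecules_per_cell <= 0 or mol_weight_da <= 0:
        raise ValueError("molecule count and molecular weight must be positive")
    return cell_volume_a3 / (molecules_per_cell * mol_weight_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from the Matthews coefficient."""
    return 1.0 - 1.23 / vm


# Hypothetical values, for illustration only.
vm = matthews_coefficient(cell_volume_a3=200000.0,
                          molecules_per_cell=4,
                          mol_weight_da=20000.0)
print(vm)
print(solvent_fraction(vm))
```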


On to the next tutorial - Experimental Phasing (by MAD).

Back to the previous tutorial - Data Processing and Reduction.

Back to the index.

Prepared by Liz Potterton and Martyn Winn, 2000

Adapted by Eleanor Dodson and Maria Turkenburg, 2002-2003
