CCP4 Interface: Validation & Deposition Module

	CCP4i: Graphical User Interface

Tasks in this module:

Validate Model and/or Data - Sfcheck, Procheck & Rampage

Run WHATCHECK

Run ROTAMER

Structure Factors for Deposition

Data Harvesting Manager

Run R500 - validate Remark 500 data in PDB file

The layout of each task window, i.e. the number of folders present, and whether these folders are open or closed by default, depends on the choices made in the Protocol folder of the task (see Introduction). Although certain folders are closed by default, there are specific reasons why you should or may want to look at them. These reasons are described in the Task Window Layout sections below.

Validate Model and/or Data - Sfcheck, Procheck & Rampage

This task should be used to validate your structure before attempting to deposit it with the PDB. See the entry in the refinement module for further details.

See program documentation: SFCHECK, PROCHECK, RAMPAGE.

Run WHATCHECK

This task should be used to validate your structure before attempting to deposit it with the PDB.

N.B. WHATCHECK is not distributed as part of CCP4. In order to be able to run it from CCP4i, the WHAT_CHECK program needs to be obtained separately; it is part of the WHAT_IF suite of programs.

Run ROTAMER

This task should be used to validate your structure before attempting to deposit it with the PDB. The ROTAMER task reads a protein coordinate file in PDB format and lists all amino acids whose side chain torsion angles deviate by more than a user-defined threshold from the rotamers of the "Penultimate Rotamer Library".

See program documentation: ROTAMER.
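
The threshold test described above can be illustrated with a minimal Python sketch. It is not the ROTAMER implementation: the function names, the residue labels, and the chi values in the small library below are hypothetical placeholders (they are not taken from the Penultimate Rotamer Library), and the deviation measure (largest per-angle difference to the closest rotamer) is an assumption made for illustration.

```python
def angular_difference(a, b):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    d = (a - b) % 360.0
    return min(d, 360.0 - d)


def max_chi_deviation(observed, reference):
    """Largest per-angle difference between observed and reference chi angles."""
    return max(angular_difference(o, r) for o, r in zip(observed, reference))


def nearest_rotamer_deviation(observed, rotamers):
    """Deviation from the closest rotamer in the list (assumed measure)."""
    return min(max_chi_deviation(observed, ref) for ref in rotamers)


def flag_residues(residues, library, threshold):
    """Return (label, residue name, deviation) for residues above the threshold."""
    flagged = []
    for label, resname, chis in residues:
        rotamers = library.get(resname)
        if not rotamers:
            continue  # residue type without side chain torsions in the library
        deviation = nearest_rotamer_deviation(chis, rotamers)
        if deviation > threshold:
            flagged.append((label, resname, deviation))
    return flagged


# Hypothetical placeholder data: one chi angle per serine, three reference rotamers.
library = {"SER": [(62.0,), (-177.0,), (-65.0,)]}
residues = [
    ("A 12", "SER", (60.0,)),
    ("A 47", "SER", (120.0,)),
    ("A 50", "GLY", ()),
]

flagged = flag_residues(residues, library, threshold=30.0)
for label, resname, deviation in flagged:
    print(f"{resname} {label}: deviates by {deviation:.1f} degrees")
```

Torsion angles wrap around at 360 degrees, which is why the difference is taken modulo 360 rather than by simple subtraction.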

Structure Factors for Deposition

This task uses the program MTZ2VARIOUS to convert structure factors from MTZ format to mmCIF, suitable for deposition.

You are given a list of columns from which you should choose those to be deposited. Usually, FP and its associated Sigma, and FreeR are sufficient. Additional columns are presented for selection if you indicate that there is anomalous data. The interface will try to give sensible defaults, but please check these carefully.

All columns will be output to the mmCIF file, but reflections can be flagged in a number of ways. The default scheme can be changed in the "Options to flag reflections" folder.

CIF files can contain multiple datasets (from multiple MTZ files), but there is no provision for the output of derivative data in the same data block as native data, so each dataset must go in a separate data block. However, the Interface (or rather the CCP4 program performing the conversion) caters for only one MTZ file and thus one data block. Each CIF data block must have a name beginning with the characters 'data_'. The Interface derives a name from the input MTZ filename, but you can change it. You cannot output an arbitrary set of columns; the MTZ program labels are restricted to:

CIF label	MTZ program label
_refln.F_meas_au	FP (or F+ [and F-])
_refln.F_meas_sigma_au	SIGFP (or SIGF+ [and SIGF-])
_refln.F_calc	FC
_refln.phase_calc	PHIC
_refln.phase_meas	PHIB
_refln.fom	FOM
_refln.intensity_meas	I (or I+ [and I-])
_refln.intensity_sigma	SIGI (or SIGI+ [and SIGI-])
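
As an illustration of the table and the data block naming rule above, here is a minimal Python sketch. The label table is transcribed from the table above; everything else is an assumption: `data_block_name` is a hypothetical helper, and the way it cleans up the filename is only a guess at a sensible scheme, not the rule the Interface actually applies.

```python
import re
from pathlib import Path

# CIF label -> permitted MTZ program labels, transcribed from the table above.
LABEL_TABLE = {
    "_refln.F_meas_au": ("FP", "F+", "F-"),
    "_refln.F_meas_sigma_au": ("SIGFP", "SIGF+", "SIGF-"),
    "_refln.F_calc": ("FC",),
    "_refln.phase_calc": ("PHIC",),
    "_refln.phase_meas": ("PHIB",),
    "_refln.fom": ("FOM",),
    "_refln.intensity_meas": ("I", "I+", "I-"),
    "_refln.intensity_sigma": ("SIGI", "SIGI+", "SIGI-"),
}

# Reverse lookup: MTZ program label -> CIF label.
MTZ_TO_CIF = {mtz: cif for cif, labels in LABEL_TABLE.items() for mtz in labels}


def data_block_name(mtz_path):
    """Derive a block name starting with 'data_' from an MTZ filename.

    Assumption: characters other than letters, digits, '_', '.' and '-'
    are replaced by underscores. The Interface may use a different rule.
    """
    stem = Path(mtz_path).stem
    safe = re.sub(r"[^A-Za-z0-9_.-]", "_", stem)
    return "data_" + safe


print(data_block_name("lysozyme native.mtz"))
print(MTZ_TO_CIF["SIGFP"])
```

Labels that do not appear in the table, such as the free R flag, are not covered by this lookup; `MTZ_TO_CIF.get(label)` returns `None` for them.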

See program documentation: MTZ2VARIOUS.

For more information on mmCIF, see The mmCIF Home Page at the IUCr.

Data Harvesting Manager

The Data Harvesting Manager is a tool to manage and maintain the harvest files produced by various CCP4 programs. It runs a number of tasks on these harvest files to prepare for deposition: it can check the validity of the harvest files, check the consistency of the data between different harvest files of the same dataset, convert the CIF harvest files into XML, and run PDB_EXTRACT to extract additional information for deposition from log files and program output files.

See the full program documentation for the Data Harvesting Manager for more information.

Run R500 - validate Remark 500 data in PDB file

The R500 program is a utility that can be used to check the data in the REMARK 500 lines of a PDB file before submission to the Protein Data Bank. Various checks are performed, and problems (e.g. with atom names) are flagged.

See program documentation: R500.
