mon_lib - multi-purpose dictionary for macromolecules
This dictionary can be used for several purposes: refinement, graphics, validation. The extended mmCIF format makes the dictionary self-explanatory and easy to adapt or extend with new information.
The LIBCHECK program was developed to manage and check the information in the dictionary. It can also create new dictionary entries from different sources: PDB, CDS, CIF, SMILES.
MON_LIB defines:
All this information is constant, i.e. independent of the conformation of the molecule. A CIFile, describing a macromolecule, must contain the variable information (coordinates, occupancy, B-factors) and the list of modifications to and links between actual monomers.
The information about the names of chains and monomers, and the serial numbers of monomers for the links must be present in the CIFile or PDB file of coordinates.
The values for the amino acid bond lengths and angles have been taken from Engh and Huber, Acta Cryst. A47, 392-400 (1991).
The values for the purine and pyrimidine bond lengths and angles have been taken from O. Kennard & R. Taylor (1982), J. Am. Chem. Soc. vol. 104, pp. 3209-3212.
The values for the sugar-phosphate backbone bond lengths and bond angles have been taken from W. Saenger's "Principles of Nucleic Acid Structure" (1983), Springer-Verlag, pp. 70, 86.
Definition:
A monomer is a set of atoms connected by bonds, or a single atom.
For example, it may be an amino acid, a polypeptide chain, or a polypeptide chain and a substrate connected by hydrogen bonds.
It is useful for some programs (dynamics, graphics and so on) to define a tree-like structure of the monomers. Specifying a 'back atom' and a 'forward atom' for each atom of the monomer defines this tree-like structure. The 'back atom' of a given atom is its preceding atom in the tree. If an atom has some forward branches, their order is
The following categories describe the monomers:
Let v1 be the vector from atom_1 to atom_2, v2 the vector from atom_1 to atom_3, and v3 the vector from atom_1 to atom_4. The chiral volume is the volume of the parallelepiped formed by the three vectors v1, v2, v3:
    VOLUME = v1 . [ v2 x v3 ]
The type is the sign of the chiral volume.
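The chiral volume definition above can be sketched in a few lines of Python. This is a minimal illustration, not the code used by any particular program; the function names (chiral_volume, volume_sign) and the tolerance are assumptions made for the example.

```python
def sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    """Scalar product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    """Vector product a x b of two 3D vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def chiral_volume(atom_1, atom_2, atom_3, atom_4):
    """VOLUME = v1 . [ v2 x v3 ], with all vectors starting at atom_1."""
    v1 = sub(atom_2, atom_1)
    v2 = sub(atom_3, atom_1)
    v3 = sub(atom_4, atom_1)
    return dot(v1, cross(v2, v3))


def volume_sign(volume, tol=1e-9):
    """Return +1, -1 or 0; the tolerance is a hypothetical choice."""
    if volume > tol:
        return 1
    if volume < -tol:
        return -1
    return 0


# Unit axes around the origin give a right-handed arrangement.
right_handed = chiral_volume((0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1))
# Swapping two of the outer atoms flips the sign.
left_handed = chiral_volume((0, 0, 0), (1, 0, 0), (0, 0, 1), (0, 1, 0))
```

Exchanging any two of the three outer atoms changes the sign of the volume, which is why the order of the atoms in a chirality record matters.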
Categories to describe the type of links between two atoms of different monomers. They describe only the type of the link, the atom names and the monomer flags. The information about the names of the chains and monomers, and the serial numbers of the monomers is given in the CIFile.
The monomer flag designates the first or the second monomer in the category _entity_link_ having its value given in the CIFile.
    _entity_link_id
    _entity_link_entity_id_1
    _entity_link_mon_id_1
    _entity_link_set_num_1
    _entity_link_entity_id_2
    _entity_link_mon_id_2
    _entity_link_set_num_2
    SS Ach CYS 20 Cch CYS 40
The link "TRANS" is the default for polypeptide chains, "pd" is the default for DNA, and "pr" is the default for RNA.
Categories to describe the possible modifications of monomers: the 'function' (add, delete, change), and the atoms, bonds, angles, chirality, planarity that will be modified. The information about the names and serial numbers of monomers is described in the CIFile. For example:
    _entity_mod_id
    _entity_mod_entity_id
    _entity_mod_mon_id
    _entity_mod_set_num
    COO Cchain CYS 30
The modifications "NH3" and "COO" are the default for the polypeptide chain termini.
The modifications "d5*END" and "d3*END" are the default for DNA termini.
The modifications "r5*END" and "r3*END" are the default for RNA termini.
--- LIST OF MONOMERS ---

data_comp_list
loop_
_chem_comp.id
_chem_comp.three_letter_code
_chem_comp.name
_chem_comp.group
_chem_comp.number_atoms_all
_chem_comp.number_atoms_nh
. . . .
CYS CYS 'CYSTINE ' L-peptide 10 6
. . . .

--- DESCRIPTION OF MONOMERS ---

data_comp_CYS
loop_
_chem_comp_atom.comp_id
_chem_comp_atom.atom_id
_chem_comp_atom.type_symbol
_chem_comp_atom.type_energy
_chem_comp_atom.partial_charge
CYS N   N NH1  -0.204
CYS H   H HNH1  0.204
CYS CA  C CH1   0.058
CYS HA  H HCH1  0.046
CYS CB  C CH2  -0.096
CYS HB1 H HCH2  0.046
CYS HB2 H HCH2  0.058
CYS SG  S S     0.004
CYS C   C C     0.318
CYS O   O O    -0.422
loop_
_chem_comp_tree.comp_id
_chem_comp_tree.atom_id
_chem_comp_tree.atom_back
_chem_comp_tree.atom_forward
_chem_comp_tree.connect_type
CYS N   n/a CA START
CYS H   N   .  .
CYS CA  N   C  .
CYS HA  CA  .  .
CYS CB  CA  SG .
CYS HB1 CB  .  .
CYS HB2 CB  .  .
CYS SG  CB  .  .
CYS C   CA  .  END
CYS O   C   .  .
loop_
_chem_comp_bond.comp_id
_chem_comp_bond.atom_id_1
_chem_comp_bond.atom_id_2
_chem_comp_bond.type
_chem_comp_bond.value_dist
_chem_comp_bond.value_dist_esd
CYS N  H   coval 0.860 0.020
CYS N  CA  coval 1.458 0.019
CYS CA HA  coval 0.980 0.020
CYS CA CB  coval 1.530 0.020
CYS CB HB1 coval 0.970 0.020
CYS CB HB2 coval 0.970 0.020
CYS CB SG  coval 1.808 0.023
CYS CA C   coval 1.525 0.021
CYS C  O   coval 1.231 0.020
loop_
_chem_comp_angle.comp_id
_chem_comp_angle.atom_id_1
_chem_comp_angle.atom_id_2
_chem_comp_angle.atom_id_3
_chem_comp_angle.value_angle
_chem_comp_angle.value_angle_esd
CYS H   N  CA  114.000 3.000
CYS HA  CA CB  109.000 3.000
CYS CB  CA C   110.100 1.900
CYS HA  CA C   109.000 3.000
CYS N   CA HA  110.000 3.000
CYS N   CA CB  110.500 1.700
CYS HB1 CB HB2 110.000 3.000
CYS HB2 CB SG  108.000 3.000
CYS HB1 CB SG  108.000 3.000
CYS CA  CB HB1 109.000 3.000
CYS CA  CB HB2 109.000 3.000
CYS CA  CB SG  114.400 2.300
CYS N   CA C   111.200 2.800
CYS CA  C  O   120.800 1.700
loop_
_chem_comp_tor.comp_id
_chem_comp_tor.id
_chem_comp_tor.atom_id_1
_chem_comp_tor.atom_id_2
_chem_comp_tor.atom_id_3
_chem_comp_tor.atom_id_4
_chem_comp_tor.value_angle
_chem_comp_tor.value_angle_esd
_chem_comp_tor.period
CYS chi1 N CA CB SG 0.000 15.000 3
loop_
_chem_comp_chir.comp_id
_chem_comp_chir.id
_chem_comp_chir.atom_id_centre
_chem_comp_chir.atom_id_1
_chem_comp_chir.atom_id_2
_chem_comp_chir.atom_id_3
_chem_comp_chir.volume_sign
CYS chir_01 CA N CB C negativ
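The _chem_comp_tree records of the CYS example can be turned into an explicit tree by following the 'back atom' of each atom. The sketch below is illustrative only: the row data are copied from the example, while the helper names (build_tree, path_to_root) and the choice to keep forward branches in file order are assumptions.

```python
# (atom_id, atom_back, atom_forward, connect_type) rows of the CYS example.
CYS_TREE = [
    ("N", "n/a", "CA", "START"),
    ("H", "N", ".", "."),
    ("CA", "N", "C", "."),
    ("HA", "CA", ".", "."),
    ("CB", "CA", "SG", "."),
    ("HB1", "CB", ".", "."),
    ("HB2", "CB", ".", "."),
    ("SG", "CB", ".", "."),
    ("C", "CA", ".", "END"),
    ("O", "C", ".", "."),
]


def build_tree(rows):
    """Return (back_of, children) maps built from the back-atom column."""
    back_of = {atom: back for atom, back, _forward, _connect in rows}
    children = {atom: [] for atom in back_of}
    for atom, back in back_of.items():
        # An atom whose back atom is not in the monomer (here "n/a") is a root.
        if back in children:
            children[back].append(atom)
    return back_of, children


def path_to_root(atom, back_of):
    """Walk the back atoms from the given atom up to the tree root."""
    path = [atom]
    while back_of.get(path[-1]) in back_of:
        path.append(back_of[path[-1]])
    return path


back_of, children = build_tree(CYS_TREE)
sg_path = path_to_root("SG", back_of)
```

Here children["CA"] lists the three branches leaving CA, and sg_path runs from the side-chain sulphur back to the main-chain nitrogen.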
--- LIST OF MODIFICATIONS ---

data_mod_list
loop_
_chem_mod.id
_chem_mod.name
_chem_mod.comp_id
_chem_mod.group_id
. . . .
COO COO-terminus . peptide
. . . .

--- DESCRIPTION OF MODIFICATIONS ---

data_mod_COO
loop_
_chem_mod_atom.mod_id
_chem_mod_atom.function
_chem_mod_atom.atom_id
_chem_mod_atom.new_atom_id
_chem_mod_atom.new_type_symbol
_chem_mod_atom.new_type_energy
_chem_mod_atom.new_partial_charge
COO change C C   . C   0.340
COO change O O   . OC -0.350
COO add    . OXT O OC -0.350
loop_
_chem_mod_tree.mod_id
_chem_mod_tree.function
_chem_mod_tree.atom_id
_chem_mod_tree.atom_back
_chem_mod_tree.atom_forward
_chem_mod_tree.connect_type
COO add    OXT C .   END
COO change C   . OXT .
loop_
_chem_mod_bond.mod_id
_chem_mod_bond.function
_chem_mod_bond.atom_id_1
_chem_mod_bond.atom_id_2
_chem_mod_bond.new_type
_chem_mod_bond.new_value_dist
_chem_mod_bond.new_value_dist_esd
COO change C O   coval 1.231 0.020
COO add    C OXT coval 1.231 0.020
loop_
_chem_mod_angle.mod_id
_chem_mod_angle.function
_chem_mod_angle.atom_id_1
_chem_mod_angle.atom_id_2
_chem_mod_angle.atom_id_3
_chem_mod_angle.new_value_angle
_chem_mod_angle.new_value_angle_esd
COO change CA C O   121.000 3.000
COO add    CA C OXT 121.000 3.000
loop_
_chem_mod_tor.mod_id
_chem_mod_tor.function
_chem_mod_tor.id
_chem_mod_tor.atom_id_1
_chem_mod_tor.atom_id_2
_chem_mod_tor.atom_id_3
_chem_mod_tor.atom_id_4
_chem_mod_tor.new_value_angle
_chem_mod_tor.new_value_angle_esd
_chem_mod_tor.new_period
COO add psi N CA C OXT 160.00 30.0 2
loop_
_chem_mod_plane_atom.mod_id
_chem_mod_plane_atom.function
_chem_mod_plane_atom.plane_id
_chem_mod_plane_atom.atom_id
_chem_mod_plane_atom.new_dist_esd
COO add oxt C   0.020
COO add oxt CA  0.020
COO add oxt O   0.020
COO add oxt OXT 0.020
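As an illustration of how the 'function' column (add, delete, change) might be applied, the following sketch applies the _chem_mod_atom records of the COO modification to the main-chain C and O atoms of the CYS example. The function name apply_atom_mods is hypothetical, and treating "." in a new_* column as "keep the old value" is an assumption of this sketch rather than a documented rule.

```python
# atom_id -> (type_symbol, type_energy, partial_charge), from the CYS example.
CYS_ATOMS = {
    "C": ("C", "C", 0.318),
    "O": ("O", "O", -0.422),
}

# (function, atom_id, new_atom_id, new_type_symbol, new_type_energy,
#  new_partial_charge) rows of the COO modification.
COO_MODS = [
    ("change", "C", "C", ".", "C", 0.340),
    ("change", "O", "O", ".", "OC", -0.350),
    ("add", ".", "OXT", "O", "OC", -0.350),
]


def keep_or_replace(old, new):
    """Assumption: '.' means the old value is left unchanged."""
    return old if new == "." else new


def apply_atom_mods(atoms, mods):
    """Return a modified copy of the atom table; the input is not changed."""
    result = dict(atoms)
    for function, atom_id, new_id, symbol, energy, charge in mods:
        if function == "delete":
            result.pop(atom_id, None)
        elif function == "add":
            result[new_id] = (symbol, energy, charge)
        elif function == "change":
            old_symbol, old_energy, old_charge = result.pop(atom_id)
            result[keep_or_replace(atom_id, new_id)] = (
                keep_or_replace(old_symbol, symbol),
                keep_or_replace(old_energy, energy),
                keep_or_replace(old_charge, charge),
            )
        else:
            raise ValueError(f"unknown function: {function}")
    return result


modified = apply_atom_mods(CYS_ATOMS, COO_MODS)
```

After the call, modified holds the re-typed carboxylate oxygen and the added OXT atom, while CYS_ATOMS still describes the unmodified monomer.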
--- LIST OF LINKS ---

data_link_list
loop_
_chem_link.id
_chem_link.name
_chem_link.comp_id_1
_chem_link.mod_id_1
_chem_link.group_comp_1
_chem_link.comp_id_2
_chem_link.mod_id_2
_chem_link.group_comp_2
. . . .
SS    SS-bridge            CYS . .       CYS . .
TRANS default-peptide-link .   . peptide .   . peptide
. . . .

--- DESCRIPTION OF LINKS ---

data_link_SS
loop_
_chem_link_bond.link_id
_chem_link_bond.atom_1_comp_id
_chem_link_bond.atom_id_1
_chem_link_bond.atom_2_comp_id
_chem_link_bond.atom_id_2
_chem_link_bond.type
_chem_link_bond.value_dist
_chem_link_bond.value_dist_esd
SS 1 SG 2 SG disulf 2.031 0.020
loop_
_chem_link_angle.link_id
_chem_link_angle.atom_1_comp_id
_chem_link_angle.atom_id_1
_chem_link_angle.atom_2_comp_id
_chem_link_angle.atom_id_2
_chem_link_angle.atom_3_comp_id
_chem_link_angle.atom_id_3
_chem_link_angle.value_angle
_chem_link_angle.value_angle_esd
SS 1 CB 1 SG 2 SG 110.000 3.000
SS 1 SG 2 SG 2 CB 110.000 3.000
loop_
_chem_link_tor.link_id
_chem_link_tor.id
_chem_link_tor.atom_1_comp_id
_chem_link_tor.atom_id_1
_chem_link_tor.atom_2_comp_id
_chem_link_tor.atom_id_2
_chem_link_tor.atom_3_comp_id
_chem_link_tor.atom_id_3
_chem_link_tor.atom_4_comp_id
_chem_link_tor.atom_id_4
_chem_link_tor.value_angle
_chem_link_tor.value_angle_esd
_chem_link_tor.period
SS ss 1 CB 1 SG 2 SG 2 CB 90.00 10.0 2

data_link_TRANS
loop_
_chem_link_bond.link_id
_chem_link_bond.atom_1_comp_id
_chem_link_bond.atom_id_1
_chem_link_bond.atom_2_comp_id
_chem_link_bond.atom_id_2
_chem_link_bond.type
_chem_link_bond.value_dist
_chem_link_bond.value_dist_esd
TRANS 1 C 2 N coval 1.329 0.014
loop_
_chem_link_angle.link_id
_chem_link_angle.atom_1_comp_id
_chem_link_angle.atom_id_1
_chem_link_angle.atom_2_comp_id
_chem_link_angle.atom_id_2
_chem_link_angle.atom_3_comp_id
_chem_link_angle.atom_id_3
_chem_link_angle.value_angle
_chem_link_angle.value_angle_esd
TRANS 1 O  1 C 2 N  123.000 1.600
TRANS 1 CA 1 C 2 N  116.200 2.000
TRANS 1 C  2 N 2 H  124.300 3.000
TRANS 1 C  2 N 2 CA 121.700 1.800
loop_
_chem_link_tor.link_id
_chem_link_tor.id
_chem_link_tor.atom_1_comp_id
_chem_link_tor.atom_id_1
_chem_link_tor.atom_2_comp_id
_chem_link_tor.atom_id_2
_chem_link_tor.atom_3_comp_id
_chem_link_tor.atom_id_3
_chem_link_tor.atom_4_comp_id
_chem_link_tor.atom_id_4
_chem_link_tor.value_angle
_chem_link_tor.value_angle_esd
_chem_link_tor.period
TRANS psi   1 N  1 CA 1 C  2 N  160.00 30.0 2
TRANS omega 1 CA 1 C  2 N  2 CA 180.00 10.0 0
TRANS .     1 CA 1 C  2 N  2 H    0.00 10.0 0
TRANS phi   1 C  2 N  2 CA 2 C   60.00 20.0 3
loop_
_chem_link_plane.link_id
_chem_link_plane.plane_id
_chem_link_plane.atom_comp_id
_chem_link_plane.atom_id
_chem_link_plane.dist_esd
TRANS plane1 1 CA 0.02
TRANS plane1 1 C  0.02
TRANS plane1 1 O  0.02
TRANS plane1 2 N  0.02
TRANS plane2 1 C  0.02
TRANS plane2 2 N  0.02
TRANS plane2 2 CA 0.02
TRANS plane2 2 H  0.02
--------------------------------------------------- ener_lib.cif 4-APR-95 ---------------------------------------------------
----------- Description of atom type --------
HEADER C Carbon PI
ATOMTYPE CSP = with triple bond
HEADER C Carbon SP2
ATOMTYPE C = without hydrogen ( carbonyl C )
ATOMTYPE C1 = connected to 1 hydrogen
ATOMTYPE C2 = connected to 2 hydrogens
ATOMTYPE CR1 = between two pyrrole units
ATOMTYPE CR1H = CR1 connected to 1 hydrogen ( CHA of HEME )
ATOMTYPE CR15 = connected to 1 hydrogen in 5 atoms ring ( CE1 of HIS )
ATOMTYPE CR16 = connected to 1 hydrogen in 6 atoms ring ( CE1 of PHE )
ATOMTYPE CR6 = without hydrogen in 6 atoms ring
ATOMTYPE CR5 = without hydrogen in 5 atoms ring
ATOMTYPE CR56 = between two atoms in 5-6 rings ( CD2 CE2 of TRP )
ATOMTYPE CR55 = between two atoms in 5-5 rings
ATOMTYPE CR66 = between two atoms in 6-6 rings
HEADER C Carbon SP3
ATOMTYPE CH1 = connected to 1 hydrogen ( CA of most amino acids )
ATOMTYPE CH2 = connected to 2 hydrogens ( CB of most amino acids )
ATOMTYPE CH3 = connected to 3 hydrogens ( CD1 CD2 of LEUCINE )
ATOMTYPE CT = without hydrogen
HEADER H Hydrogen
ATOMTYPE HCH = hydrogen of aliphatic group
ATOMTYPE HCR = hydrogen of aromatic group
ATOMTYPE HNC1 = hydrogen connected to NC1
ATOMTYPE HNC2 = hydrogen connected to NC2
ATOMTYPE HNC3 = hydrogen connected to NC3
ATOMTYPE HNH1 = hydrogen connected to NH1
ATOMTYPE HNH2 = hydrogen connected to NH2
ATOMTYPE HNR5 = hydrogen connected to NR15
ATOMTYPE HNR6 = hydrogen connected to NR16
ATOMTYPE HOH1 = hydrogen connected to OH1
ATOMTYPE HOH2 = hydrogen of water
ATOMTYPE HSH1 = hydrogen of sulphur
HEADER N Nitrogen PI
ATOMTYPE NS = without hydrogen ( triple bond )
ATOMTYPE NS1 = connected to 1 hydrogen
HEADER N Nitrogen SP2
ATOMTYPE N = without hydrogen ( N of PRO )
ATOMTYPE NC1 = connected to 1 hyd. in a charged group ( NE of ARG )
ATOMTYPE NC2 = connected to 2 hyd. in a charged group ( NH2 of ARG )
ATOMTYPE NH1 = connected to 1 hydrogen ( N of main chain )
ATOMTYPE NH2 = connected to 2 hydrogens ( NE2 of GLN )
ATOMTYPE NPA = without hydrogen ( NA and NC of HEME )
ATOMTYPE NPB = without hydrogen ( NB and ND of HEME )
ATOMTYPE NRD5 = without hydrogen but with electronic doublet in 5 atoms ring
ATOMTYPE NRD6 = without hydrogen but with electronic doublet in 6 atoms ring
ATOMTYPE NR15 = connected to 1 hyd. in 5 atoms ring ( ND1 of HIS )
ATOMTYPE NR16 = connected to 1 hyd. in 6 atoms ring
ATOMTYPE NR5 = connected to 3 atoms in 5 atoms ring ( N9 of ADE )
ATOMTYPE NR6 = connected to 3 atoms in 6 atoms ring ( N1 of CYT )
HEADER N Nitrogen SP3
ATOMTYPE NT = without hydrogen
ATOMTYPE NT1 = connected to 1 hydrogen
ATOMTYPE NT2 = connected to 2 hydrogens
ATOMTYPE NT3 = connected to 3 hydrogens
HEADER O Oxygen SP2
ATOMTYPE O = without NET charge ( O of main chain )
ATOMTYPE OC = with a NET charge ( OE1 OE2 of GLU )
ATOMTYPE OP = with a NET charge connected to P ( O1P of phosphate group )
ATOMTYPE OS = with a NET charge connected to S ( O1 of sulphate group )
ATOMTYPE OB = with a NET charge connected to B
HEADER O Oxygen SP3
ATOMTYPE O2 = connected to 2 atoms ( O4' of ribose )
ATOMTYPE OC2 = with a NET charge connected to 2 ATOMS ( O3' of ribose )
ATOMTYPE OH1 = oxygen of alcohol groups ( OG1 of THR )
ATOMTYPE OH2 = oxygen of water
ATOMTYPE OHA = oxygen of water in MO6
ATOMTYPE OHB = oxygen of water in MO6
ATOMTYPE OHC = oxygen of water in MO6
HEADER S Sulphur
ATOMTYPE S = sulphur without hydrogen
ATOMTYPE SH1 = sulphur with a hydrogen ( SG of CYS )
HEADER Fe
ATOMTYPE FE = iron
HEADER P
ATOMTYPE P = phosphorus
HEADER Zn
ATOMTYPE ZN = zinc
END

--- ATOM ---
loop_
_lib_atom.type
_lib_atom.weight
_lib_atom.hb_type
_lib_atom.vdw_radius
_lib_atom.vdwh_radius
_lib_atom.ion_radius
_lib_atom.element
_lib_atom.valency

_type         atomic chemical type
_weight       atomic weight
_hb_type      donor/acceptor type: N=neither D=donor A=acceptor B=both H=hydrogen candidate to hydrogen bonding
_vdw_radius   Van der Waals radius
_vdwh_radius  Van der Waals radius for atom+H

Ionic radii for most of the atoms without hydrogens are from: WebElements, or Chemistry of the Elements by Greenwood and Earnshaw.
VDW radii of carbon atoms with hydrogen have been taken from Li and Nussinov, Proteins, 32, 111-127 (1998).

CSP 12.01150 N 1.700 1.700 . C 4
C   12.01150 N 1.700 1.750 . C 4
C1  12.01150 N 1.700 1.820 . C 4
C2  12.01150 N 1.700 1.800 . C 4
. . . . .
--- BONDS ---
loop_
_lib_bond.atom_type_1
_lib_bond.atom_type_2
_lib_bond.type
_lib_bond.const
_lib_bond.length

_atom_type  atomic chemical type
_const      constant KBOND
_value      equilibrium length of this bond BOND0
            BOND - actual bond length

ENERGY = KBOND * ( BOND - BOND0 )**2

Values for bond distances and sigmas are from (where it is possible):
International Tables for Crystallography, Volume C, 1992, edited by AJC Wilson. Published for IUCr by Kluwer Academic Publishers, Dordrecht/Boston/London.
For carbon-carbon etc. Section: Typical interatomic distances: Organic compounds. Authors: FH Allen, O Kennard, DG Watson, L Brammer, AG Orpen and R Taylor, pages 685-706.
For metal radii and distances. Section: Typical interatomic distances: Organometallic compounds and coordination complexes of the d- and f-block metals. Authors: AG Orpen, L Brammer, FH Allen, O Kennard, DG Watson and R Taylor, pages 707-791.

C C single 420.0 1.550 0.025
C C double 420.0 1.330 0.020
. . . . .
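The harmonic bond term above is simple to evaluate. The sketch below uses the "C C single" entry (KBOND = 420.0, BOND0 = 1.550); the function name bond_energy and the trial bond length of 1.60 are assumptions chosen for illustration, and the units of the result follow whatever units KBOND is expressed in.

```python
def bond_energy(bond, bond0, k_bond):
    """ENERGY = KBOND * ( BOND - BOND0 )**2."""
    return k_bond * (bond - bond0) ** 2


# "C C single 420.0 1.550" from the table; 1.60 is a hypothetical actual length.
energy = bond_energy(1.60, 1.550, 420.0)
```

A deviation of 0.05 from the equilibrium length therefore costs about 420.0 * 0.0025 = 1.05 energy units, and the term vanishes when the bond has its equilibrium length.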
--- ANGLES ---
loop_
_lib_angle.atom_type_1
_lib_angle.atom_type_2
_lib_angle.atom_type_3
_lib_angle.const
_lib_angle.value

_atom_type  atomic chemical type
_const      constant KTHETA
_value      equilibrium value for the angle THETA0
            THETA - actual angle

ENERGY = KTHETA * ( THETA - THETA0 )**2

NS CSP CH3 . 180.000
NS CSP SH1 . 180.000
. . . . . . .
--- TORSIONS ---
loop_
_lib_tors.atom_type_1
_lib_tors.atom_type_2
_lib_tors.atom_type_3
_lib_tors.atom_type_4
_lib_tors.label
_lib_tors.const
_lib_tors.angle
_lib_tors.period

_atom_type  atomic chemical type
_const      constant KPHI
_period     NPH number of minima in the function
_angle      target angle DELTA
            PHI - actual angle

ENERGY = KPHI * ( 1 - COS( NPH * (PHI - DELTA) ) )

. CH1 NH1 . . 0.000   0.000 3
. C   CH1 . . 0.000 180.000 3
. . . . . . . .
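The periodic torsion term can be checked numerically. The sketch below assumes that PHI and DELTA are given in degrees, as the tabulated values suggest, and uses a hypothetical KPHI of 1.0 because the two rows shown carry a constant of 0.000; the function name torsion_energy is likewise an assumption.

```python
import math


def torsion_energy(phi_deg, k_phi, delta_deg, n_period):
    """ENERGY = KPHI * ( 1 - COS( NPH * (PHI - DELTA) ) ), angles in degrees."""
    return k_phi * (1.0 - math.cos(math.radians(n_period * (phi_deg - delta_deg))))


# Target angle 180 with period 3, as in the ". C CH1 . ." row; KPHI is hypothetical.
at_target = torsion_energy(180.0, 1.0, 180.0, 3)
at_other_minimum = torsion_energy(60.0, 1.0, 180.0, 3)
at_barrier = torsion_energy(0.0, 1.0, 180.0, 3)
```

With period 3 the function has three equivalent minima 120 degrees apart (here at 180, 300 and 60), and the barrier between them has height 2 * KPHI.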
--- VDW contacts ---
loop_
_lib_vdw.atom_type_1
_lib_vdw.atom_type_2
_lib_vdw.energy_min
_lib_vdw.radius_min
_lib_vdw.H_flag

_atom_type   atomic chemical type
_energy_min  EPSij minimum of energy parameter
_radius_min  Rmin radius of the minimum of energy parameter
             Rij - actual distance
_H_flag      "h" - the parameters for atoms with hydrogens

Lennard-Jones potential
ENERGY = EPSij * ( (Rmin/Rij)**12 - 2 * (Rmin/Rij)**6 )

C C -0.19686 3.600 .
C C -0.19686 3.600 h
. . . . .
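A small numerical sketch of the Lennard-Jones term follows. Note one assumption: the tabulated energy_min values are negative, so this example reads them as the energy at the minimum and uses their magnitude as the well depth, which makes the energy at Rij = Rmin equal to the tabulated value. That reading, and the function name vdw_energy, are assumptions of this sketch rather than statements about how any program evaluates the term.

```python
def vdw_energy(r_ij, energy_min, radius_min):
    """12-6 Lennard-Jones term with |energy_min| used as the well depth."""
    depth = abs(energy_min)  # assumption: the table stores the (negative) minimum energy
    ratio6 = (radius_min / r_ij) ** 6
    return depth * (ratio6 * ratio6 - 2.0 * ratio6)


# "C C -0.19686 3.600" from the table; the distances are hypothetical.
at_minimum = vdw_energy(3.600, -0.19686, 3.600)
too_close = vdw_energy(3.000, -0.19686, 3.600)
far_apart = vdw_energy(10.000, -0.19686, 3.600)
```

Under this assumption the energy equals -0.19686 at the minimum, becomes repulsive (positive) at short distances, and tends to zero from below as the atoms move apart.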
--- H-BONDS ---
loop_
_lib_hbond.atom_type_1
_lib_hbond.atom_type_2
_lib_hbond.min
_lib_hbond.dist

_atom_type   atomic chemical type
_hbond_min   EPSHij energy at minimum
_hbond_dist  RHmin distance at minimum of energy
             RHij - actual distance

ENERGY = EPSHij * ( (RHmin/RHij)**12 - 2 * (RHmin/RHij)**10 )

NRD5 NH1 -1.500 2.850
NRD5 NT3 -1.500 2.850
. . . . . .
- reads the library of monomers and gives information about a given monomer
- creates PostScript file with picture and information about bonds, angles, ...
- can read additional library, combine two libraries and write to a new library file
- can create a description of a new monomer by reading coordinates from a PDB file or CIFile
Authors: A.Vagin and G.Murshudov, E.Dodson, K.Henrick, J.Richelle, S.Wodak.