FindGeo Help

Introduction

FindGeo determines the coordination geometry of metals in complexes and proteins for which a PDB structure is available. It works by finding the best superposition between the atoms that coordinate the metal in the input structure and the corresponding atoms in structural templates with idealized geometries. The RMSD calculated over the superimposed structures quantifies the similarity between the configuration of the coordinating ligands and each geometry possible for the metal coordination number. The candidate geometries are then ranked by RMSD to identify the best geometry assignment.
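
As a minimal sketch of the scoring step, the Python fragment below computes per-atom deviations and a total RMSD for coordinates that have already been superimposed, then picks the geometry with the lowest RMSD. The superposition itself is not shown. The coordinates and the RMSD values for tri and spv are copied from the example files in “Output formats”; the function names are illustrative and not part of FindGeo. Averaging over the coordinating atoms only (leaving out the metal) is an assumption; it is the choice that reproduces the total of 0.281 reported in that example.

```python
import math

# Ideal "tev" template and fitted site, copied from the example [geometry].out
# file in "Output formats". The first row is the metal, at the origin in both.
TEMPLATE = [(0.000, 0.000, 0.000), (1.732, -1.732, -1.732),
            (-1.732, -1.732, 1.732), (1.732, 1.732, 1.732)]
FITTED = [(0.000, 0.000, 0.000), (1.744, -1.759, -1.692),
          (-1.550, -1.617, 1.996), (1.607, 1.555, 2.000)]


def per_atom_deviation(template, fitted):
    """Distance between each fitted atom and its template position."""
    return [math.dist(t, f) for t, f in zip(template, fitted)]


def total_rmsd(template, fitted):
    """Root-mean-square deviation over the coordinating atoms.

    Assumption: the metal (first row) is left out of the average.
    """
    deviations = per_atom_deviation(template, fitted)[1:]
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


def best_geometry(rmsd_by_geometry):
    """Return the geometry code with the lowest RMSD."""
    return min(rmsd_by_geometry, key=rmsd_by_geometry.get)


tev_rmsd = total_rmsd(TEMPLATE, FITTED)
# RMSD values for tri and spv are quoted from the example findgeo.out file.
scores = {"tri": 1.186, "tev": tev_rmsd, "spv": 1.454}
print(f"tev RMSD: {tev_rmsd:.3f}")
print("best geometry:", best_geometry(scores))
```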

Quick start

In this short guide, numbers in parentheses refer to red boxes in the Figures.

  1. Specify the input structure
    The structure can be input in either of two ways:
      • Enter a PDB code into the “PDB Code” box (1). The structure will be automatically retrieved from the PDB.
      • Select a PDB file on your local disk, either by entering the file name into the “PDB File” box or by browsing the disk (2).
  2. Search for metals
    Click on the “Search for metals” button (3). FindGeo will automatically identify all the metals present in the input structure and, for each metal, the atoms that coordinate it. Before clicking on the button, you can optionally:
      • Change the default value of 2.8 Å in the “Threshold coordination distance” box (4). This is the distance from the metal below which atoms are considered to coordinate the metal.
      • Change the default list of atoms in the “Excluded donor atoms” box (5). Atom types in this list are never identified as coordinating the metal, whatever their distance from it.
      • Enter the chemical symbol of a specific metal into the “Metal of interest (optional)” box (6). Only that metal will be searched for in the input structure.
  3. Run the calculation
    Select the metals for which you want to determine the coordination geometry by checking the corresponding tick boxes in the “Select” column (7). One or more metals can be selected. You can also select all the metals in the list by checking the tick box next to the “Select” column header (8). To help in metal selection, you can:
      • Sort the list alphabetically, either by metal (using the arrows next to the “Metal” column header) (9) or by coordinating atoms (using the arrows next to the “Ligands” column header) (10).
      • Filter the list to show only items that match a specific string by entering that string into the “Search” box (11). For example, to see only metals coordinated by a histidine, enter “HIS”.
    Then click on the “Run” button (12).
  4. See (and download) the results
    For each metal, FindGeo reports the best geometry assignment (in the “Geometry” column) (13), a ball-and-stick representation of the assigned geometry (“GeoImage” column) (14), and the RMSD between the input structure and the structural template with idealized geometry that produced the best superposition (15). You can sort the results by metal (using the arrows next to the “Metal” column header) (16), by coordinating atoms (using the arrows next to the “Ligands” column header) (17), or by RMSD value (using the arrows next to the “RMSD” column header) (18). Also, you can filter the results to show only items that match a specific string by entering that string into the “Search” box (19). For each metal, you can download output files (see “Output formats” for more details) by clicking on the arrow in the “Download” column (20).
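
The metal search in step 2 can be pictured with the following Python sketch, which keeps every atom closer to the metal than the threshold distance unless its element is in the excluded list. It is an illustration of the two settings, not FindGeo's actual code. The zinc and histidine coordinates come from the example findgeo.input file in “Output formats”; the two extra atoms are made up to show the filters at work, and the strict “less than” comparison is an assumption.

```python
import math

# (element, label, (x, y, z)). The zinc and the three histidine nitrogens are
# taken from the example findgeo.input file in "Output formats"; the last two
# atoms are made up purely to exercise the exclusion and distance filters.
METAL = ("ZN", "ZN 262", (-6.677, -1.626, 15.471))
ATOMS = [
    ("N", "NE2 HIS 94", (-7.486, -0.334, 13.493)),
    ("N", "ND1 HIS 119", (-8.295, -2.377, 16.601)),
    ("N", "NE2 HIS 96", (-5.890, -3.503, 14.906)),
    ("C", "made-up carbon", (-4.677, -1.626, 15.471)),  # about 2.0 Å away
    ("O", "made-up oxygen", (-6.677, -1.626, 18.971)),  # about 3.5 Å away
]


def find_donors(metal, atoms, threshold=2.8, excluded=("C", "H")):
    """Return (label, distance) for atoms that coordinate the metal.

    An atom is kept if its element is not excluded and its distance from
    the metal is below the threshold (strict comparison is an assumption).
    """
    excluded_set = {element.upper() for element in excluded}
    donors = []
    for element, label, xyz in atoms:
        if element.upper() in excluded_set:
            continue
        distance = math.dist(metal[2], xyz)
        if distance < threshold:
            donors.append((label, round(distance, 3)))
    return donors


for label, distance in find_donors(METAL, ATOMS):
    print(f"{label}: {distance} Å")
```

With the defaults, the made-up carbon is dropped because of its element and the made-up oxygen because of its distance, leaving the three histidine nitrogens.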


Output formats

The output files produced by FindGeo for each metal are:

  • A PDB file called findgeo.input with the coordinates of the metal and of the atoms that coordinate the metal, as extracted from the input PDB file. In this file the metal is always in the first row, and the atoms that coordinate the metal are listed according to their order in the input PDB file. An example of this file is as follows:

    HETATM 2029 ZN    ZN A 262      -6.677  -1.626  15.471  1.00  5.67          ZN
    ATOM    722  NE2 HIS A  94      -7.486  -0.334  13.493  1.00  5.58           N
    ATOM    920  ND1 HIS A 119      -8.295  -2.377  16.601  1.00  4.43           N
    ATOM    743  NE2 HIS A  96      -5.890  -3.503  14.906  1.00  4.07           N
    
  • A text file called findgeo.out with a summary of the results, showing the RMSD calculated for all the geometries tested for the metal and the best geometry assignment. An example of this file is as follows:
    ------------------------------------------------------------------------------------------------
    Geometries tested                                                            |      Tag|    RMSD
    ------------------------------------------------------------------------------------------------
    tri - TRIGONAL PLANE                                                         |Irregular|   1.186
    tev - TETRAHEDRON WITH A VACANCY                                             |  Regular|   0.281
    spv - SQUARE PLANE WITH A VACANCY                                            |Irregular|   1.454
    ------------------------------------------------------------------------------------------------
    Best geometry of the site: tev (regular)
    
  • Three files for each tested geometry, called (i) [geometry].out, (ii) [geometry].pdb, and (iii) [geometry]_orig.pdb, where [geometry] is a three-letter code (see “The FindGeo library of geometries” below for more details).
    • The [geometry].out file contains the coordinates of the ideal structural template for [geometry], and the coordinates of the metal and of the atoms that coordinate the metal after superposition on the ideal template. For the purpose of superposition, the distances between the metal and the atoms that coordinate the metal in the input structure are all set to 3.0 Å. In addition, the RMSD value for each atom and the total RMSD are reported. An example of this file is as follows:
      -----------------------------------------------
      Coordinates of template with ideal tev geometry
      -----------------------------------------------
         0.000   0.000   0.000
         1.732  -1.732  -1.732
        -1.732  -1.732   1.732
         1.732   1.732   1.732
      ---------------------------------------------
      Coordinates of the fitted metal site and RMSD
      ---------------------------------------------
         0.000   0.000   0.000   0.000
         1.744  -1.759  -1.692   0.050
        -1.550  -1.617   1.996   0.340
         1.607   1.555   2.000   0.345
      --------------------
      Total RMSD:   0.281
      --------------------
      
    • The [geometry].pdb file is a PDB file with the input structure superimposed on the ideal structural template for [geometry]. For the purpose of superposition, the distances between the metal and the atoms that coordinate the metal in the input structure are all set to 3.0 Å. An example of this file is as follows:
      ATOM      1  M   TEV     1       0.000   0.000   0.000
      ATOM      2  L   TEV     1       1.732  -1.732  -1.732
      ATOM      3  L   TEV     1      -1.732  -1.732   1.732
      ATOM      4  L   TEV     1       1.732   1.732   1.732
      TER
      ATOM   2029 ZN    ZN A 262       0.000   0.000   0.000
      ATOM    722  NE2 HIS A  94       1.744  -1.759  -1.692
      ATOM    920  ND1 HIS A 119      -1.550  -1.617   1.996
      ATOM    743  NE2 HIS A  96       1.607   1.555   2.000
      
    • The [geometry]_orig.pdb file is, like the [geometry].pdb file, a PDB file with the input structure superimposed on the ideal structural template for [geometry]. Unlike in the [geometry].pdb file, the distances between the metal and the atoms that coordinate the metal are transformed back to their values in the input structure. An example of this file is as follows:
      ATOM      1  M   TEV     1       0.000   0.000   0.000
      ATOM      2  L   TEV     1       1.732  -1.732  -1.732
      ATOM      3  L   TEV     1      -1.732  -1.732   1.732
      ATOM      4  L   TEV     1       1.732   1.732   1.732
      TER
      ATOM   2029 ZN    ZN A 262       0.000   0.000   0.000
      ATOM    722  NE2 HIS A  94       1.452  -1.464  -1.409
      ATOM    920  ND1 HIS A 119      -1.091  -1.138   1.405
      ATOM    743  NE2 HIS A  96       1.131   1.095   1.408
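
The relation between the [geometry].pdb and [geometry]_orig.pdb examples above can be checked with a short Python sketch. Assuming the metal sits at the origin (as it does in both example files), setting a metal-ligand distance to 3.0 Å amounts to scaling the ligand position along the metal-ligand direction. The function name is illustrative, and this is a reading of the example numbers rather than FindGeo's actual code.

```python
import math

REFERENCE_DISTANCE = 3.0  # Å, the value stated in the text above

# Ligand coordinates copied from the example [geometry]_orig.pdb file
# (distances as in the input structure) and [geometry].pdb file
# (distances set to 3.0 Å). The metal is at the origin in both.
ORIG = [(1.452, -1.464, -1.409), (-1.091, -1.138, 1.405), (1.131, 1.095, 1.408)]
NORMALIZED = [(1.744, -1.759, -1.692), (-1.550, -1.617, 1.996), (1.607, 1.555, 2.000)]


def rescale(ligand_xyz, distance=REFERENCE_DISTANCE):
    """Move a ligand along the metal-ligand direction to the given distance.

    The metal is assumed to sit at the origin, as in the example files.
    """
    norm = math.hypot(*ligand_xyz)
    if norm == 0.0:
        raise ValueError("ligand coincides with the metal")
    return tuple(c * distance / norm for c in ligand_xyz)


for orig, normalized in zip(ORIG, NORMALIZED):
    original_distance = math.hypot(*orig)
    forward = rescale(orig)                        # input distance -> 3.0 Å
    back = rescale(normalized, original_distance)  # 3.0 Å -> input distance
    print(f"{original_distance:.3f} Å:",
          " ".join(f"{c:8.3f}" for c in forward), "|",
          " ".join(f"{c:8.3f}" for c in back))
```

Up to rounding in the last printed digit, the forward values match the [geometry].pdb example and the back-transformed values match the [geometry]_orig.pdb example.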
      
The FindGeo library of geometries
The library of FindGeo includes a total of 36 ideal coordination geometries, which are shown below. Each geometry is identified by a three-letter code.

CN  Name  Description
--  ----  -----------
2   lin   Linear
2   trv   Trigonal plane with a vacancy
3   tri   Trigonal plane
3   tev   Tetrahedron with a vacancy
3   spv   Square plane with a vacancy
4   tet   Tetrahedron
4   spl   Square plane
4   bva   Trigonal bipyramid with a vacancy (axial)
4   bvp   Trigonal bipyramid with a vacancy (equatorial)
4   pyv   Square pyramid with a vacancy (equatorial)
5   spy   Square pyramid
5   tbp   Trigonal bipyramid
5   tpv   Trigonal prism with a vacancy
6   oct   Octahedron
6   tpr   Trigonal prism
6   pva   Pentagonal bipyramid with a vacancy (axial)
6   pvp   Pentagonal bipyramid with a vacancy (equatorial)
6   cof   Octahedron, face monocapped with a vacancy (capped face)
6   con   Octahedron, face monocapped with a vacancy (non-capped face)
6   ctf   Trigonal prism, square-face monocapped with a vacancy (capped face)
6   ctn   Trigonal prism, square-face monocapped with a vacancy (non-capped edge)
7   pbp   Pentagonal bipyramid
7   coc   Octahedron, face monocapped
7   ctp   Trigonal prism, square-face monocapped
7   hva   Hexagonal bipyramid with a vacancy (axial)
7   hvp   Hexagonal bipyramid with a vacancy (equatorial)
7   cuv   Cube with a vacancy
7   sav   Square antiprism with a vacancy
8   hbp   Hexagonal bipyramid
8   cub   Cube
8   sqa   Square antiprism
8   boc   Octahedron, trans-bicapped
8   bts   Trigonal prism, square-face bicapped
8   btt   Trigonal prism, triangular-face bicapped
9   ttp   Trigonal prism, square-face tricapped
9   csa   Square antiprism, square-face monocapped
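
For scripting, the library can be written down as a lookup from coordination number (CN) to geometry codes, as in the Python sketch below. The codes and CN values are taken from the list above; the dictionary and function names are illustrative and not part of FindGeo. Selecting candidate templates by the number of coordinating atoms is consistent with the findgeo.out example in “Output formats”, where a metal with three coordinating atoms is tested against tri, tev and spv.

```python
# Geometry codes grouped by coordination number (CN), as listed above.
GEOMETRIES_BY_CN = {
    2: ["lin", "trv"],
    3: ["tri", "tev", "spv"],
    4: ["tet", "spl", "bva", "bvp", "pyv"],
    5: ["spy", "tbp", "tpv"],
    6: ["oct", "tpr", "pva", "pvp", "cof", "con", "ctf", "ctn"],
    7: ["pbp", "coc", "ctp", "hva", "hvp", "cuv", "sav"],
    8: ["hbp", "cub", "sqa", "boc", "bts", "btt"],
    9: ["ttp", "csa"],
}


def candidate_geometries(n_coordinating_atoms):
    """Return the geometry codes available for a given number of donors."""
    return GEOMETRIES_BY_CN.get(n_coordinating_atoms, [])


print(candidate_geometries(3))
print(sum(len(codes) for codes in GEOMETRIES_BY_CN.values()), "geometries in total")
```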

Usage of the stand-alone version

FindGeo is also available as a stand-alone version, which has all the functionality of the web version. It produces the same output files described above (see “Output formats”), placed in subdirectories that are each named after the identifier of the metal they refer to (e.g., ZN_262_2041_A). In addition, an output file called FindGeo.summary is produced, which lists the coordination geometries for all the metals identified in the input structure.
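
As an illustration of working with this layout, the Python sketch below collects the best geometry of each metal by reading the findgeo.out file in every subdirectory of the working directory. It assumes that each per-metal subdirectory contains a findgeo.out file ending with a “Best geometry of the site:” line, as in the example in “Output formats”. The function name is hypothetical. The format of FindGeo.summary is not described here, so the sketch does not attempt to read it.

```python
from pathlib import Path

PREFIX = "Best geometry of the site:"


def collect_best_geometries(workdir):
    """Map each metal subdirectory name to its best geometry assignment.

    Assumption: every per-metal subdirectory holds a findgeo.out file
    laid out like the example in "Output formats".
    """
    results = {}
    for out_file in sorted(Path(workdir).glob("*/findgeo.out")):
        for line in out_file.read_text().splitlines():
            line = line.strip()
            if line.startswith(PREFIX):
                results[out_file.parent.name] = line[len(PREFIX):].strip()
    return results


if __name__ == "__main__":
    # Scan the current directory; prints nothing if no results are present.
    for site, geometry in collect_best_geometries(".").items():
        print(site, geometry)
```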


Synopsis
python findgeo.py [options] -p <pdbfile>    (to use a local PDB file as input)
python findgeo.py [options] -c <pdbcode>    (to download a PDB file from the PDB website)

Options
-h  --help             Help; print a brief reminder of command-line usage and all available options.
-w  --wdir             Specify the directory where the input PDB file is found (when the -p option is used) or is to be
                       downloaded (when the -c option is used), and where the output is to be placed. Default is ./.
-t  --threshold        Specify the threshold distance for metal coordination by donor atoms. Default is 2.8 Å.
-e  --excluded_donors  Specify the chemical symbol(s) of the atom(s), separated by commas, that must not be taken into
                       account for metal coordination. Default is C and H.
-m  --metal            Specify the chemical symbol of the metal whose coordination geometry is to be determined.
                       Default is all metals.
-o  --overwrite        Overwrite existing files and directories with the same names as those generated by the program.
                       By default, this option is not active and the program stops when it would create files or
                       directories that already exist.
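
When FindGeo is driven from another script, the command line can be assembled as in the Python sketch below. Only the options listed above are used. The helper names, the file name input.pdb, the argument values, and the upper-case spelling of the metal symbol are placeholders chosen for illustration. The sketch also assumes that findgeo.py is in the current directory and that the python command on your system is the interpreter FindGeo expects.

```python
import subprocess


def build_findgeo_command(pdb_file=None, pdb_code=None, wdir=None,
                          threshold=None, excluded_donors=None,
                          metal=None, overwrite=False):
    """Build the argument list for one FindGeo run (helper is illustrative)."""
    if (pdb_file is None) == (pdb_code is None):
        raise ValueError("give exactly one of pdb_file (-p) or pdb_code (-c)")
    cmd = ["python", "findgeo.py"]
    if wdir is not None:
        cmd += ["-w", str(wdir)]
    if threshold is not None:
        cmd += ["-t", str(threshold)]
    if excluded_donors:
        # Chemical symbols are passed as one comma-separated value.
        cmd += ["-e", ",".join(excluded_donors)]
    if metal is not None:
        cmd += ["-m", metal]
    if overwrite:
        cmd.append("-o")
    cmd += ["-p", pdb_file] if pdb_file is not None else ["-c", pdb_code]
    return cmd


def run_findgeo(cmd):
    """Run FindGeo and raise an error if it exits with a non-zero status."""
    subprocess.run(cmd, check=True)


cmd = build_findgeo_command(pdb_file="input.pdb", threshold=2.5,
                            excluded_donors=["C", "H"], metal="ZN",
                            overwrite=True)
print(" ".join(cmd))
```

Calling run_findgeo(cmd) would then start the calculation; it is kept separate so that the command can be inspected first.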

Reference
More detailed documentation can be found in the article where FindGeo is described:

Andreini C, Cavallaro G, Lorenzini S.
FindGeo: a tool for determining metal coordination geometry.
Bioinformatics 28, 1658-1660 (2012). [PMID: 22556364]

Please cite this article to reference FindGeo in publications.