# Computational Chemistry/Geometry optimization

Previous chapter - Semiempirical quantum chemistry

### Geometry Optimization

Important features to notice are that the potential is steeper on the inner side and shallower especially as it moves out towards dissociation. This concept is extended to polyatomic molecules where we have a potential energy surface in $n$ dimensions. ($n$ is 3N-6 where N is the number of atoms. The -6 comes from the translations and rotations of the molecule, the trivial vibrations referred to in the MOPAC printouts.) The bottom of the potential well is at the equilibrium bond length. In this region the potential function resembles a parabola $E={\frac {1}{2}}fr^{2}+c$ Optimization of the geometry of a diatomic is trivial if we can calculate the total energy we follow it downwards as a function of varying $r$ by Newton like procedures until we have points either side of the minimum. From this position we can keep bisecting the interval and using quadratic interpolation until the required accuracy is reached.

A vast improvement to this procedure has come from being able to differentiate the Hamiltonian with respect to the 3N nuclear position coordinates, ( $x,y,$ and $z$ on each atomic centre ). All the modern quantum chemistry programs now use this technique. Newton-Raphson like methods are used to descend down the surface and old gradients are remembered to refine the optimisation. Once in the quadratic region swift convergence to any accuracy is assured.

### Cartesian Coordinates and Z-Matrices

The boring problem of molecular geometries.

We have talked so far about geometries assuming they were expressed as matrices of the $x,y,z$ coordinates of the individual atoms. Sometimes a more useful and efficient way is to express the same unique geometrical arrangement as the bond lengths and bond angles of the molecule. This has factored out the 6 degrees of freedom corresponding to rotation and translation of the molecule.

### Practical Ways of Making Molecular Geometries

To write out the $z$ -matrix for a medium sized molecule by inspection is a possible but arduous task. To make the cartesian coordinates without computerized assistance is impossible. In practice one uses a molecular modelling system like the MACROMODEL ( Columbia University, New York ), available on UNIX systems. The molecular modelling system uses one of several standard force-fields to refine the built-up geometry. (Notice the terminology force-fields for classical calculations like MM2, hamiltonians for quantum methods like AM1 (Austin Method-1).)

The force-field calculation has all the connectivity of the molecule established by the modelling system so generating the matrices involves walking the tree of connectivities in the most chemically sensible way picking off the leaves, (Hydrogen atoms), with their dihedral angles as $z$ -matrix entries. In computing terminology trees grow downwards not up. The nodes of a tree are usually the atoms. Terminal nodes are called leaves.

#### n-pentane in Tree Form

                                        C
. . . .
.   .   .   .
C     H    H   H
. . .
.   .   .
C    H    H
. . .
.   .   .
C     H    H
. . .
.   .   .
C    H    H
. . .
.   .   .
H    H    H


The theoretical study of the connectivity of chemistry is called Chemical Graph Theory and is important in areas such as computer assisted chemical synthesis and bibliographic searching.

#### Polar Coordinates

                          z
|
|
|               * (atom)i
| theta(i)    . .
|           .   .
|))       .     .
|  )    . r(i)  .
|   ) .         .
|   .           .
| .             .
0------------------------y
/   .            .
/     ( .         .
/(   ((     .      .
/  (((          .   .
/   phi(i)          ..
/                     .
/
/
/
/
x


Notice that there is a handedness to the axis system, following the right hand rule. The $z$ -axis, (the principal axis), is the thumb, first finger gives $x$ , second finger $y$ . This convention is best adhered to as though left handed coordinate systems will work if consistent great trouble can be caused by stereochemistry, (proteins are made of only L-amino acids|).

#### Carbon Dioxide

                      :z
:
:
:
00             CC           00
0  0 ----------C  ----------0  0  ----------x
0  0 ----------C  ----------0  0
00 1           CC1          00 2
/
/
/
/
y


By inspection the cartesian coordinates can be written down

xyz
C000
O1r00
O2-r00

The symmetry point group is $D_{\infty h}$ but in quantum chemistry calculations often the algorithms require the use of Abelian, (~non degenerate~), groups only so $D_{2}h$ is used for carbon dioxide. ($D_{2}h$ is the ab-initio quantum chemist's favourite group as it has 8 irreducible representations making the calculation much smaller than the more common $C_{s}$ or $C_{2}v$ groups, e.g. Quinone with only two unique heavy atoms. The principal axis of rotation is almost always defined as the $z$ -axis.

#### Methyl Chloride

In methyl chloride the C-Cl bond is 1.8 angstroms the C-H bond 1.094 angstroms and the CL-C-H angle is 110.6667 degrees. in the diagram the Z-axis is along the C-CL bond.

                       CL
:
:
:
:
:
C --------------------y
/=.
/ = .
//  =  ..            (x out of page)
//   =   . .
/ /    =    . .
/ /    H(3)   . .
/-/             .-.
H(1)             H(2)


The obvious atom to make the origin is C. Cl will then be 1.8 up the $z$ -axis. All the hydrogens will have the $z$ -coordinate $1.094~*~\cos ~110.67$ i.e. $-~(1.094~*~\cos ~69.33)$ .

H(3) has a $y$ coordinate zero and an $x$ coordinate negative, (that is in a right-handed cartesian framework, see the Polar Coordinates diagram).

It is best always to use the conventional right handed axes. The memory trick is to think of a sheet of graph paper, $x$ is across as usual and $y$ up the paper. Positive z then comes out of the paper. Interestingly failure to get this right does not affect any results of calculations other than those of optical activity etc. but a more serious problem is when coordinates are used as building blocks. For instance if you combined a set of amino-acid coordinates from your calculation with some from a crystal structure or the molecular model builder you could inadvertently end up mixing D and L forms in the same peptide.

So the above negative $x$ value will be $1.094~*~\sin ~69.33$ . Now remembering this value as '$xc$ ', we can easily get the $x$ and $y$ coordinates of H2 and H3. These are related by rotations of 120 degrees and have identical coordinates except for a change of sign in the $y$ -direction. Now the $y$ -coordinate will be + or - $xc~*~\sin ~120$ . The $x$ -coordinate is cos~120.

#### Useful Table for Remembering Common Trig Values

0090
30${\frac {1}{2}}$ 60
45${\frac {1}{\sqrt {2}}}$ 45
60${\frac {\sqrt {3}}{2}}$ 30
9010

Sines read down the table and cosines up.

With the MACROMODEL modelling system one would bet the icon for CH4 and prod one of the hydrogens with the 'Cl' icon. The MM2 force-field could then be used to optimise the geometry.

#### Benzene

The calculation of the geometry for benzene is greatly simplified by placing the origin at the centre of symmetry. The carbon atoms are 1.39 angstroms away from the centre of symmetry and the hydrogens are 2.47 angstroms away. The triangle formed between two adjacent carbons and the centre of symmetry is an equilateral triangle. Cos and sin of 60 can be used to calculate the coordinates of H2 and C2. All the rest are related by sign changes.

        Y                                      H1
:                                      *
:                                      *
:                         H6           *            H2
:                          *          1*           *
*                            *       *   *1.39   *
* : *                            *   *       *   *
*   :   *                            *6         2*
*     :     *                          *           *
*     :.....*.........X                *           *
*           *                          *           *
*           *                          *5         3*
*       *                          *   *       *   *
*   *                          *       * 4 *       *
*                          *           *           *
H5           *            H3
*
*
H4

                 (Z-axis vertically out of page)
Benzene
Centre          0.000    0.000    0.000
C               0.000    1.390    0.000
H               0.000    2.470    0.000
C               1.204    0.695    0.000
H               2.139    1.235    0.000
C               0.000   -1.390    0.000
H               0.000   -2.470    0.000
C              -1.204   -0.695    0.000
H              -2.139   -1.235    0.000
C               1.204   -0.695    0.000
H               2.139   -1.235    0.000
C              -1.204    0.695    0.000
H              -2.139    1.235    0.000
**


One of the problems with the above coordinates which I worked out with a calculator is the precision. The atoms on the axes have infinite precision, (~zero|~), whereas the atom positions worked out have only 4-figures, (atoms 2, 3, 5, and 6). This manifests itself as a slight slippage from the full $D_{6}h$ symmetry, which will cause degenerate orbitals and equal atomic properties to be not quite equal.

Such unexpected occurrences in computational results are very common though the effects of only finite precision on both input data and computation are not usually serious. In quantum chemistry we need larger than usual accuracy, obtained by double precision arithmetic, because we are concerned with differences in energy between entities which contain large amounts of core electronic energy which 'uses~up' the left hand significant figures of the total energy. (Compared with high energy phenomena most of chemistry is concerned with small differences in energy.) If you were working out the coordinates of a molecule containing a phenyl group you would probably use the molecular modelling system but there are cases where there is high symmetry where you would start from highly symmetrical coordinates as above.

#### Projections

Newman Projection                             Sawhorse Projection
*****************                             *******************
.H6
.
*)                                                .
*  )                                             * C2
*    )                                         .. .
* 27  )   @                         H1      . .   .
@           .* .    )@                           .    .  .      .
@      .   *    .@                             .   H4.        .
@        *)                                  .   .           H5
@.     *  )   .                            . .
*120                             C1 *
.      *   )   .                         .   .
*   *)                            .       .
. *       *  .              Z       .           .
*           *              /       H2            H3
*    .        .  *           /
*         .  .       *        /
*          @             *     /
:------X      Z       Y
@                   :        (    ^      /         )
:        (    ^    /           )
@                    :        (    ^ /              )
Y        (    *---------X      )
(                     )


Newman projection is the most useful for quantitative working with dihedral angles. Programs exist to print out the dihedral angles from crystal structures or MACROMODEL coordinates.

#### Local Origins in Ring Systems

It is often desirable to describe a subsystem and shunt it into geometrical position. This rotation / translation process can be describes by a vector and three Euler Angles. A very suitable orientation for a model fragment is to make any aromatic part as planar as possible, (perhaps by least squares fitting). The origin of the cartesian system can be set to be the centre of the ring defined by the centre of mass assuming all the ring atoms to be carbon and no other atoms considered}. The reason for this is obvious if one thinks of a C6H4I- group. The centre of mass would be in the C-I bond and the centre of the ring system is a much more practical origin.

There are many ways of defining the Euler Angles. QM uses the three angles $\alpha ,\beta$ and $\gamma$ which are rotations about $z$ , $y^{'}$ then $z^{'}$ (i.e. $z$ , new $y$ , then the new $z$ axis.

$\cos(\alpha )$ $\sin(\alpha )$ 0
$-\sin(\alpha )$ $\cos(\alpha )$ 0
0 0 1

then

$\cos(\beta )$ 0 $\sin(\beta )$ 0 1 0
$-\sin(\beta )$ 0 $\cos(\beta )$ then

$\cos(\gamma )$ $\sin(\gamma )$ 0
$-\sin(\gamma )$ $\cos(\gamma )$ 0
0 0 1

Here is a piece of FORTRAN code for the resulting transformation matrix from the product of these three rotations.

  tran (1,1) = (cos (alpha) * cos (beta) * cos (gamma)) _
- sin (alpha) * sin (gamma)
tran (2,1) = (cos (alpha) * cos (beta) * sin (gamma)) _
+ (sin (alpha) * cos (gamma))
tran (3,1) = - cos (alpha) * sin (beta)
tran (1,2) = - ( (sin (alpha) * cos (beta) * cos (gamma)) _
+  ( cos (alpha) * sin (gamma))  )
tran (2,2) = - ( sin (alpha) * cos (beta) * sin (gamma) ) _
+ ( cos (alpha) * cos (gamma) )
tran (3,2) = sin (alpha) * sin (beta)
tran (1,3) = sin (beta) * cos (gamma)
tran (2,3) = sin (beta) * sin (gamma)
tran (3,3) = cos (beta)


### The optimization

Once you have your trial geometry, made by a graphical program or by trigonometry as describes previously one of the powerful quantum or molecular mechanics programs will solve the geometry by minimizing the (3N-6) free coordinates.

In many cases this does not give a unique geometry only one of many possible conformations.

Next Chapter - Applications of molecular quantum mechanics