6.4. QM/MM via Interfaces to ORCA¶
ORCA is easy to interface as a QM engine in pretty much any QM/MM environment, as it will accept a set of point charges from an external file (see section Inclusion of Point Charges) and it will return, in a transparent format, all the required information for computing energies and gradients to the driving program. In our research group we have experience with four different QM/MM environments: ChemShell, Gromacs, pDynamo and NAMD. Each of the interfaces is described in the following section. It is beyond the scope of the manual to be exhaustive, and only minimal working examples are going to be presented. These will cover mainly the technical aspects with respect to the QM part of the QM/MM calculation. For the initial preparation of the system the user is referred to the documentation of the driving program.
6.4.1. ORCA and Gromacs¶
In cooperation with the developers of Gromacs software package we developed an interface to ORCA. The interface is available starting with Gromacs version 4.0.4 up to version 4.5.5.
As mentioned above, the initial setup has to be done with the Gromacs.
In the QM/MM calculation Gromacs will call ORCA to get the energy and
the gradients of the QM atoms. The interface can be used to perform all
job types allowed by the Gromacs software package, e.g. optimizations,
molecular dynamics, free energy. In addition, for geometry optimizations
we have implemented a microiterative scheme that can be requested by
setting the keyword bOpt = yes
in the Gromacs .mdp
file. This will
cause ORCA to perform a QM geometry optimization every time it is called
by Gromacs. During this optimization ORCA will also compute the
Lennard-Jones interaction between the QM and MM atoms, and freeze the
boundary atoms. This microiterative scheme can also be used to perform a
transition state optimization. If you are looking for a transition state
and have a good initial guess for the structure, you can carry out an
optimization of the MM system and at the same time perform a transition
state optimization of the QM system with ORCA via the microiterative
scheme. Since it is expensive to calculate the Hessian for each
microiterative step, the user can tell ORCA to use the
(updated) Hessian matrix of the previous step via read_temp_Hess
in
the ORCA input.
In order to allow full flexibility to the user, the information for the
QM run are provided in a separate plain text file, called
GromacsBasename.ORCAINFO
. When Gromacs writes the input for the
ORCA calculation, it will merge this file with the information on the
coordinates, point charges, Lennard-Jones coefficients and type of
calculation (EnGrad
, Opt
, TSOpt
).
Below is a typical example of an input file created by Gromacs, where for each Gromacs optimization step, a full optimization of the QM-part will be performed by ORCA, instead of just doing the energy and gradient calculation.
# Optimization step in the Lennard-Jones and point charges field of the MM atoms
! QMMMOpt
# file containing the Lennard Jones coefficients for the Lennard-Jones interaction
%LJCoefficients "temp.LJ"
# file containing the point charges for the electrostatic interaction
%pointcharges "temp.pc"
%geom
# calculate the exact Hessian before the first optimization step
Calc_Hess true
# in case of a TS optimization the updated Hessian of the previous
# TS optimization run is read instead of calculating a new one
read_temp_Hess true
end
Note
If you are using
bOpt
orbTS
you have to take care of additional terms over the boundary. Either you can zero out the dihedral terms of the Q2-Q1-M1-M2 configurations, or you can fix the respective Q2 atoms.If you want to use the ORCA constraints, you have to also put them in the Gromacs part of the calculation.
If there are no bonds between the QM and MM systems, the
bOpt
optimization of the QM system might have convergence problems. This is the case if the step size is too large and thus QM atoms come too close to MM atoms. The convergence problems can be circumvented by adding extra coordinates and Hessian diagonal values for Cartesian coordinates of selected QM atoms to the set of internal coordinates. This should be done for only few atoms that are in the boundary region.
%geom
# add the Cartesian position of atoms 2 and 4 to the
# set of internal coordinates with a diagonal Hessian value of 0.1
Hess_Internal
{C 2 D 0.1}
{C 4 D 0.1}
end
end
6.4.2. ORCA and pDynamo¶
The interface with the pDynamo library is briefly documented here. It uses the same plain text files to exchange information between the pDynamo library and ORCA. In order to have simiar functionality as presented above, we have extended the interface to generate more complex ORCA input files by accepting verbatim blocks of text. We have also implemented in pDynamo the capability of writing files containing Lennard-Jones parameters.
A simple example which calculates the QM/MM energy for a small designed peptide in which the side chain of one amino acid is treated QM is ilustrated below. For this example, a complete geometry optimization is going to be performed during the ORCA call.
import os
import pCore
import pBabel
import pMolecule
import pMoleculeScripts
# Read the initial coordinates from the .pdb file.
system = pMoleculeScripts.PDBFile_ToSystem(
"1UAO.pdb", modelNumber=1, useComponentLibrary=True)
# Instantiate the required models.
mmmodel = pMolecule.MMModelOPLS("protein")
nbmodel = pMolecule.NBModelORCA()
qcmodel = pMolecule.QCModelORCA(
command=os.getenv("ORCA_COMMAND"),
deleteJobFiles=False, header="bp86 def2-svp qmmmopt/pdynamo",
job="chignolin", run=True)
# Assign the models to the system.
system.DefineMMModel(mmmodel)
system.DefineQCModel(
qcmodel, qcSelection=pCore.Selection([35, 36, 37, 34, 40, 41]))
system.DefineNBModel(nbmodel)
system.electronicState = pMolecule.ElectronicState(
charge=-1, multiplicity=1)
# Print a summary and calculate the energy.
system.Summary()
system.Energy()
After the execution of the above Python program, a series of files are
going to be created chignolin.inp
, chignolin.pc
, chignolin.lj
and
ORCA is going to be called. The generated ORCA input file is listed
below.
! bp86 def2-svp qmmmopt/pdynamo
% geom
constraints
{C 0 C}
{C 1 C}
end
end
% pointcharges "chignolin.pc"
% ljcoefficients "chignolin.lj"
* xyz -1 1
H -1.0637532468 1.1350324675 2.4244220779
C -0.5230000000 0.6870000000 3.2490000000
C 0.4180000000 1.7240000000 3.8660000000
O -0.0690000000 2.7620000000 4.2830000000
O 1.6090000000 1.4630000000 3.9110000000
H -1.2240000000 0.3460000000 3.9970000000
H 0.0550000000 -0.1510000000 2.8890000000
*
There are few points that have to be raised here. Because the keyword
qmmm/pdynamo
was specified in the header
variable, the pDynamo
library will automatically add the constraint
block in the ORCA input,
which will freeze the link atoms and the QM atoms to which they are
bound. It will also generate the chignolin.lj
file containing all the
Lennard-Jones parameters. The important parts of this file, which is
somewhat different than the one generated by Gromacs, are listed next.
# number of atoms combination rule
138 0
# x y z sigma epsilon id
-6.778000 -1.424000 4.200000 3.250000 0.711280 -1
-6.878000 -0.708000 2.896000 3.500000 0.276144 -1
-5.557000 -0.840000 2.138000 3.750000 0.439320 -1
...
0.433000 0.826000 0.502000 2.960000 0.878640 -1
-0.523000 0.687000 3.249000 3.500000 0.276144 1
0.418000 1.724000 3.866000 3.750000 0.439320 2
-0.069000 2.762000 4.283000 2.960000 0.878640 3
1.609000 1.463000 3.911000 2.960000 0.878640 4
-2.259000 -0.588000 1.846000 0.000000 0.000000 -1
-1.795000 2.207000 2.427000 2.500000 0.125520 -1
-1.224000 0.346000 3.997000 2.500000 0.125520 5
0.055000 -0.151000 2.889000 2.500000 0.125520 6
-0.311000 2.922000 0.557000 3.250000 0.711280 -1
...
-1.387000 -2.946000 5.106000 2.500000 0.125520 -1
# number of special pairs
22
# atom1 atom2 factor
34 32 0.000000
35 39 0.500000
40 31 0.000000
41 30 0.500000
41 32 0.500000
36 31 0.500000
40 32 0.500000
40 39 0.500000
34 31 0.000000
35 30 0.500000
34 11 0.500000
34 38 0.500000
41 31 0.000000
37 31 0.500000
34 33 0.500000
34 39 0.000000
40 30 0.500000
41 39 0.500000
34 30 0.000000
35 31 0.000000
34 42 0.500000
35 32 0.500000
The second number on the first line refers to the type of combination
rule used to calculate the Lennard-Jones interaction. It is 0 if a
geometric average is used (OPLS force field), or 1 for the
Lorentz-Berthelot rules (AMBER force field). The id
on the last column
is -1 for MM atoms and is equal to the atom number for the QM atoms. In
this case the hydrogen link atom is atom 0. The last block of the file
is composed of atom pairs and a special factor by which their
Lennard-Jones interaction is scaled. In general this factor is equal to
1, but for atoms one or two bonds apart is zero, while for atoms three
bonds apart depends on the type of force field, and in this case is 0.5.
After successful completion of the ORCA optimization run, the information will be relayed back the pDynamo library, which will report the total QM/MM energy of the system. At this point the type QM/MM of calculation is limited only by the capabilities of the pDynamo library, which are quite extensive.
6.4.3. ORCA and NAMD¶
Since version 2.12, NAMD is able to perform hybrid QM/MM calculations. A more detailed explanation of all available key words, setting up the calculation and information on tutorials and on the upcoming graphic interface to VMD are available on the NAMD website.
Similar to other calculations with NAMD, the QM/MM is using a pdb file
to control the active regions. An example is shown below, where the
sidechain of a histidine protonated at N\(\epsilon\) is chosen to be the
QM region. Either the occupancy column or the b-factor column of the
file are used to indicate which atom are included in a QM area and which
are treated by the forcefield. In the other column, atoms which are
connecting the QM area and the MM part are indicated similarly. To
clarify which column is used for which purpose, the keywords qmColumn
and qmBondColumn
have to be defined in the NAMD input.
...
ATOM 1737 CA HSE P 117 14.762 47.946 31.597 1.00 0.00 PROT C
ATOM 1738 HA HSE P 117 14.751 47.579 32.616 0.00 0.00 PROT H
ATOM 1739 CB HSE P 117 14.129 49.300 31.501 1.00 1.00 PROT C
ATOM 1740 HB1 HSE P 117 14.407 49.738 30.518 0.00 1.00 PROT H
ATOM 1741 HB2 HSE P 117 13.024 49.194 31.509 0.00 1.00 PROT H
ATOM 1742 ND1 HSE P 117 13.899 51.381 32.779 0.00 1.00 PROT N
ATOM 1743 CG HSE P 117 14.572 50.261 32.582 0.00 1.00 PROT C
ATOM 1744 CE1 HSE P 117 14.615 52.043 33.669 0.00 1.00 PROT C
ATOM 1745 HE1 HSE P 117 14.356 53.029 34.064 0.00 1.00 PROT H
ATOM 1746 NE2 HSE P 117 15.678 51.318 33.982 0.00 1.00 PROT N
ATOM 1747 HE2 HSE P 117 16.369 51.641 34.627 0.00 1.00 PROT H
ATOM 1748 CD2 HSE P 117 15.706 50.183 33.335 0.00 1.00 PROT C
ATOM 1749 HD2 HSE P 117 16.451 49.401 33.388 0.00 1.00 PROT H
ATOM 1750 C HSE P 117 13.916 47.000 30.775 0.00 0.00 PROT C
ATOM 1751 O HSE P 117 12.965 46.452 31.334 0.00 0.00 PROT O
...
NOTES:
If one wants to include more than one QM region, integers bigger than 1 can be used to define the different regions.
Charge groups cannot be split when selecting QM and MM region. The reason is that non-integer partial charges may occur if a charge group is split. Since the QM partial charges are updated in every QM iteration, this may lead to a change in the total charge of the system over the course of the MD simulation.
The occupancy and b-factor columns are used for several declarations in NAMD. If two of these come together in one simulation, the keyword
qmParamPDB
is used to define which pdb file contains the information about QM atoms and bonds.To simplify the selection of QM atoms and writing the pdb file a set of scripts is planned to be included in future releases of NAMD.
To run the calculation, the keyword qmForces on
must be set. To select
ORCA qmSoftware "orca"
must be specified and the path to the
executables must be given to qmExecPath
, as well as a directory where
the calculation is carried out (qmBaseDir
). To pass the method and
specifications from NAMD to ORCA qmConfigLine
is used. These lines
will be copied to the beginning of the input file and can contain both
simple input as well as block input. To ensure the calculation of the
gradient, the engrad
keyword should be used.
The geometry of the QM region including the selected links as well as
the MM point charges are copied to the ORCA inputfile automatically.
Multiplicity and charge can be defined using qmMult
and qmCharge
,
although the latter can be determined automatically by NAMD using the MM
parameters. It should be noted at this point that NAMD is capable to
handle more than one QM region per QM/MM calculation. Therefore for each
region, charge and multiplicity are expected. In the case of only one QM
region, the input looks like the following:
qmMult "1 1"
qmCharge "1 0"
Currently, two charge modes are available: Mulliken and CHELPG. They
have to be specified in the NAMD input using QMChargeMode
and in the
qmConfigLine
, respectively. Different embedding schemes, point charge
schemes and switching functions are available, which will be not further
discussed here. Another useful tool worth mentioning is the possibility
to call secondary executables before the first or after each QM software
execution using QMPrepProc
or QMSecProc
, respectively. Both are
called with the complete path and name to the QM input file, allowing
e.g. storage of values during an QM/MM-MD.
It is strongly emphasized that at this points both programs are constantly developed further. For the latest information, either the ORCA forum or the NAMD website should be consulted.