6.4. QM/MM via Interfaces to ORCA¶
ORCA is easy to interface as a QM engine in pretty much any QM/MM environment, as it will accept a set of point charges from an external file (see section Inclusion of Point Charges) and it will return, in a transparent format, all the required information for computing energies and gradients to the driving program. In our research group we have experience with four different QM/MM environments: ChemShell, Gromacs, pDynamo and NAMD. Each of the interfaces is described in the following section. It is beyond the scope of the manual to be exhaustive, and only minimal working examples are going to be presented. These will cover mainly the technical aspects with respect to the QM part of the QM/MM calculation. For the initial preparation of the system the user is referred to the documentation of the driving program.
6.4.1. ORCA and Gromacs¶
In cooperation with the developers of Gromacs software package we developed an interface to ORCA. The interface is available starting with Gromacs version 4.0.4 up to version 4.5.5.
As mentioned above, the initial setup has to be done with the Gromacs.
In the QM/MM calculation Gromacs will call ORCA to get the energy and
the gradients of the QM atoms. The interface can be used to perform all
job types allowed by the Gromacs software package, e.g. optimizations,
molecular dynamics, free energy. In addition, for geometry optimizations
we have implemented a microiterative scheme that can be requested by
setting the keyword bOpt = yes in the Gromacs .mdp file. This will
cause ORCA to perform a QM geometry optimization every time it is called
by Gromacs. During this optimization ORCA will also compute the
Lennard-Jones interaction between the QM and MM atoms, and freeze the
boundary atoms. This microiterative scheme can also be used to perform a
transition state optimization. If you are looking for a transition state
and have a good initial guess for the structure, you can carry out an
optimization of the MM system and at the same time perform a transition
state optimization of the QM system with ORCA via the microiterative
scheme. Since it is expensive to calculate the Hessian for each
microiterative step, the user can tell ORCA to use the
(updated) Hessian matrix of the previous step via read_temp_Hess in
the ORCA input.
In order to allow full flexibility to the user, the information for the
QM run are provided in a separate plain text file, called
GromacsBasename.ORCAINFO. When Gromacs writes the input for the
ORCA calculation, it will merge this file with the information on the
coordinates, point charges, Lennard-Jones coefficients and type of
calculation (EnGrad, Opt, TSOpt).
Below is a typical example of an input file created by Gromacs, where for each Gromacs optimization step, a full optimization of the QM-part will be performed by ORCA, instead of just doing the energy and gradient calculation.
# Optimization step in the Lennard-Jones and point charges field of the MM atoms
! QMMMOpt
# file containing the Lennard Jones coefficients for the Lennard-Jones interaction
%LJCoefficients "temp.LJ"
# file containing the point charges for the electrostatic interaction
%pointcharges "temp.pc"
%geom
    # calculate the exact Hessian before the first optimization step
    Calc_Hess true
    # in case of a TS optimization the updated Hessian of the previous
    # TS optimization run is read instead of calculating a new one
    read_temp_Hess true
 end
Note
If you are using
bOptorbTSyou have to take care of additional terms over the boundary. Either you can zero out the dihedral terms of the Q2-Q1-M1-M2 configurations, or you can fix the respective Q2 atoms.If you want to use the ORCA constraints, you have to also put them in the Gromacs part of the calculation.
If there are no bonds between the QM and MM systems, the
bOptoptimization of the QM system might have convergence problems. This is the case if the step size is too large and thus QM atoms come too close to MM atoms. The convergence problems can be circumvented by adding extra coordinates and Hessian diagonal values for Cartesian coordinates of selected QM atoms to the set of internal coordinates. This should be done for only few atoms that are in the boundary region.
%geom
    # add the Cartesian position of atoms 2 and 4 to the
    # set of internal coordinates with a diagonal Hessian value of 0.1
    Hess_Internal
    {C 2 D 0.1}
    {C 4 D 0.1}
    end
end
6.4.2. ORCA and pDynamo¶
The interface with the pDynamo library is briefly documented here. It uses the same plain text files to exchange information between the pDynamo library and ORCA. In order to have simiar functionality as presented above, we have extended the interface to generate more complex ORCA input files by accepting verbatim blocks of text. We have also implemented in pDynamo the capability of writing files containing Lennard-Jones parameters.
A simple example which calculates the QM/MM energy for a small designed peptide in which the side chain of one amino acid is treated QM is ilustrated below. For this example, a complete geometry optimization is going to be performed during the ORCA call.
import os
import pCore
import pBabel
import pMolecule
import pMoleculeScripts
# Read the initial coordinates from the .pdb file.
system = pMoleculeScripts.PDBFile_ToSystem(
        "1UAO.pdb", modelNumber=1, useComponentLibrary=True)
# Instantiate the required models.
mmmodel = pMolecule.MMModelOPLS("protein")
nbmodel = pMolecule.NBModelORCA()
qcmodel = pMolecule.QCModelORCA(
        command=os.getenv("ORCA_COMMAND"),
        deleteJobFiles=False, header="bp86 def2-svp qmmmopt/pdynamo",
        job="chignolin", run=True)
# Assign the models to the system.
system.DefineMMModel(mmmodel)
system.DefineQCModel(
        qcmodel, qcSelection=pCore.Selection([35, 36, 37, 34, 40, 41]))
system.DefineNBModel(nbmodel)
system.electronicState = pMolecule.ElectronicState(
        charge=-1, multiplicity=1)
# Print a summary and calculate the energy.
system.Summary()
system.Energy()
After the execution of the above Python program, a series of files are
going to be created chignolin.inp, chignolin.pc, chignolin.lj and
ORCA is going to be called. The generated ORCA input file is listed
below.
! bp86 def2-svp qmmmopt/pdynamo
% geom
    constraints
        {C 0 C}
        {C 1 C}
        end
    end
% pointcharges "chignolin.pc"
% ljcoefficients "chignolin.lj"
* xyz -1 1
H                  -1.0637532468        1.1350324675        2.4244220779
C                  -0.5230000000        0.6870000000        3.2490000000
C                   0.4180000000        1.7240000000        3.8660000000
O                  -0.0690000000        2.7620000000        4.2830000000
O                   1.6090000000        1.4630000000        3.9110000000
H                  -1.2240000000        0.3460000000        3.9970000000
H                   0.0550000000       -0.1510000000        2.8890000000
*
There are few points that have to be raised here. Because the keyword
qmmm/pdynamo was specified in the header variable, the pDynamo
library will automatically add the constraint block in the ORCA input,
which will freeze the link atoms and the QM atoms to which they are
bound. It will also generate the chignolin.lj file containing all the
Lennard-Jones parameters. The important parts of this file, which is
somewhat different than the one generated by Gromacs, are listed next.
# number of atoms  combination rule
138 0
#      x            y             z          sigma       epsilon   id
   -6.778000    -1.424000     4.200000     3.250000     0.711280   -1
   -6.878000    -0.708000     2.896000     3.500000     0.276144   -1
   -5.557000    -0.840000     2.138000     3.750000     0.439320   -1
...
    0.433000     0.826000     0.502000     2.960000     0.878640   -1
   -0.523000     0.687000     3.249000     3.500000     0.276144    1
    0.418000     1.724000     3.866000     3.750000     0.439320    2
   -0.069000     2.762000     4.283000     2.960000     0.878640    3
    1.609000     1.463000     3.911000     2.960000     0.878640    4
   -2.259000    -0.588000     1.846000     0.000000     0.000000   -1
   -1.795000     2.207000     2.427000     2.500000     0.125520   -1
   -1.224000     0.346000     3.997000     2.500000     0.125520    5
    0.055000    -0.151000     2.889000     2.500000     0.125520    6
   -0.311000     2.922000     0.557000     3.250000     0.711280   -1
...
   -1.387000    -2.946000     5.106000     2.500000     0.125520   -1
# number of special pairs
22
#        atom1        atom2    factor
          34           32     0.000000
          35           39     0.500000
          40           31     0.000000
          41           30     0.500000
          41           32     0.500000
          36           31     0.500000
          40           32     0.500000
          40           39     0.500000
          34           31     0.000000
          35           30     0.500000
          34           11     0.500000
          34           38     0.500000
          41           31     0.000000
          37           31     0.500000
          34           33     0.500000
          34           39     0.000000
          40           30     0.500000
          41           39     0.500000
          34           30     0.000000
          35           31     0.000000
          34           42     0.500000
          35           32     0.500000
The second number on the first line refers to the type of combination
rule used to calculate the Lennard-Jones interaction. It is 0 if a
geometric average is used (OPLS force field), or 1 for the
Lorentz-Berthelot rules (AMBER force field). The id on the last column
is -1 for MM atoms and is equal to the atom number for the QM atoms. In
this case the hydrogen link atom is atom 0. The last block of the file
is composed of atom pairs and a special factor by which their
Lennard-Jones interaction is scaled. In general this factor is equal to
1, but for atoms one or two bonds apart is zero, while for atoms three
bonds apart depends on the type of force field, and in this case is 0.5.
After successful completion of the ORCA optimization run, the information will be relayed back the pDynamo library, which will report the total QM/MM energy of the system. At this point the type QM/MM of calculation is limited only by the capabilities of the pDynamo library, which are quite extensive.
6.4.3. ORCA and NAMD¶
Since version 2.12, NAMD is able to perform hybrid QM/MM calculations. A more detailed explanation of all available key words, setting up the calculation and information on tutorials and on the upcoming graphic interface to VMD are available on the NAMD website.
Similar to other calculations with NAMD, the QM/MM is using a pdb file
to control the active regions. An example is shown below, where the
sidechain of a histidine protonated at N\(\epsilon\) is chosen to be the
QM region. Either the occupancy column or the b-factor column of the
file are used to indicate which atom are included in a QM area and which
are treated by the forcefield. In the other column, atoms which are
connecting the QM area and the MM part are indicated similarly. To
clarify which column is used for which purpose, the keywords qmColumn
and qmBondColumn have to be defined in the NAMD input.
...
ATOM   1737  CA  HSE P 117      14.762  47.946  31.597  1.00  0.00      PROT C
ATOM   1738  HA  HSE P 117      14.751  47.579  32.616  0.00  0.00      PROT H
ATOM   1739  CB  HSE P 117      14.129  49.300  31.501  1.00  1.00      PROT C
ATOM   1740  HB1 HSE P 117      14.407  49.738  30.518  0.00  1.00      PROT H
ATOM   1741  HB2 HSE P 117      13.024  49.194  31.509  0.00  1.00      PROT H
ATOM   1742  ND1 HSE P 117      13.899  51.381  32.779  0.00  1.00      PROT N
ATOM   1743  CG  HSE P 117      14.572  50.261  32.582  0.00  1.00      PROT C
ATOM   1744  CE1 HSE P 117      14.615  52.043  33.669  0.00  1.00      PROT C
ATOM   1745  HE1 HSE P 117      14.356  53.029  34.064  0.00  1.00      PROT H
ATOM   1746  NE2 HSE P 117      15.678  51.318  33.982  0.00  1.00      PROT N
ATOM   1747  HE2 HSE P 117      16.369  51.641  34.627  0.00  1.00      PROT H
ATOM   1748  CD2 HSE P 117      15.706  50.183  33.335  0.00  1.00      PROT C
ATOM   1749  HD2 HSE P 117      16.451  49.401  33.388  0.00  1.00      PROT H
ATOM   1750  C   HSE P 117      13.916  47.000  30.775  0.00  0.00      PROT C
ATOM   1751  O   HSE P 117      12.965  46.452  31.334  0.00  0.00      PROT O
...
NOTES:
If one wants to include more than one QM region, integers bigger than 1 can be used to define the different regions.
Charge groups cannot be split when selecting QM and MM region. The reason is that non-integer partial charges may occur if a charge group is split. Since the QM partial charges are updated in every QM iteration, this may lead to a change in the total charge of the system over the course of the MD simulation.
The occupancy and b-factor columns are used for several declarations in NAMD. If two of these come together in one simulation, the keyword
qmParamPDBis used to define which pdb file contains the information about QM atoms and bonds.To simplify the selection of QM atoms and writing the pdb file a set of scripts is planned to be included in future releases of NAMD.
To run the calculation, the keyword qmForces on must be set. To select
ORCA  qmSoftware "orca" must be specified and the path to the
executables must be given to qmExecPath, as well as a directory where
the calculation is carried out (qmBaseDir). To pass the method and
specifications from NAMD to ORCA  qmConfigLine is used. These lines
will be copied to the beginning of the input file and can contain both
simple input as well as block input. To ensure the calculation of the
gradient, the engrad keyword should be used.
The geometry of the QM region including the selected links as well as
the MM point charges are copied to the ORCA inputfile automatically.
Multiplicity and charge can be defined using qmMult and qmCharge,
although the latter can be determined automatically by NAMD using the MM
parameters. It should be noted at this point that NAMD is capable to
handle more than one QM region per QM/MM calculation. Therefore for each
region, charge and multiplicity are expected. In the case of only one QM
region, the input looks like the following:
qmMult          "1 1"
qmCharge        "1 0"
Currently, two charge modes are available: Mulliken and CHELPG. They
have to be specified in the NAMD input using QMChargeMode and in the
qmConfigLine, respectively. Different embedding schemes, point charge
schemes and switching functions are available, which will be not further
discussed here. Another useful tool worth mentioning is the possibility
to call secondary executables before the first or after each QM software
execution using QMPrepProc or QMSecProc, respectively. Both are
called with the complete path and name to the QM input file, allowing
e.g. storage of values during an QM/MM-MD.
It is strongly emphasized that at this points both programs are constantly developed further. For the latest information, either the ORCA forum or the NAMD website should be consulted.