2.18. Fragment Specification
Atoms in a calculation can be grouped into fragments, which serve several purposes. Fragment definitions can be used to assign Basis Sets and ECPs, organize output in the population analysis section, and enable features such as constrained fragment optimization and Rigid Body Optimization. They are also used in local energy decomposition and multi-level calculations.
2.18.1. Fragments defined on Input File
There are three ways to assign atoms to fragments using the input file.
The first method is to assign a specific atom to a specific fragment by placing
(n), where n is the fragment number, directly after the atomic symbol in the coordinates section.
*xyz -2 2
 Cu(1)  0.00  0.00  0.00
 Cl(2)  2.25  0.00  0.00
 Cl(2) -2.25  0.00  0.00
 Cl(2)  0.00  2.25  0.00
 Cl(2)  0.00 -2.25  0.00
*
In this example, the fragment feature divides the molecule into a “metal” and a “ligand” fragment, so the program prints the metal and ligand charges and populations.
----------------------------------------------
CARTESIAN COORDINATES OF FRAGMENTS (ANGSTROEM)
----------------------------------------------
 FRAGMENT 1
  Cu    0.000000    0.000000    0.000000
 FRAGMENT 2
  Cl    2.250000    0.000000    0.000000
  Cl   -2.250000    0.000000    0.000000
  Cl    0.000000    2.250000    0.000000
  Cl    0.000000   -2.250000    0.000000
...
----------------------------------------------
MULLIKEN FRAGMENT CHARGES AND SPIN POPULATIONS
----------------------------------------------
 Fragment   0 :   0.752589     0.842580
 Fragment   1 :  -2.752589     0.157420
Sum of fragment charges         :   -2.0000000
Sum of fragment spin populations:    1.0000000
...
--------------------------------------------
LOEWDIN FRAGMENT CHARGES AND SPIN POPULATONS
--------------------------------------------
 Fragment   0 :   0.222028     0.851552
 Fragment   1 :  -2.222028     0.148448
Alternatively, the %coords block can be used for fragment definitions in the
same way, by placing (n) directly after the atomic symbol.
%coords
 CTyp   xyz  # the type of coordinates xyz or internal
 Charge -2   # the total charge of the molecule
 Mult   2    # the multiplicity = 2S+1
 coords
    Cu(1)  0.00  0.00  0.00
    Cl(2)  2.25  0.00  0.00
    Cl(2) -2.25  0.00  0.00
    Cl(2)  0.00  2.25  0.00
    Cl(2)  0.00 -2.25  0.00
 end
end
Important
In cases where all atoms are explicitly assigned to fragments, the fragment numbering must start at 1 and use consecutive integers. Non-consecutive or incorrect numbering may lead to errors.
If any atom is left unassigned (or explicitly assigned to fragment 0), it will be automatically assigned to a fragment using the fragmentation procedure described in the Automatic Fragmentation section. In such cases, where only a subset of atoms is manually assigned to fragments, it is not necessary to use consecutive fragment numbers. However, the highest fragment number specified in the input must be less than the total number of fragments generated by the combination of manual and automatic procedures. If this condition is not met, ORCA will automatically reorder all fragment numbers in ascending order, starting from 1.
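The first numbering rule (all atoms assigned by hand, fragment numbers 1, 2, ..., N without gaps) can be checked before running a calculation. The following is a minimal Python sketch of such a check; the function name `check_full_manual_numbering` is hypothetical and is not part of ORCA.

```python
def check_full_manual_numbering(fragment_ids):
    """Return True if a fully manual assignment uses 1, 2, ..., N without gaps.

    fragment_ids holds one fragment number per atom, in input order.
    Hypothetical helper for illustration; it is not an ORCA routine.
    """
    if any(f <= 0 for f in fragment_ids):
        raise ValueError("only meant for inputs where every atom is assigned")
    used = sorted(set(fragment_ids))
    return used == list(range(1, len(used) + 1))


# [CuCl4]2- as labelled above: Cu(1) followed by four Cl(2)
print(check_full_manual_numbering([1, 2, 2, 2, 2]))  # True
# Same molecule with the ligands labelled (3): fragment 2 is missing
print(check_full_manual_numbering([1, 3, 3, 3, 3]))  # False
```

The sketch deliberately rejects inputs containing fragment 0, since those fall under the second rule, where automatic fragmentation fills in the remaining atoms.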
Finally, a third way to define fragments consists of using a Definition inside
the %frag block. In this scheme, the fragment number comes first, followed by a
list of atoms (enumerated starting from 0) enclosed in curly brackets {} and finishing with end.
Consecutive atoms can also be specified using the notation initial_atom:final_atom.
*xyz -2 2
 Cu  0.00  0.00  0.00
 Cl  2.25  0.00  0.00
 Cl -2.25  0.00  0.00
 Cl  0.00  2.25  0.00
 Cl  0.00 -2.25  0.00
*
%frag
 Definition
  1 {0} end    # atom 0 for fragment 1
  2 {1:4} end  # atoms 1 to 4 for fragment 2
 end
end
*xyz -2 2
 Cl  2.25  0.00  0.00
 Cl -2.25  0.00  0.00
 Cu  0.00  0.00  0.00
 Cl  0.00  2.25  0.00
 Cl  0.00 -2.25  0.00
*
%frag
 Definition
  1 {2} end        # atom 2 for fragment 1
  2 {0:1 3:4} end  # atoms 0, 1, 3, and 4 for fragment 2
 end
end
Note
With the last option (Definition), the %frag block has to be written after the coordinate section. %frag Definition also works with coordinates that are defined via an external file.
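To make the atom-list notation of the Definition keyword concrete, here is a small Python sketch that expands a list such as {0:1 3:4} into explicit zero-based atom indices. The helper `expand_atom_list` is a hypothetical illustration, and it assumes that final_atom is inclusive, as the comments in the examples above indicate.

```python
def expand_atom_list(spec):
    """Expand a string such as "{0:1 3:4}" into [0, 1, 3, 4].

    Hypothetical helper for illustration; it is not how ORCA parses input.
    """
    atoms = []
    for token in spec.strip().strip("{}").split():
        if ":" in token:
            first, last = (int(x) for x in token.split(":"))
            atoms.extend(range(first, last + 1))  # final_atom is inclusive
        else:
            atoms.append(int(token))
    return atoms


print(expand_atom_list("{1:4}"))      # [1, 2, 3, 4]
print(expand_atom_list("{0:1 3:4}"))  # [0, 1, 3, 4]
```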
2.18.2. Automatic Fragmentation
Starting with ORCA 6.1, a set of automatic fragmentation algorithms has been introduced to recognize and group atoms into fragments automatically.
The automatic fragmentation procedure is triggered by including a %frag
block in the input file or when only a subset of atoms has been manually
assigned to fragments.
Automatic fragmentation is performed using a set of procedures that can be
selected by the user within the %frag block. Each procedure attempts to
identify new fragments among atoms that have not yet been assigned. For
example, consider the case below, where the first 11 atoms belong to a propane
molecule and the last three belong to a water molecule. The Water procedure
in FragProc identifies the last three atoms as a water molecule and assigns
them to the first fragment, while the FunctionalGroups procedure detects
and assigns the CH3 and CH2 groups of propane as fragments 2 to 4.
%frag
   FragProc Water, FunctionalGroups
end
* xyz 0 1
 C     0.000000     1.270000    -0.260000
 C     0.000000    -0.000000     0.580000
 C     0.000000    -1.270000    -0.260000
 H    -0.890000     1.320000    -0.910000
 H     0.890000     1.320000    -0.910000
 H     0.000000     2.180000     0.370000
 H     0.880000    -0.000000     1.250000
 H    -0.880000    -0.000000     1.250000
 H     0.890000    -1.320000    -0.910000
 H     0.000000    -2.180000     0.370000
 H    -0.880000    -1.300000    -0.910000
 H    -0.920000     0.850000    -2.430000
 O    -1.690000     0.830000    -3.000000
 H    -1.640000     1.630000    -3.510000
*
  ----------------------------------------------
  CARTESIAN COORDINATES OF FRAGMENTS (ANGSTROEM)
  ----------------------------------------------
  
   FRAGMENT 1
    H    -0.920000    0.850000   -2.430000
    O    -1.690000    0.830000   -3.000000
    H    -1.640000    1.630000   -3.510000
  
   FRAGMENT 2
    C     0.000000    1.270000   -0.260000
    H    -0.890000    1.320000   -0.910000
    H     0.890000    1.320000   -0.910000
    H     0.000000    2.180000    0.370000
  
   FRAGMENT 3
    C     0.000000   -1.270000   -0.260000
    H     0.890000   -1.320000   -0.910000
    H     0.000000   -2.180000    0.370000
    H    -0.880000   -1.300000   -0.910000
  
   FRAGMENT 4
    C     0.000000   -0.000000    0.580000
    H     0.880000   -0.000000    1.250000
    H    -0.880000   -0.000000    1.250000
This automatic fragmentation yields the same result as the manual fragment definition shown below, without the need to inspect the geometry and assign fragments manually.
%frag
 Definition
 1 { 11:13} end  # water
 2 { 0 3:5} end  # CH3
 3 { 2 8:10} end # CH3
 4 { 1 6:7} end  # CH2
 end
end
Note
Any fragment defined in the input file takes precedence over automatic assignments.
ORCA supports up to 10 procedures in FragProc, with the full list provided in Table 2.60 and Table 2.61.
Constrained Fragments do not automatically enable Automatic Fragmentation when there are atoms unassigned to fragments. However, automatic fragmentation can be activated by including a %frag block in the input file.
ORCA provides over 30 FragProc methods, which can be combined in a list to
generate fragments for various purposes.
Below are explanations and examples of the different procedures.
2.18.2.1. Automatic Fragmentation: Connectivity
FragProc Connectivity groups atoms that are connected by chemical bonds,
estimated based on atomic radii. In the example below, the first nine atoms
belong to a dimethyl ether molecule, which is automatically detected and
assigned to one fragment, while the last three atoms—belonging to a water
molecule—are assigned to a second fragment.
%frag
  PrintLevel 3
  FragProc Connectivity
end
*xyz 0 1
 O     0.000000     0.000000     0.000000
 C     0.000000     0.000000     1.380000
 C     1.300000     0.000000    -0.460000
 H    -0.500000     0.870000     1.740000
 H    -0.500000    -0.870000     1.740000
 H     1.000000     0.000000     1.740000
 H     1.300000     0.000000    -1.530000
 H     1.800000     0.870000    -0.100000
 H     1.800000    -0.870000    -0.100000
 H    -1.840000     0.000000    -0.650000
 O    -2.440000     0.000000     0.080000
 H    -3.300000     0.000000    -0.300000
*
In this case, the output of the automatic fragmentation tool indicates that
two fragments were assigned by the Connectivity procedure.
------------------------------------------
 Assigned Fragment: 1 
------------------------------------------
  Name: Connectivity:0 Method: Connectivity 
  Natoms: 9 Charge: 0 Mult: 1 
 0 O     0.000000    0.000000    0.000000 
 1 C     0.000000    0.000000    1.380000 
 2 C     1.300000    0.000000   -0.460000 
 3 H    -0.500000    0.870000    1.740000 
 4 H    -0.500000   -0.870000    1.740000 
 5 H     1.000000    0.000000    1.740000 
 6 H     1.300000    0.000000   -1.530000 
 7 H     1.800000    0.870000   -0.100000 
 8 H     1.800000   -0.870000   -0.100000 
------------------------------------------
------------------------------------------
 Assigned Fragment: 2 
------------------------------------------
  Name: Connectivity:1 Method: Connectivity 
  Natoms: 3 Charge: 0 Mult: 1 
 9 H    -1.840000    0.000000   -0.650000 
 10 O    -2.440000    0.000000    0.080000 
 11 H    -3.300000    0.000000   -0.300000 
------------------------------------------
Each fragmentation procedure applies only to atoms that have not already been
assigned to a fragment. Consequently, connectivity-based fragmentation will
ignore any bonds to atoms that have been assigned in the input file or by a
previous FragProc.
For example, if the ether oxygen atom in the previous example is manually assigned
to fragment 1 in the input file, this manual assignment takes precedence over
FragProc Connectivity. As a result, four fragments are created: one fragment
containing the manually assigned oxygen atom, two CH3 groups, and one
water molecule.
%frag
  PrintLevel 3
  FragProc Connectivity
end
*xyz 0 1
 O(1)  0.000000     0.000000     0.000000
 C     0.000000     0.000000     1.380000
 C     1.300000     0.000000    -0.460000
 H    -0.500000     0.870000     1.740000
 H    -0.500000    -0.870000     1.740000
 H     1.000000     0.000000     1.740000
 H     1.300000     0.000000    -1.530000
 H     1.800000     0.870000    -0.100000
 H     1.800000    -0.870000    -0.100000
 H    -1.840000     0.000000    -0.650000
 O    -2.440000     0.000000     0.080000
 H    -3.300000     0.000000    -0.300000
*
The output of the automatic fragmentation tool indicates that the first
fragment was assigned by the Orca_Input procedure, while the remaining three
fragments were assigned by the Connectivity procedure.
===================================================
Tfragmentator: Fragmenting by Orca_input
===================================================
------------------------------------------
 Assigned Fragment: 1
------------------------------------------
  Name: User-defined:0 Method: Orca_input
  Natoms: 1 Charge: 0 Mult: 1 InputId: 1
 0 O     0.000000    0.000000    0.000000
------------------------------------------
===================================================
Tfragmentator: Fragmenting by Connectivity
===================================================
------------------------------------------
 Assigned Fragment: 2 
------------------------------------------ 
  Name: Connectivity:1 Method: Connectivity
  Natoms: 4 Charge: 0 Mult: 1 
 1 C     0.000000    0.000000    1.380000 
 3 H    -0.500000    0.870000    1.740000 
 4 H    -0.500000   -0.870000    1.740000 
 5 H     1.000000    0.000000    1.740000 
------------------------------------------
------------------------------------------
 Assigned Fragment: 3 
------------------------------------------
  Name: Connectivity:2 Method: Connectivity 
  Natoms: 4 Charge: 0 Mult: 1 
 2 C     1.300000    0.000000   -0.460000 
 6 H     1.300000    0.000000   -1.530000 
 7 H     1.800000    0.870000   -0.100000 
 8 H     1.800000   -0.870000   -0.100000
------------------------------------------
------------------------------------------
 Assigned Fragment: 4
------------------------------------------
  Name: Connectivity:3 Method: Connectivity
  Natoms: 3 Charge: 0 Mult: 1
 9 H    -1.840000    0.000000   -0.650000
 10 O    -2.440000    0.000000    0.080000
 11 H    -3.300000    0.000000   -0.300000
------------------------------------------
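The two Connectivity examples above can be reproduced with a short Python sketch. It treats two atoms as bonded when their distance is below a scaled sum of covalent radii and then collects connected components among the atoms that are still unassigned. The radii, the scale factor of 1.3, and the function names are assumptions chosen for illustration; the criteria ORCA uses internally may differ.

```python
import math

# Approximate covalent radii in Angstrom (illustrative values, not ORCA's).
RADII = {"H": 0.31, "C": 0.76, "O": 0.66}


def bonded(a, b, scale):
    """Distance test on two (element, x, y, z) tuples."""
    (ea, *pa), (eb, *pb) = a, b
    return math.dist(pa, pb) <= scale * (RADII[ea] + RADII[eb])


def connectivity_fragments(atoms, preassigned=(), scale=1.3):
    """Group unassigned atoms into bonded components.

    Atoms listed in preassigned are skipped, so bonds to them are ignored.
    """
    skip = set(preassigned)
    unvisited = [i for i in range(len(atoms)) if i not in skip]
    fragments = []
    while unvisited:
        stack = [unvisited.pop(0)]
        component = []
        while stack:
            i = stack.pop()
            component.append(i)
            linked = [j for j in unvisited if bonded(atoms[i], atoms[j], scale)]
            for j in linked:
                unvisited.remove(j)
            stack.extend(linked)
        fragments.append(sorted(component))
    return fragments


# Dimethyl ether (atoms 0-8) plus water (atoms 9-11), as in the inputs above.
ATOMS = [
    ("O", 0.00, 0.00, 0.00), ("C", 0.00, 0.00, 1.38), ("C", 1.30, 0.00, -0.46),
    ("H", -0.50, 0.87, 1.74), ("H", -0.50, -0.87, 1.74), ("H", 1.00, 0.00, 1.74),
    ("H", 1.30, 0.00, -1.53), ("H", 1.80, 0.87, -0.10), ("H", 1.80, -0.87, -0.10),
    ("H", -1.84, 0.00, -0.65), ("O", -2.44, 0.00, 0.08), ("H", -3.30, 0.00, -0.30),
]

print(connectivity_fragments(ATOMS))                   # ether and water
print(connectivity_fragments(ATOMS, preassigned=[0]))  # two CH3 groups and water
```

With the ether oxygen marked as pre-assigned, the two methyl groups are no longer linked through it, which is why they end up in separate fragments.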
2.18.2.2. Automatic Fragmentation: Atomic and NotAssigned
FragProc Atomic and FragProc NotAssigned are termination procedures.
FragProc Atomic assigns each previously unassigned atom to its own individual fragment,
while FragProc NotAssigned assigns all remaining unassigned atoms to a single fragment.
Similar to other FragProc methods, the output of the automatic fragmentation tool indicates
that the Atomic or Not_assigned procedure has been used to generate the corresponding fragments.
%frag
  PrintLevel 3
  FragProc Atomic
end
*xyz 0 1
 O(1)  0.000000     0.000000     0.000000
 C     0.000000     0.000000     1.380000
 C     1.300000     0.000000    -0.460000
 H    -0.500000     0.870000     1.740000
 H    -0.500000    -0.870000     1.740000
 H     1.000000     0.000000     1.740000
 H     1.300000     0.000000    -1.530000
 H     1.800000     0.870000    -0.100000
 H     1.800000    -0.870000    -0.100000
 H    -1.840000     0.000000    -0.650000
 O    -2.440000     0.000000     0.080000
 H    -3.300000     0.000000    -0.300000
*
===================================================
Tfragmentator: Fragmenting by Orca_input
===================================================
------------------------------------------
 Assigned Fragment: 1
------------------------------------------
  Name: User-defined:0 Method: Orca_input
  Natoms: 1 Charge: 0 Mult: 1 InputId: 1
 0 O     0.000000    0.000000    0.000000
------------------------------------------
===================================================
Tfragmentator: Fragmenting by Atomic
===================================================
------------------------------------------
 Match: 1, Assigned Fragment: 2
------------------------------------------
  Name: C  Method: Atomic
  Natoms: 1 Charge: 0 Mult: 1
 1 C     0.000000    0.000000    1.380000
------------------------------------------
------------------------------------------
 Match: 2, Assigned Fragment: 3
------------------------------------------
  Name: C  Method: Atomic
  Natoms: 1 Charge: 0 Mult: 1
 2 C     1.300000    0.000000   -0.460000
------------------------------------------
...
------------------------------------------
 Match: 11, Assigned Fragment: 12
------------------------------------------
  Name: H  Method: Atomic
  Natoms: 1 Charge: 0 Mult: 1
 11 H    -3.300000    0.000000   -0.300000
------------------------------------------
%frag
  PrintLevel 3
  FragProc NotAssigned
end
*xyz 0 1
 O(1)  0.000000     0.000000     0.000000
 C     0.000000     0.000000     1.380000
 C     1.300000     0.000000    -0.460000
 H    -0.500000     0.870000     1.740000
 H    -0.500000    -0.870000     1.740000
 H     1.000000     0.000000     1.740000
 H     1.300000     0.000000    -1.530000
 H     1.800000     0.870000    -0.100000
 H     1.800000    -0.870000    -0.100000
 H    -1.840000     0.000000    -0.650000
 O    -2.440000     0.000000     0.080000
 H    -3.300000     0.000000    -0.300000
*
===================================================
Tfragmentator: Fragmenting by Orca_input
===================================================
------------------------------------------
 Assigned Fragment: 1
------------------------------------------ 
  Name: User-defined:0 Method: Orca_input 
  Natoms: 1 Charge: 0 Mult: 1 InputId: 1
 0 O     0.000000    0.000000    0.000000
------------------------------------------
===================================================
Tfragmentator: Setting not assigned atoms to a fragment
===================================================
------------------------------------------
 Assigned Fragment: 1 
------------------------------------------
  Name: Not Assigned Method: Not_assigned
  Natoms: 11 Charge: 0 Mult: 1 
 1 C     0.000000    0.000000    1.380000 
 2 C     1.300000    0.000000   -0.460000
 3 H    -0.500000    0.870000    1.740000
 4 H    -0.500000   -0.870000    1.740000
 5 H     1.000000    0.000000    1.740000 
 6 H     1.300000    0.000000   -1.530000  
 7 H     1.800000    0.870000   -0.100000
 8 H     1.800000   -0.870000   -0.100000
 9 H    -1.840000    0.000000   -0.650000  
 10 O    -2.440000    0.000000    0.080000 
 11 H    -3.300000    0.000000   -0.300000
------------------------------------------
 
Note
FragProc NotAssigned is always applied at the end of all FragProc procedures to ensure that no atoms remain without a fragment assignment.
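The difference between the two termination procedures can be summarized in a few lines of Python. This is only a sketch: the function names are hypothetical, unassigned atoms are represented by None, and numbering new fragments with the next free integer is an assumption made for the illustration.

```python
def atomic(assignment):
    """Give every unassigned atom (None) its own new fragment number."""
    result = list(assignment)
    next_id = max((f for f in result if f is not None), default=0) + 1
    for i, frag in enumerate(result):
        if frag is None:
            result[i] = next_id
            next_id += 1
    return result


def not_assigned(assignment):
    """Collect all unassigned atoms (None) in one new fragment."""
    new_id = max((f for f in assignment if f is not None), default=0) + 1
    return [new_id if frag is None else frag for frag in assignment]


# 12 atoms, with only the first one assigned by hand, as in O(1) above.
start = [1] + [None] * 11
print(atomic(start))        # fragments 1 to 12, one atom each
print(not_assigned(start))  # atom 0 in one fragment, the other 11 atoms in another
```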
2.18.2.3. Automatic Fragmentation: Internal Libraries
ORCA includes a series of internal libraries containing definitions of many common molecular structures. In all cases, structure recognition is performed using a VF2 subgraph isomorphism algorithm applied to molecular graphs constructed from Cartesian coordinates.
The available fragmentation procedures that make use of internal libraries are listed in Table 2.60.
| Fragment detection keyword | Description |
|---|---|
| FunctionalGroups | Contains a list of the most common organic functional groups. |
| Aminoacids | Contains a list of all amino acids, including all common protonation states, but excluding zwitterionic forms. |
| AABackbone | Contains fragment definitions for amino acid backbone detection. |
|  | Performs |
|  | Similar to |
|  | Contains a list of all amino acid side chains. |
| AASCFinegrained | Contains a detailed list of organic functional groups within amino acid side chains. |
|  | Contains fragment definitions for DNA/RNA backbones; the phosphate group is assigned to the 3′ position. |
|  | Contains fragment definitions for DNA/RNA backbones; the phosphate group is assigned to the 5′ position. |
|  | Same as |
|  | Contains a list of all nucleic acids. |
|  | Contains a list of all nucleic acid side chains. |
|  | Contains definitions for common solvents: 1-octanol, n-hexane, cyclohexane, toluene, chlorobenzene, tetrahydrofuran, benzene, N,N-dimethylformamide, pyridine, dimethyl sulfoxide, acetone, ethanol, acetonitrile, methanol, chloroform, carbon tetrachloride, dichloromethane, ammonia, and water. |
| Water | Contains definitions of water molecules, faster than |
Similar to other fragmentation procedures, listing multiple libraries in
FragProc applies the procedures consecutively. Using the input of the first
example in Automatic Fragmentation, and including PrintLevel 3 in the %frag
block to increase verbosity, the output indicates the assignment of fragments
by the Water and Functional_groups procedures.
===================================================
Tfragmentator: Fragmenting by Water
===================================================
------------------------------------------
 Match: 1, Assigned Fragment: 1
------------------------------------------
  Name:  WATER Method: Water
  Natoms: 3 Charge: 0 Mult: 1
 11 H    -0.920000    0.850000   -2.430000
 12 O    -1.690000    0.830000   -3.000000
 13 H    -1.640000    1.630000   -3.510000
------------------------------------------
===================================================
Tfragmentator: Fragmenting by Functional_groups
===================================================
------------------------------------------
 Match: 1, Assigned Fragment: 2
------------------------------------------
  Name:  CH3 Method: Functional_groups
  Natoms: 4 Charge: 0 Mult: 1
 0 C     0.000000    1.270000   -0.260000
 3 H    -0.890000    1.320000   -0.910000
 4 H     0.890000    1.320000   -0.910000
 5 H     0.000000    2.180000    0.370000
------------------------------------------
------------------------------------------
 Match: 2, Assigned Fragment: 3
------------------------------------------
  Name:  CH3 Method: Functional_groups
  Natoms: 4 Charge: 0 Mult: 1
 2 C     0.000000   -1.270000   -0.260000
 8 H     0.890000   -1.320000   -0.910000
 9 H     0.000000   -2.180000    0.370000
 10 H    -0.880000   -1.300000   -0.910000
------------------------------------------
------------------------------------------
 Match: 1, Assigned Fragment: 4
------------------------------------------
  Name:  CH2 Method: Functional_groups
  Natoms: 3 Charge: 0 Mult: 1
 1 C     0.000000   -0.000000    0.580000
 6 H     0.880000   -0.000000    1.250000
 7 H    -0.880000   -0.000000    1.250000
------------------------------------------
2.18.2.4. Automatic Fragmentation: External Libraries
The automated fragmentator also allows users to supply .xyz files containing
geometries that should be recognized as fragments, via the XZYFRAGLIB
variable in the %frag block. The FragProc Extlib procedure automatically
converts each provided geometry into a molecular graph and then applies a
VF2 subgraph isomorphism algorithm, just as is done with the internal
libraries.
The following example uses definitions of CH3O and CH3 fragments
from the file Mylib.xyz to identify and generate fragments within the
dimethyl ether geometry provided below:
%frag
  PrintLevel 3
  FragProc Extlib
  XZYFRAGLIB "Mylib.xyz"
end
*xyz 0 1
 O     0.000000     0.000000     0.000000
 C     0.000000     0.000000     1.380000
 C     1.300000     0.000000    -0.460000
 H    -0.500000     0.870000     1.740000
 H    -0.500000    -0.870000     1.740000
 H     1.000000     0.000000     1.740000
 H     1.300000     0.000000    -1.530000
 H     1.800000     0.870000    -0.100000
 H     1.800000    -0.870000    -0.100000
*
Here, Mylib.xyz contains definitions for methoxy and methyl groups:
    5
CHARGE 0 MULT 1 NAME CH3O
 O     0.000000     0.000000     0.000000
 C     0.000000     0.000000     1.380000
 H     1.008807     0.000000     1.736663
 H    -0.328435     0.953845     1.736667
 H    -0.794950    -0.621083     1.736667
    4
CHARGE 0 MULT 1 NAME CH3
 C     0.000000     0.000000     0.000000
 H     0.000000     0.000000     1.070000
 H     1.008807     0.000000    -0.356663
 H    -0.328435    -0.953845    -0.356667
The result is the assignment of atoms 0, 1, 3, 4, and 5 to a
CH3O fragment, with the remaining atoms assigned to a
CH3 fragment by the Ext_lib procedure.
===================================================
Tfragmentator: Fragmenting by Ext_lib
===================================================
****
**** There are 2 Ref structures found in file Mylib.xyz
****
------------------------------------------
 Match: 1, Assigned Fragment: 1
------------------------------------------
  Name:  CH3O Method: Ext_lib
  Natoms: 5 Charge: 0 Mult: 1
 0 O     0.000000    0.000000    0.000000
 1 C     0.000000    0.000000    1.380000
 3 H    -0.500000    0.870000    1.740000
 4 H    -0.500000   -0.870000    1.740000
 5 H     1.000000    0.000000    1.740000
------------------------------------------
------------------------------------------
 Match: 1, Assigned Fragment: 2
------------------------------------------
  Name:  CH3 Method: Ext_lib
  Natoms: 4 Charge: 0 Mult: 1
 2 C     1.300000    0.000000   -0.460000
 6 H     1.300000    0.000000   -1.530000
 7 H     1.800000    0.870000   -0.100000
 8 H     1.800000   -0.870000   -0.100000
------------------------------------------
Note
XZYFRAGLIB allows the inclusion of up to 10 files. Each file may contain multiple fragment definitions; however, each definition must consist of a single, connected molecule. Unconnected structures within a single definition are not allowed.
The format of Mylib.xyz follows the standard XYZ file structure, but optionally supports three identifiers: CHARGE, MULT, and NAME, with NAME expected to appear last. These identifiers are printed when a fragment is recognized, though they are not currently used in any calculations.
Subsequent modifications to fragments by other methods (see Fusebyatoms and Extend below) do not affect the fragment’s CHARGE or MULT values in the identifiers.
Important
ORCA’s VF2 subgraph isomorphism algorithm is based solely on atomic connectivity; it does not consider stereochemistry or bond orders.
Fragment assignment by FragProc Extlib follows the order of the files listed in XZYFRAGLIB and the order of fragment definitions within each file. Since an atom is excluded from subsequent matching once it has been assigned to a fragment, the order in which fragments are defined in the library is critically important. For example, in the previous case, if CH3 is defined before CH3O in Mylib.xyz, both CH3 groups in the input file will be matched first. This prevents CH3O from being recognized later, even if it is present in the system.
Atom assignment to fragments follows the order in which atoms appear in the geometry. In the previous example, the CH3O fragment could be assigned using either the C(1)-O or C(2)-O bond. However, since C(1) appears first in the geometry, it is selected for the CH3O fragment.
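The effect of library order can be illustrated with a small Python sketch. It does not implement VF2: a plain backtracking search over element labels and bonds stands in for it, and all names (`find_match`, `fragment_with_library`, the hand-written bond list) are hypothetical. Like the procedure described above, it only considers atoms that are still unassigned and tries candidate atoms in geometry order.

```python
def find_match(elements, bonds, p_elems, p_bonds, free):
    """Return the first mapping (pattern index -> atom index), or None."""
    p_adj = {i: set() for i in range(len(p_elems))}
    for a, b in p_bonds:
        p_adj[a].add(b)
        p_adj[b].add(a)

    def grow(mapping):
        k = len(mapping)
        if k == len(p_elems):
            return mapping
        for atom in sorted(free):  # geometry order decides between candidates
            if atom in mapping or elements[atom] != p_elems[k]:
                continue
            if all(mapping[j] in bonds[atom] for j in p_adj[k] if j < k):
                found = grow(mapping + [atom])
                if found is not None:
                    return found
        return None

    return grow([])


def fragment_with_library(elements, bonds, library):
    """Apply library entries in order; matched atoms are no longer available."""
    free = set(range(len(elements)))
    fragments = []
    for name, p_elems, p_bonds in library:
        while True:
            match = find_match(elements, bonds, p_elems, p_bonds, free)
            if match is None:
                break
            fragments.append((name, sorted(match)))
            free -= set(match)
    return fragments, sorted(free)


# Dimethyl ether from the example above; bonds are written out by hand.
ELEMENTS = ["O", "C", "C", "H", "H", "H", "H", "H", "H"]
PAIRS = [(0, 1), (0, 2), (1, 3), (1, 4), (1, 5), (2, 6), (2, 7), (2, 8)]
BONDS = {i: set() for i in range(len(ELEMENTS))}
for a, b in PAIRS:
    BONDS[a].add(b)
    BONDS[b].add(a)

CH3O = ("CH3O", ["O", "C", "H", "H", "H"], [(0, 1), (1, 2), (1, 3), (1, 4)])
CH3 = ("CH3", ["C", "H", "H", "H"], [(0, 1), (0, 2), (0, 3)])

print(fragment_with_library(ELEMENTS, BONDS, [CH3O, CH3]))
print(fragment_with_library(ELEMENTS, BONDS, [CH3, CH3O]))
```

With CH3O listed first, the sketch reproduces the assignment shown in the output above. With CH3 listed first, both methyl groups are consumed and the oxygen atom is left without a match.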
2.18.2.5. Automatic Fragmentation: Extend
Many fragments in the internal libraries represent incomplete molecular structures.
Therefore, in some cases, it is necessary to add additional hydrogen atoms (or, in
certain situations, a hydroxyl group) to complete the system. The FragProc Extend
procedure addresses this by identifying oxygen atoms bonded to carbon atoms in previously
assigned fragments, as well as hydrogen atoms bonded to carbon, nitrogen, or oxygen.
It then extends the corresponding fragments to include these atoms.
Consider the example below, which consists of a zwitterionic glycine molecule fragmented
using an external library that defines a C–C–N fragment. In this case, FragProc Extend
can be applied to add the missing oxygen and hydrogen atoms, thereby completing the
molecular structure.
%frag
  PrintLevel 3
  FragProc Extlib, Extend
  XZYFRAGLIB "Mylib.xyz"
end
*xyz 0 1
 N     0.000000     0.000000     0.000000
 C     0.000000     0.000000     1.460000
 C     1.403962     0.000000     2.015868
 O     1.837767     0.962613     2.627089
 O     2.161436    -0.947142     1.883370
 H     0.423874    -0.764689    -0.525339
 H    -0.538292    -0.884247     1.831954
 H    -0.538292     0.884247     1.831954
 H    -0.970478     0.016940    -0.313504
 H     0.499909     0.831989    -0.313504
*
Where Mylib.xyz in this case is:
    3
Name CCN
 C     0.000000     0.000000     0.000000
 C     0.000000     0.000000     1.500000
 N     1.376503     0.000000    -0.486661
The result is an initial assignment of the C–C–N fragment, followed by an extension of the fragment to include two oxygen atoms and five hydrogen atoms.
------------------------------------------
 Match: 1, Assigned Fragment: 1 
------------------------------------------
  Name:  CCN Method: Ext_lib 
  Natoms: 3 Charge: 0 Mult: 1 
 0 N     0.000000    0.000000    0.000000 
 1 C     0.000000    0.000000    1.460000 
 2 C     1.403962    0.000000    2.015868 
------------------------------------------
===================================================
Tfragmentator: Extending Fragments
===================================================
 Extending Fragment 0 with O  (3)
 Extending Fragment 0 with O  (4)
 Extending Fragment 0 with H  (6)
 Extending Fragment 0 with H  (7)
 Extending Fragment 0 with H  (5)
 Extending Fragment 0 with H  (8)
 Extending Fragment 0 with H  (9) 
Important
FragProc Extend attempts to extend any previously assigned fragment.
FragProc Extend does not modify the Charge and Mult identifiers of fragments.
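The extension rule described in this section can be sketched in Python as two passes over a fragment: first add unassigned oxygen atoms bonded to carbon, then add unassigned hydrogen atoms bonded to carbon, nitrogen, or oxygen. The function name, the two-pass order, and the hand-written bond list for glycine are assumptions made for this illustration, not ORCA internals.

```python
def extend_fragment(fragment, elements, bonds, assigned):
    """Return the fragment grown by O on C, then H on C, N, or O."""
    grown = set(fragment)
    taken = set(assigned) | grown
    # Pass 1: oxygen atoms bonded to a carbon atom of the fragment.
    for i in sorted(grown):
        if elements[i] == "C":
            for j in sorted(bonds[i]):
                if elements[j] == "O" and j not in taken:
                    grown.add(j)
                    taken.add(j)
    # Pass 2: hydrogen atoms bonded to C, N, or O of the grown fragment.
    for i in sorted(grown):
        if elements[i] in ("C", "N", "O"):
            for j in sorted(bonds[i]):
                if elements[j] == "H" and j not in taken:
                    grown.add(j)
                    taken.add(j)
    return sorted(grown)


# Zwitterionic glycine from the example above; bonds are written out by hand.
ELEMENTS = ["N", "C", "C", "O", "O", "H", "H", "H", "H", "H"]
PAIRS = [(0, 1), (1, 2), (2, 3), (2, 4), (0, 5), (1, 6), (1, 7), (0, 8), (0, 9)]
BONDS = {i: set() for i in range(len(ELEMENTS))}
for a, b in PAIRS:
    BONDS[a].add(b)
    BONDS[b].add(a)

ccn = [0, 1, 2]  # atoms matched by the CCN library entry
print(extend_fragment(ccn, ELEMENTS, BONDS, assigned=ccn))  # all ten atoms
```

Running the oxygen pass first means that a hydrogen attached to a newly added oxygen, as in a hydroxyl group, would also be picked up by the second pass.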
2.18.2.6. Automatic Fragmentation: Fusebyatoms
When multiple fragmentation procedures are used in combination, it may be
necessary to merge two previously assigned fragments into one. This can be
achieved using the FragProc Fusebyatoms procedure, which identifies atom
pairs that should belong to the same fragment, as specified in FuseAtomPairs.
In the example below, the objective is to fragment a propane molecule into a
CH3 group and a CH3–CH2 fragment. One way to
achieve this is by first applying FragProc FunctionalGroups, which fragments
the molecule into two CH3 groups and one CH2 group.
The CH2 group can then be merged with one of the CH3 groups.
This is done by specifying the carbon atoms of the CH2 and one
CH3 group in the FuseAtomPairs directive: FuseAtomPairs {0 1} end.
%frag
        printlevel 3
        FragProc FunctionalGroups, Fusebyatoms
        FuseAtomPairs {0 1} end
end
*xyz 0 1
 C     0.000000     1.270000    -0.260000
 C     0.000000    -0.000000     0.580000
 C     0.000000    -1.270000    -0.260000
 H    -0.890000     1.320000    -0.910000
 H     0.890000     1.320000    -0.910000
 H     0.000000     2.180000     0.370000
 H     0.880000    -0.000000     1.250000
 H    -0.880000    -0.000000     1.250000
 H     0.890000    -1.320000    -0.910000
 H     0.000000    -2.180000     0.370000
 H    -0.880000    -1.300000    -0.910000
*
The output from the fragmentator first reports the initial fragmentation
performed by the Functional_groups library, followed by a message
indicating the subsequent fusion of fragments as specified by the
FuseAtomPairs directive.
===================================================
Tfragmentator: Fragmenting by Functional_groups
===================================================
------------------------------------------
 Match: 1, Assigned Fragment: 1
------------------------------------------
  Name:  CH3 Method: Functional_groups
  Natoms: 4 Charge: 0 Mult: 1
 0 C     0.000000    1.270000   -0.260000
 3 H    -0.890000    1.320000   -0.910000
 4 H     0.890000    1.320000   -0.910000
 5 H     0.000000    2.180000    0.370000
------------------------------------------
------------------------------------------
 Match: 2, Assigned Fragment: 2
------------------------------------------
  Name:  CH3 Method: Functional_groups
  Natoms: 4 Charge: 0 Mult: 1
 2 C     0.000000   -1.270000   -0.260000
 8 H     0.890000   -1.320000   -0.910000
 9 H     0.000000   -2.180000    0.370000
 10 H    -0.880000   -1.300000   -0.910000
------------------------------------------
------------------------------------------
 Match: 1, Assigned Fragment: 3
------------------------------------------
  Name:  CH2 Method: Functional_groups
  Natoms: 3 Charge: 0 Mult: 1
 1 C     0.000000   -0.000000    0.580000
 6 H     0.880000   -0.000000    1.250000
 7 H    -0.880000   -0.000000    1.250000
------------------------------------------
===================================================
Tfragmentator: Fusing Fragments
===================================================
Fusing Fragment 0 (atom 0) with Fragment 2 (atom 1)
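The fusion step can be pictured as a merge of the two fragments that contain the listed atoms. The Python sketch below uses the propane fragments from the output above; the function name is hypothetical, and keeping the number of the first atom's fragment without renumbering the others is an assumption made for the illustration.

```python
def fuse_by_atoms(fragments, atom_pairs):
    """Merge the fragments that contain each listed pair of atoms.

    fragments maps a fragment number to its list of atom indices.
    """
    merged = {fid: list(atoms) for fid, atoms in fragments.items()}
    for a, b in atom_pairs:
        fa = next(f for f, atoms in merged.items() if a in atoms)
        fb = next(f for f, atoms in merged.items() if b in atoms)
        if fa != fb:
            merged[fa] = sorted(merged[fa] + merged.pop(fb))
    return merged


# Propane after FunctionalGroups: two CH3 groups and one CH2 group.
propane = {1: [0, 3, 4, 5], 2: [2, 8, 9, 10], 3: [1, 6, 7]}
print(fuse_by_atoms(propane, [(0, 1)]))
# {1: [0, 1, 3, 4, 5, 6, 7], 2: [2, 8, 9, 10]}
```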
2.18.2.7. Automatic Fragmentation: Delete and advanced fragmentation workflows
Almost all fragmentation schemes have an associated delete procedure, which removes the fragments generated by that specific scheme. This enables the construction of advanced fragmentation workflows, where a procedure can temporarily “protect” certain atoms from being fragmented by subsequent methods. These protected fragments can later be deleted, allowing the atoms to be reprocessed using a different fragmentation approach.
Consider the system below, which consists of both a phenylalanine molecule and a benzene molecule. The goal is to fragment the phenylalanine into its backbone, one CH2 group, and the phenyl ring, while fragmenting the benzene into six individual CH fragments.
On the one hand, the desired fragmentation of phenylalanine can be achieved by combining
three fragmentation procedures: FragProc AABackbone, Extend, AASCFinegrained.
The AABackbone and Extend options identify the amino acid backbone, while
AASCFinegrained separates the CH2 group and the phenyl ring into distinct fragments.
Meanwhile, the benzene molecule can be fragmented into six individual CH fragments using
FragProc Extlib, which relies on an external library defining the CH fragment.
However, these two fragmentation schemes are incompatible. When FragProc AASCFinegrained
is applied to the entire system, it fragments not only the phenylalanine side chain but
also breaks the benzene ring into a phenyl group and a hydrogen atom. Conversely, if
FragProc Extlib is applied first to benzene, it may inadvertently fragment the phenylalanine
residue, leading to undesired structural splits.
A viable approach to achieve the desired fragmentation involves first identifying the amino acid
using FragProc Aminoacids. This procedure assigns all atoms that belong to amino acid fragments,
thereby protecting them from reassignment by subsequent fragmentation steps. The benzene molecule
can then be fragmented using an external library via FragProc Extlib. Finally, the phenylalanine
fragment is removed using FragProc DelAminoacids, allowing it to be re-fragmented as needed.
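The protect, fragment, delete, re-fragment logic described above can be illustrated with a small toy model. This is only a sketch, not ORCA's implementation: the function `apply_procedures`, the step tuples, and the rule that a group is assigned only when all of its atoms are still free are assumptions made for illustration, and the fragment numbering here does not reproduce the numbering in the program output.

```python
def apply_procedures(n_atoms, steps):
    """Toy model of chained fragmentation procedures (hypothetical, not ORCA code).

    steps is a list of
      ("assign", method, groups)  -> try to assign each atom group to a new fragment
      ("delete", method)          -> release all atoms assigned by that method
    Returns a list where entry i is (method, fragment_id) or None for atom i.
    """
    owner = [None] * n_atoms
    next_id = 1
    for step in steps:
        kind, method = step[0], step[1]
        if kind == "assign":
            for atoms in step[2]:
                # Assumption: already assigned atoms are protected, so a
                # group is only accepted if every atom in it is still free.
                if all(owner[a] is None for a in atoms):
                    for a in atoms:
                        owner[a] = (method, next_id)
                    next_id += 1
        elif kind == "delete":
            # Release every atom whose fragment was created by this method.
            owner = [None if o is not None and o[0] == method else o
                     for o in owner]
        else:
            raise ValueError(f"unknown step kind: {kind}")
    return owner


# Atoms 0-2 stand in for the amino acid, atoms 3-4 for one CH unit of benzene.
steps = [
    ("assign", "Aminoacids", [[0, 1, 2]]),
    ("assign", "Extlib", [[1, 2], [3, 4]]),   # [1, 2] is blocked by protection
    ("delete", "Aminoacids"),
    ("assign", "AABackbone", [[0]]),
    ("assign", "AASCFinegrained", [[1, 2]]),
]
owner = apply_procedures(5, steps)
```

In this model the library pattern cannot claim atoms 1 and 2 while they are held by the amino acid fragment, which mirrors the role of FragProc Aminoacids as a temporary protection step.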
All these operations are executed by simply concatenating the procedures as follows:
FragProc Aminoacids, Extlib, DelAminoacids, AABackbone, Extend, AASCFinegrained
%frag
  PrintLevel 3
  FragProc Aminoacids, Extlib, DelAminoacids, AABackbone, Extend, AASCFinegrained
  XZYFRAGLIB "Mylib.xyz"
  STOREFRAGS true
end
*xyz 0 1
 O     1.300780     3.093320     1.571260
 C     0.407700     2.826000     0.770390
 O    -0.101890     3.606620    -0.030470
 C    -0.115350     1.396690     0.770390
 N    -1.564350     1.396690     0.770390
 H    -1.900780     1.872730     1.595200
 H     0.242580     0.884560     1.663540
 H    -1.901490     0.444630     0.770390
 H    -1.901016     1.873069    -0.054118
 C     0.436090     0.696200    -0.461750
 H     1.512450     0.863150    -0.502770
 H    -0.018950     1.150700    -1.341790
 C     0.178410    -0.790680    -0.515340
 C    -0.895760    -1.288470    -1.262580
 C    -1.134680    -2.667040    -1.312270
 C    -0.299410    -3.547820    -0.614730
 C     0.774760    -3.050040     0.132500
 C     1.013680    -1.671470     0.182200
 H     1.842330    -1.287460     0.758630
 H     1.419110    -3.729500     0.670600
 H    -0.483720    -4.611290    -0.653070
 H    -1.963330    -3.051050    -1.888700
 H    -1.540110    -0.609010    -1.800680
 C     1.181325    -3.732146    -2.620584
 C     2.016592    -4.612923    -1.923046
 C     3.090772    -4.115131    -1.175824
 C     3.329684    -2.736563    -1.126139
 C     2.494417    -1.855785    -1.823677
 C     1.420237    -2.353577    -2.570900
 H     0.770518    -1.668458    -3.113485
 H     0.422252    -4.288369    -3.129823
 H     1.830753    -5.685253    -1.961693
 H     3.740491    -4.800250    -0.633239
 H     4.165243    -2.349352    -0.544907
 H     2.680256    -0.783455    -1.785030
*
Here, the file “Mylib.xyz” contains:
2
NAME CH
C 0.00 0.00 0.00
H 1.08 0.00 0.00
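A library entry like the one above can also be generated by a short script. The sketch below assumes that an entry consists of exactly the three parts visible in the example (atom count, a NAME line, and one coordinate line per atom); the helper name `format_library_entry` is hypothetical, and any further format rules of the library file are not covered here.

```python
def format_library_entry(name, atoms):
    """Return one fragment-library entry as text.

    name:  label written on the NAME line
    atoms: list of (element, x, y, z) tuples, written as given
    The layout is inferred from the example entry shown above.
    """
    lines = [str(len(atoms)), f"NAME {name}"]
    for element, x, y, z in atoms:
        lines.append(f"{element} {x:.2f} {y:.2f} {z:.2f}")
    return "\n".join(lines) + "\n"


ch_entry = format_library_entry(
    "CH", [("C", 0.0, 0.0, 0.0), ("H", 1.08, 0.0, 0.0)]
)
```

The resulting string can be written to disk with, for example, `pathlib.Path("Mylib.xyz").write_text(ch_entry)`.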
The complete fragmentation sequence follows the same structure as in previous examples.
In this case, the message Deleted 1 fragments is printed after the fragment assignment
performed by Ext_lib, indicating that a fragment previously defined by
FragProc Aminoacids has been successfully removed.
===================================================
Tfragmentator: Fragmenting by Amino_Acids
===================================================
------------------------------------------
 Match: 1, Assigned Fragment: 1
------------------------------------------
  Name:  CPHE Method: Amino_Acids
  Natoms: 21 Charge: -1 Mult: 1
 0 O     1.300780    3.093320    1.571260
 1 C     0.407700    2.826000    0.770390
 2 O    -0.101890    3.606620   -0.030470
 3 C    -0.115350    1.396690    0.770390
 4 N    -1.564350    1.396690    0.770390
 5 H    -1.900780    1.872730    1.595200
 6 H     0.242580    0.884560    1.663540
 9 C     0.436090    0.696200   -0.461750
 10 H     1.512450    0.863150   -0.502770
 11 H    -0.018950    1.150700   -1.341790
 12 C     0.178410   -0.790680   -0.515340
 13 C    -0.895760   -1.288470   -1.262580
 14 C    -1.134680   -2.667040   -1.312270
 15 C    -0.299410   -3.547820   -0.614730
 16 C     0.774760   -3.050040    0.132500
 17 C     1.013680   -1.671470    0.182200
 18 H     1.842330   -1.287460    0.758630
 19 H     1.419110   -3.729500    0.670600
 20 H    -0.483720   -4.611290   -0.653070
 21 H    -1.963330   -3.051050   -1.888700
 22 H    -1.540110   -0.609010   -1.800680
------------------------------------------
===================================================
Tfragmentator: Fragmenting by Ext_lib
===================================================
****
**** There are 1 Ref structures found in file Mylib.xyz
****
------------------------------------------
 Match: 1, Assigned Fragment: 2
------------------------------------------
  Name:  CH Method: Ext_lib
  Natoms: 2 Charge: 0 Mult: 1
 23 C     1.181325   -3.732146   -2.620584
 30 H     0.422252   -4.288369   -3.129823
------------------------------------------
------------------------------------------
 Match: 2, Assigned Fragment: 3
------------------------------------------
  Name:  CH Method: Ext_lib
  Natoms: 2 Charge: 0 Mult: 1
 24 C     2.016592   -4.612923   -1.923046
 31 H     1.830753   -5.685253   -1.961693
------------------------------------------
...
------------------------------------------
 Match: 6, Assigned Fragment: 7
------------------------------------------
  Name:  CH Method: Ext_lib
  Natoms: 2 Charge: 0 Mult: 1
 28 C     1.420237   -2.353577   -2.570900
 29 H     0.770518   -1.668458   -3.113485
------------------------------------------
===================================================
Tfragmentator: Deleting Fragments (Amino_Acids)
===================================================
 Deleted 1 fragments
===================================================
Tfragmentator: Fragmenting by Backbone
===================================================
------------------------------------------
 Match: 1, Assigned Fragment: 7
------------------------------------------
  Name: CO-NH3-CH Method: AA_Backbone
  Natoms: 8 Charge: 0 Mult: 0
 0 O     1.300780    3.093320    1.571260
 1 C     0.407700    2.826000    0.770390
 3 C    -0.115350    1.396690    0.770390
 4 N    -1.564350    1.396690    0.770390
 5 H    -1.900780    1.872730    1.595200
 6 H     0.242580    0.884560    1.663540
 7 H    -1.901490    0.444630    0.770390
 8 H    -1.901016    1.873069   -0.054118
------------------------------------------
===================================================
Tfragmentator: Extending Fragments
===================================================
 Extending Fragment 6 with O  (2)
===================================================
Tfragmentator: Fragmenting by AA_SideChains_FG
===================================================
------------------------------------------
 Match: 1, Assigned Fragment: 8
------------------------------------------
  Name: Ph Method: AA_SideChains_FG
  Natoms: 11 Charge: 0 Mult: 1
 12 C     0.178410   -0.790680   -0.515340
 13 C    -0.895760   -1.288470   -1.262580
 14 C    -1.134680   -2.667040   -1.312270
 15 C    -0.299410   -3.547820   -0.614730
 16 C     0.774760   -3.050040    0.132500
 17 C     1.013680   -1.671470    0.182200
 18 H     1.842330   -1.287460    0.758630
 19 H     1.419110   -3.729500    0.670600
 20 H    -0.483720   -4.611290   -0.653070
 21 H    -1.963330   -3.051050   -1.888700
 22 H    -1.540110   -0.609010   -1.800680
------------------------------------------
------------------------------------------
 Match: 1, Assigned Fragment: 9
------------------------------------------
  Name: CH2 Method: AA_SideChains_FG
  Natoms: 3 Charge: 0 Mult: 1
 9 C     0.436090    0.696200   -0.461750
 10 H     1.512450    0.863150   -0.502770
 11 H    -0.018950    1.150700   -1.341790
------------------------------------------
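For longer workflows it can be convenient to collect the fragment assignments from the printed output programmatically. The following is a minimal sketch that relies only on the line layout visible in the output above ("Match: ..., Assigned Fragment: ...", "Name: ... Method: ...", and the atom lines); the function name `parse_fragment_matches` and the returned dictionary keys are hypothetical, and the layout may differ between program versions or print levels.

```python
import re

MATCH_RE = re.compile(r"Match:\s*(\d+),\s*Assigned Fragment:\s*(\d+)")
NAME_RE = re.compile(r"Name:\s*(\S+)\s+Method:\s*(\S+)")
ATOM_RE = re.compile(
    r"^\s*(\d+)\s+([A-Z][a-z]?)\s+-?\d+\.\d+\s+-?\d+\.\d+\s+-?\d+\.\d+\s*$"
)


def parse_fragment_matches(log_text):
    """Collect one record per 'Match' block found in the output text."""
    records = []
    current = None
    for line in log_text.splitlines():
        m = MATCH_RE.search(line)
        if m:
            current = {"match": int(m.group(1)),
                       "fragment": int(m.group(2)),
                       "name": None, "method": None, "atoms": []}
            records.append(current)
            continue
        if current is None:
            continue
        if line.startswith("==="):
            # A new procedure banner ends the current match block.
            current = None
            continue
        m = NAME_RE.search(line)
        if m:
            current["name"], current["method"] = m.group(1), m.group(2)
            continue
        m = ATOM_RE.match(line)
        if m:
            # Store the atom index and element symbol as printed.
            current["atoms"].append((int(m.group(1)), m.group(2)))
    return records
```

Each record keeps the fragment number exactly as printed, so later steps such as extension or deletion, which are reported on separate lines, are not reflected in the result.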
Table 2.61 lists the corresponding delete procedure associated with each fragmentation method.
| Fragment detection keyword | Fragment deletion keyword |
|---|---|
  | 
  | 
  | 
  | 
  | 
  | 
  | 
  | 
  | 
|
  | 
  | 
  | 
  | 
  | 
  | 
  | 
  | 
  | 
  | 
  | 
  | 
  | 
  | 
  | 
  | 
  | 
  | 
  | 
  | 
  | 
  | 
  | 
  | 
  | 
  | 
  | 
|
  | 
2.18.3. Options available in the %frag input block¶
Table 2.62 lists the options available in the %frag input block.
| Option | Type | Default | Description |
|---|---|---|---|
|  | Integer |  | Verbose output control for automated fragmentation. |
|  | Boolean |  | Stores assigned fragments in a |
|  | Boolean |  | Automatically detects bonds between fragments for CoVaLED analysis. |
|  | String |  | Filenames used in |
|  | See Table 2.60 and Table 2.61 |  | Fragmentation procedures to be applied automatically. |
|  | Boolean |  | Generate main geometry graph based on .prms file. |
|  | String |  | Topology file name to be used when |
|  | Boolean |  | Writes a |