2.18. Fragment Specification
Atoms in a calculation can be grouped into fragments, which serve several purposes. Fragment definitions can be used to assign Basis Sets and ECPs, to organize output in the population analysis section, and to enable features such as constrained fragment optimization and Rigid Body Optimization. They are also used in local energy decomposition and multi-level calculations.
2.18.1. Fragments defined on Input File
There are three ways to assign atoms to fragments using the input file.
The first method is to assign a specific atom to a specific fragment by placing (n) directly after the atomic symbol in the coordinates section.
*xyz -2 2
Cu(1) 0.00 0.00 0.00
Cl(2) 2.25 0.00 0.00
Cl(2) -2.25 0.00 0.00
Cl(2) 0.00 2.25 0.00
Cl(2) 0.00 -2.25 0.00
*
In this example the fragment feature is used to divide the molecule into a “metal” and a “ligand” fragment and consequently the program will print the metal and ligand charges and populations.
----------------------------------------------
CARTESIAN COORDINATES OF FRAGMENTS (ANGSTROEM)
----------------------------------------------
FRAGMENT 1
Cu 0.000000 0.000000 0.000000
FRAGMENT 2
Cl 2.250000 0.000000 0.000000
Cl -2.250000 0.000000 0.000000
Cl 0.000000 2.250000 0.000000
Cl 0.000000 -2.250000 0.000000
...
----------------------------------------------
MULLIKEN FRAGMENT CHARGES AND SPIN POPULATIONS
----------------------------------------------
Fragment 0 : 0.752589 0.842580
Fragment 1 : -2.752589 0.157420
Sum of fragment charges : -2.0000000
Sum of fragment spin populations: 1.0000000
...
--------------------------------------------
LOEWDIN FRAGMENT CHARGES AND SPIN POPULATONS
--------------------------------------------
Fragment 0 : 0.222028 0.851552
Fragment 1 : -2.222028 0.148448
Alternatively, the %coords block can be used for fragment definitions in the same way, by placing (n) directly after the atomic symbol.
%coords
CTyp xyz # the type of coordinates xyz or internal
Charge -2 # the total charge of the molecule
Mult 2 # the multiplicity = 2S+1
coords
Cu(1) 0.00 0.00 0.00
Cl(2) 2.25 0.00 0.00
Cl(2) -2.25 0.00 0.00
Cl(2) 0.00 2.25 0.00
Cl(2) 0.00 -2.25 0.00
end
end
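The (n) suffix convention used in both forms above can be illustrated with a short Python sketch. This is not ORCA's parser: the helper name `parse_fragment_labels` and the regular expression are assumptions made for illustration, and atoms without a suffix are simply mapped to 0.

```python
import re

# Hypothetical helper (not part of ORCA): an element symbol followed by an
# optional "(n)" fragment suffix, e.g. "Cu(1)" or "Cl".
_LABEL = re.compile(r"^([A-Za-z]{1,2})(?:\((\d+)\))?$")

def parse_fragment_labels(coord_lines):
    """Return one (symbol, fragment_number) tuple per coordinate line.

    Atoms without a "(n)" suffix get fragment 0 (treated as unassigned).
    """
    atoms = []
    for line in coord_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        match = _LABEL.match(fields[0])
        if match is None:
            raise ValueError(f"unrecognised atom label: {fields[0]!r}")
        symbol, frag = match.group(1), match.group(2)
        atoms.append((symbol, int(frag) if frag else 0))
    return atoms

lines = [
    "Cu(1)  0.00  0.00 0.00",
    "Cl(2)  2.25  0.00 0.00",
    "Cl(2) -2.25  0.00 0.00",
    "Cl(2)  0.00  2.25 0.00",
    "Cl(2)  0.00 -2.25 0.00",
]
labels = parse_fragment_labels(lines)
print(labels)
```

A script like this can be handy for checking which atoms ended up in which fragment before submitting a job.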
Important
In cases where all atoms are explicitly assigned to fragments, the fragment numbering must start at 1 and use consecutive integers. Non-consecutive or incorrect numbering may lead to errors.
If any atom is left unassigned (or explicitly assigned to fragment 0), it will be automatically assigned to a fragment using the fragmentation procedure described in the Automatic Fragmentation section. In such cases, where only a subset of atoms is manually assigned to fragments, it is not necessary to use consecutive fragment numbers. However, the highest fragment number specified in the input must be less than the total number of fragments generated by the combination of manual and automatic procedures. If this condition is not met, ORCA will automatically reorder all fragment numbers in ascending order, starting from 1.
Finally, a third way to define fragments consists of using a Definition inside the %frag block. In this scheme, the fragment number comes first, followed by a list of atoms (enumerated starting from 0) enclosed in curly brackets {} and finishing with end. Consecutive atoms can also be specified using the notation initial_atom:final_atom.
*xyz -2 2
Cu 0.00 0.00 0.00
Cl 2.25 0.00 0.00
Cl -2.25 0.00 0.00
Cl 0.00 2.25 0.00
Cl 0.00 -2.25 0.00
*
%frag
Definition
1 {0} end # atom 0 for fragment 1
2 {1:4} end # atoms 1 to 4 for fragment 2
end
end
*xyz -2 2
Cl 2.25 0.00 0.00
Cl -2.25 0.00 0.00
Cu 0.00 0.00 0.00
Cl 0.00 2.25 0.00
Cl 0.00 -2.25 0.00
*
%frag
Definition
1 {2} end # atom 2 for fragment 1
2 {0:1 3:4} end # atoms 0, 1, 3, and 4 for fragment 2
end
end
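To make the initial_atom:final_atom notation concrete, the following Python sketch expands an atom list into explicit zero-based indices. The function name `expand_atom_list` is hypothetical, and the parsing rules (whitespace-separated entries, inclusive ranges) are inferred from the two examples above rather than taken from ORCA itself.

```python
def expand_atom_list(spec):
    """Expand a Definition-style atom list such as "{0:1 3:4}" into indices.

    Assumes whitespace-separated entries, each either a single index ("2")
    or an inclusive range ("1:4").
    """
    body = spec.strip()
    if body.startswith("{") and body.endswith("}"):
        body = body[1:-1]
    atoms = []
    for token in body.split():
        if ":" in token:
            first, last = (int(part) for part in token.split(":"))
            atoms.extend(range(first, last + 1))  # final_atom is included
        else:
            atoms.append(int(token))
    return atoms

print(expand_atom_list("{1:4}"))      # [1, 2, 3, 4]
print(expand_atom_list("{0:1 3:4}"))  # [0, 1, 3, 4]
```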
Note
With the last option (Definition), the %frag block has to be written after the coordinate section.
%frag Definition also works with coordinates that are defined via an external file.
2.18.2. Automatic Fragmentation
Starting with ORCA 6.1, a set of automatic fragmentation algorithms has been introduced to recognize and group atoms into fragments automatically.
The automatic fragmentation procedure is triggered by including a %frag block in the input file or when only a subset of atoms has been manually assigned to fragments.
Automatic fragmentation is performed using a set of procedures that can be selected by the user within the %frag block. Each procedure attempts to identify new fragments among atoms that have not yet been assigned. For example, consider the case below, where the first 11 atoms belong to a propane molecule and the last three belong to a water molecule. The Water procedure in FragProc identifies the last three atoms as a water molecule and assigns them to the first fragment, while the FunctionalGroups procedure detects and assigns the CH3 and CH2 groups of propane as fragments 2 to 4.
%frag
FragProc Water, FunctionalGroups
end
* xyz 0 1
C 0.000000 1.270000 -0.260000
C 0.000000 -0.000000 0.580000
C 0.000000 -1.270000 -0.260000
H -0.890000 1.320000 -0.910000
H 0.890000 1.320000 -0.910000
H 0.000000 2.180000 0.370000
H 0.880000 -0.000000 1.250000
H -0.880000 -0.000000 1.250000
H 0.890000 -1.320000 -0.910000
H 0.000000 -2.180000 0.370000
H -0.880000 -1.300000 -0.910000
H -0.920000 0.850000 -2.430000
O -1.690000 0.830000 -3.000000
H -1.640000 1.630000 -3.510000
*
----------------------------------------------
CARTESIAN COORDINATES OF FRAGMENTS (ANGSTROEM)
----------------------------------------------
FRAGMENT 1
H -0.920000 0.850000 -2.430000
O -1.690000 0.830000 -3.000000
H -1.640000 1.630000 -3.510000
FRAGMENT 2
C 0.000000 1.270000 -0.260000
H -0.890000 1.320000 -0.910000
H 0.890000 1.320000 -0.910000
H 0.000000 2.180000 0.370000
FRAGMENT 3
C 0.000000 -1.270000 -0.260000
H 0.890000 -1.320000 -0.910000
H 0.000000 -2.180000 0.370000
H -0.880000 -1.300000 -0.910000
FRAGMENT 4
C 0.000000 -0.000000 0.580000
H 0.880000 -0.000000 1.250000
H -0.880000 -0.000000 1.250000
This automatic fragmentation yields the same result as the manual fragment definition shown below, without the need to inspect the geometry and assign fragments manually.
%frag
Definition
1 { 11:13} end # water
2 { 0 3:5} end # CH3
3 { 2 8:10} end # CH3
4 { 1 6:7} end # CH2
end
end
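If a fragment assignment is already available from another tool, a manual Definition block of this form can be generated by a script. The sketch below is a hypothetical helper, not an ORCA utility: `compress_indices` and `definition_block` are illustrative names, and the output layout just mimics the block shown above.

```python
def compress_indices(indices):
    """Turn atom indices into "a:b" ranges and single indices."""
    ordered = sorted(set(indices))
    if not ordered:
        raise ValueError("a fragment needs at least one atom")
    tokens = []
    start = prev = ordered[0]
    for idx in ordered[1:]:
        if idx == prev + 1:
            prev = idx  # still inside a run of consecutive atoms
            continue
        tokens.append(f"{start}:{prev}" if prev > start else str(start))
        start = prev = idx
    tokens.append(f"{start}:{prev}" if prev > start else str(start))
    return " ".join(tokens)

def definition_block(fragments):
    """Format {fragment_number: [atom indices]} as a %frag Definition block."""
    lines = ["%frag", "  Definition"]
    for number in sorted(fragments):
        atoms = compress_indices(fragments[number])
        lines.append(f"    {number} {{{atoms}}} end")
    lines += ["  end", "end"]
    return "\n".join(lines)

# Fragment map taken from the manual definition above.
fragments = {1: [11, 12, 13], 2: [0, 3, 4, 5], 3: [2, 8, 9, 10], 4: [1, 6, 7]}
block = definition_block(fragments)
print(block)
```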
Note
Any fragment defined in the input file takes precedence over automatic assignments.
ORCA supports up to 10 procedures in FragProc, with the full list provided in Table 2.60 and Table 2.61.
Constrained Fragments do not automatically enable Automatic Fragmentation when there are atoms unassigned to fragments. However, automatic fragmentation can be activated by including a %frag block in the input file.
ORCA provides over 30 FragProc methods, which can be combined in a list to generate fragments for various purposes.
Below are explanations and examples of the different procedures.
2.18.2.1. Automatic Fragmentation: Connectivity
FragProc Connectivity groups atoms that are connected by chemical bonds, estimated based on atomic radii. In the example below, the first nine atoms belong to a dimethyl ether molecule, which is automatically detected and assigned to one fragment, while the last three atoms, which belong to a water molecule, are assigned to a second fragment.
%frag
PrintLevel 3
FragProc Connectivity
end
*xyz 0 1
O 0.000000 0.000000 0.000000
C 0.000000 0.000000 1.380000
C 1.300000 0.000000 -0.460000
H -0.500000 0.870000 1.740000
H -0.500000 -0.870000 1.740000
H 1.000000 0.000000 1.740000
H 1.300000 0.000000 -1.530000
H 1.800000 0.870000 -0.100000
H 1.800000 -0.870000 -0.100000
H -1.840000 0.000000 -0.650000
O -2.440000 0.000000 0.080000
H -3.300000 0.000000 -0.300000
*
In this case, the output of the automatic fragmentation tool indicates that two fragments were assigned by the Connectivity procedure.
------------------------------------------
Assigned Fragment: 1
------------------------------------------
Name: Connectivity:0 Method: Connectivity
Natoms: 9 Charge: 0 Mult: 1
0 O 0.000000 0.000000 0.000000
1 C 0.000000 0.000000 1.380000
2 C 1.300000 0.000000 -0.460000
3 H -0.500000 0.870000 1.740000
4 H -0.500000 -0.870000 1.740000
5 H 1.000000 0.000000 1.740000
6 H 1.300000 0.000000 -1.530000
7 H 1.800000 0.870000 -0.100000
8 H 1.800000 -0.870000 -0.100000
------------------------------------------
------------------------------------------
Assigned Fragment: 2
------------------------------------------
Name: Connectivity:1 Method: Connectivity
Natoms: 3 Charge: 0 Mult: 1
9 H -1.840000 0.000000 -0.650000
10 O -2.440000 0.000000 0.080000
11 H -3.300000 0.000000 -0.300000
------------------------------------------
Each fragmentation procedure applies only to atoms that have not already been assigned to a fragment. Consequently, connectivity-based fragmentation will ignore any bonds to atoms that have been assigned in the input file or by a previous FragProc.
For example, if the oxygen atom in the previous example is manually assigned to fragment 1 in the input file, this manual assignment takes precedence over FragProc Connectivity. As a result, four fragments are created: one fragment containing the manually assigned oxygen atom, two CH3 groups, and one water molecule.
%frag
PrintLevel 3
FragProc Connectivity
end
*xyz 0 1
O(1) 0.000000 0.000000 0.000000
C 0.000000 0.000000 1.380000
C 1.300000 0.000000 -0.460000
H -0.500000 0.870000 1.740000
H -0.500000 -0.870000 1.740000
H 1.000000 0.000000 1.740000
H 1.300000 0.000000 -1.530000
H 1.800000 0.870000 -0.100000
H 1.800000 -0.870000 -0.100000
H -1.840000 0.000000 -0.650000
O -2.440000 0.000000 0.080000
H -3.300000 0.000000 -0.300000
*
The output of the automatic fragmentation tool indicates that the first fragment was assigned by the Orca_input procedure, while the remaining three fragments were assigned by the Connectivity procedure.
===================================================
Tfragmentator: Fragmenting by Orca_input
===================================================
------------------------------------------
Assigned Fragment: 1
------------------------------------------
Name: User-defined:0 Method: Orca_input
Natoms: 1 Charge: 0 Mult: 1 InputId: 1
0 O 0.000000 0.000000 0.000000
------------------------------------------
===================================================
Tfragmentator: Fragmenting by Connectivity
===================================================
------------------------------------------
Assigned Fragment: 2
------------------------------------------
Name: Connectivity:1 Method: Connectivity
Natoms: 4 Charge: 0 Mult: 1
1 C 0.000000 0.000000 1.380000
3 H -0.500000 0.870000 1.740000
4 H -0.500000 -0.870000 1.740000
5 H 1.000000 0.000000 1.740000
------------------------------------------
------------------------------------------
Assigned Fragment: 3
------------------------------------------
Name: Connectivity:2 Method: Connectivity
Natoms: 4 Charge: 0 Mult: 1
2 C 1.300000 0.000000 -0.460000
6 H 1.300000 0.000000 -1.530000
7 H 1.800000 0.870000 -0.100000
8 H 1.800000 -0.870000 -0.100000
------------------------------------------
------------------------------------------
Assigned Fragment: 4
------------------------------------------
Name: Connectivity:3 Method: Connectivity
Natoms: 3 Charge: 0 Mult: 1
9 H -1.840000 0.000000 -0.650000
10 O -2.440000 0.000000 0.080000
11 H -3.300000 0.000000 -0.300000
------------------------------------------
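The idea behind connectivity-based grouping, and the effect of pre-assigned atoms on it, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the covalent radii and the tolerance factor are approximate values chosen here, not the ones ORCA uses, and `connectivity_fragments` is a hypothetical name.

```python
import math

# Assumed covalent radii in angstrom and an assumed tolerance factor;
# ORCA's actual radii and bond criterion are not documented in this section.
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}
TOLERANCE = 1.3

def bonded(a, b):
    """Two atoms count as bonded if closer than the scaled sum of radii."""
    sym_a, *xyz_a = a
    sym_b, *xyz_b = b
    cutoff = TOLERANCE * (COVALENT_RADII[sym_a] + COVALENT_RADII[sym_b])
    return math.dist(xyz_a, xyz_b) < cutoff

def connectivity_fragments(atoms, preassigned=()):
    """Group atoms that are not yet assigned into connected components.

    Bonds to atoms listed in `preassigned` are ignored, mirroring the rule
    that a procedure only sees atoms without a fragment.
    """
    taken = set(preassigned)
    free = [i for i in range(len(atoms)) if i not in taken]
    seen, fragments = set(), []
    for start in free:
        if start in seen:
            continue
        seen.add(start)
        component, stack = [], [start]
        while stack:  # depth-first walk over the bond graph
            i = stack.pop()
            component.append(i)
            for j in free:
                if j not in seen and bonded(atoms[i], atoms[j]):
                    seen.add(j)
                    stack.append(j)
        fragments.append(sorted(component))
    return fragments

atoms = [
    ("O", 0.00, 0.00, 0.00), ("C", 0.00, 0.00, 1.38), ("C", 1.30, 0.00, -0.46),
    ("H", -0.50, 0.87, 1.74), ("H", -0.50, -0.87, 1.74), ("H", 1.00, 0.00, 1.74),
    ("H", 1.30, 0.00, -1.53), ("H", 1.80, 0.87, -0.10), ("H", 1.80, -0.87, -0.10),
    ("H", -1.84, 0.00, -0.65), ("O", -2.44, 0.00, 0.08), ("H", -3.30, 0.00, -0.30),
]
whole = connectivity_fragments(atoms)
split = connectivity_fragments(atoms, preassigned=[0])
print(whole)
print(split)
```

With these assumed radii, the first call groups the ether and the water separately, and the second call, with the ether oxygen already taken, separates the two methyl groups as in the output above.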
2.18.2.2. Automatic Fragmentation: Atomic and NotAssigned
FragProc Atomic and FragProc NotAssigned are termination procedures. FragProc Atomic assigns each previously unassigned atom to its own individual fragment, while FragProc NotAssigned assigns all remaining unassigned atoms to a single fragment. Similar to other FragProc methods, the output of the automatic fragmentation tool indicates that the Atomic or Not_assigned procedure has been used to generate the corresponding fragments.
%frag
PrintLevel 3
FragProc Atomic
end
*xyz 0 1
O(1) 0.000000 0.000000 0.000000
C 0.000000 0.000000 1.380000
C 1.300000 0.000000 -0.460000
H -0.500000 0.870000 1.740000
H -0.500000 -0.870000 1.740000
H 1.000000 0.000000 1.740000
H 1.300000 0.000000 -1.530000
H 1.800000 0.870000 -0.100000
H 1.800000 -0.870000 -0.100000
H -1.840000 0.000000 -0.650000
O -2.440000 0.000000 0.080000
H -3.300000 0.000000 -0.300000
*
===================================================
Tfragmentator: Fragmenting by Orca_input
===================================================
------------------------------------------
Assigned Fragment: 1
------------------------------------------
Name: User-defined:0 Method: Orca_input
Natoms: 1 Charge: 0 Mult: 1 InputId: 1
0 O 0.000000 0.000000 0.000000
------------------------------------------
===================================================
Tfragmentator: Fragmenting by Atomic
===================================================
------------------------------------------
Match: 1, Assigned Fragment: 2
------------------------------------------
Name: C Method: Atomic
Natoms: 1 Charge: 0 Mult: 1
1 C 0.000000 0.000000 1.380000
------------------------------------------
------------------------------------------
Match: 2, Assigned Fragment: 3
------------------------------------------
Name: C Method: Atomic
Natoms: 1 Charge: 0 Mult: 1
2 C 1.300000 0.000000 -0.460000
------------------------------------------
...
------------------------------------------
Match: 11, Assigned Fragment: 12
------------------------------------------
Name: H Method: Atomic
Natoms: 1 Charge: 0 Mult: 1
11 H -3.300000 0.000000 -0.300000
------------------------------------------
%frag
PrintLevel 3
FragProc NotAssigned
end
*xyz 0 1
O(1) 0.000000 0.000000 0.000000
C 0.000000 0.000000 1.380000
C 1.300000 0.000000 -0.460000
H -0.500000 0.870000 1.740000
H -0.500000 -0.870000 1.740000
H 1.000000 0.000000 1.740000
H 1.300000 0.000000 -1.530000
H 1.800000 0.870000 -0.100000
H 1.800000 -0.870000 -0.100000
H -1.840000 0.000000 -0.650000
O -2.440000 0.000000 0.080000
H -3.300000 0.000000 -0.300000
*
===================================================
Tfragmentator: Fragmenting by Orca_input
===================================================
------------------------------------------
Assigned Fragment: 1
------------------------------------------
Name: User-defined:0 Method: Orca_input
Natoms: 1 Charge: 0 Mult: 1 InputId: 1
0 O 0.000000 0.000000 0.000000
------------------------------------------
===================================================
Tfragmentator: Setting not assigned atoms to a fragment
===================================================
------------------------------------------
Assigned Fragment: 1
------------------------------------------
Name: Not Assigned Method: Not_assigned
Natoms: 11 Charge: 0 Mult: 1
1 C 0.000000 0.000000 1.380000
2 C 1.300000 0.000000 -0.460000
3 H -0.500000 0.870000 1.740000
4 H -0.500000 -0.870000 1.740000
5 H 1.000000 0.000000 1.740000
6 H 1.300000 0.000000 -1.530000
7 H 1.800000 0.870000 -0.100000
8 H 1.800000 -0.870000 -0.100000
9 H -1.840000 0.000000 -0.650000
10 O -2.440000 0.000000 0.080000
11 H -3.300000 0.000000 -0.300000
------------------------------------------
Note
FragProc NotAssigned is always applied at the end of all FragProc procedures to ensure that no atoms remain without a fragment assignment.
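The difference between the two termination procedures can be summarized with a minimal Python sketch. The function names `atomic` and `not_assigned` are illustrative only, and the sketch returns plain atom groups without attempting to reproduce ORCA's fragment numbering.

```python
def atomic(unassigned):
    """One group per remaining atom."""
    return [[i] for i in sorted(unassigned)]

def not_assigned(unassigned):
    """All remaining atoms in a single group (no group if none remain)."""
    remaining = sorted(unassigned)
    return [remaining] if remaining else []

natoms = 12
assigned = {0}  # the oxygen labelled O(1) in the inputs above
unassigned = set(range(natoms)) - assigned

print(len(atomic(unassigned)))   # 11 single-atom groups
print(not_assigned(unassigned))  # one group holding atoms 1 to 11
```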
2.18.2.3. Automatic Fragmentation: Internal Libraries
ORCA includes a series of internal libraries containing definitions of many common molecular structures. In all cases, structure recognition is performed using a VF2 subgraph isomorphism algorithm applied to molecular graphs constructed from Cartesian coordinates.
The available fragmentation procedures that make use of internal libraries are listed in Table 2.60.
| Fragment detection keyword | Description |
|---|---|
| | Contains a list of the most common organic functional groups. |
| | Contains a list of all amino acids, including all common protonation states, but excluding zwitterionic forms. |
| | Contains fragment definitions for amino acid backbone detection. |
| | Performs |
| | Similar to |
| | Contains a list of all amino acid side chains. |
| | Contains a detailed list of organic functional groups within amino acid side chains. |
| | Contains fragment definitions for DNA/RNA backbones; the phosphate group is assigned to the 3′ position. |
| | Contains fragment definitions for DNA/RNA backbones; the phosphate group is assigned to the 5′ position. |
| | Same as |
| | Contains a list of all nucleic acids. |
| | Contains a list of all nucleic acid side chains. |
| | Contains definitions for common solvents: 1-octanol, n-hexane, cyclohexane, toluene, chlorobenzene, tetrahydrofuran, benzene, N,N-dimethylformamide, pyridine, dimethyl sulfoxide, acetone, ethanol, acetonitrile, methanol, chloroform, carbon tetrachloride, dichloromethane, ammonia, and water. |
| | Contains definitions of water molecules, faster than |
Similar to other fragmentation procedures, listing multiple libraries in FragProc will apply the procedures consecutively. Using the first example listed in Automatic Fragmentation (input), and including PrintLevel 3 in the %frag block to increase verbosity, the output will indicate the assignment of fragments by the Water and Functional_groups procedures.
===================================================
Tfragmentator: Fragmenting by Water
===================================================
------------------------------------------
Match: 1, Assigned Fragment: 1
------------------------------------------
Name: WATER Method: Water
Natoms: 3 Charge: 0 Mult: 1
11 H -0.920000 0.850000 -2.430000
12 O -1.690000 0.830000 -3.000000
13 H -1.640000 1.630000 -3.510000
------------------------------------------
===================================================
Tfragmentator: Fragmenting by Functional_groups
===================================================
------------------------------------------
Match: 1, Assigned Fragment: 2
------------------------------------------
Name: CH3 Method: Functional_groups
Natoms: 4 Charge: 0 Mult: 1
0 C 0.000000 1.270000 -0.260000
3 H -0.890000 1.320000 -0.910000
4 H 0.890000 1.320000 -0.910000
5 H 0.000000 2.180000 0.370000
------------------------------------------
------------------------------------------
Match: 2, Assigned Fragment: 3
------------------------------------------
Name: CH3 Method: Functional_groups
Natoms: 4 Charge: 0 Mult: 1
2 C 0.000000 -1.270000 -0.260000
8 H 0.890000 -1.320000 -0.910000
9 H 0.000000 -2.180000 0.370000
10 H -0.880000 -1.300000 -0.910000
------------------------------------------
------------------------------------------
Match: 1, Assigned Fragment: 4
------------------------------------------
Name: CH2 Method: Functional_groups
Natoms: 3 Charge: 0 Mult: 1
1 C 0.000000 -0.000000 0.580000
6 H 0.880000 -0.000000 1.250000
7 H -0.880000 -0.000000 1.250000
------------------------------------------
2.18.2.4. Automatic Fragmentation: External Libraries
The automated fragmentator also allows users to supply .xyz files via the XZYFRAGLIB variable in the %frag block, containing geometries that should be recognized as fragments. The FragProc Extlib procedure automatically converts each provided geometry into a molecular graph and then applies a VF2 subgraph isomorphism algorithm, just as is done with the internal libraries.
The following example uses definitions of CH3O and CH3 fragments from the file Mylib.xyz to identify and generate fragments within the dimethyl ether geometry provided below:
%frag
PrintLevel 3
FragProc Extlib
XZYFRAGLIB "Mylib.xyz"
end
*xyz 0 1
O 0.000000 0.000000 0.000000
C 0.000000 0.000000 1.380000
C 1.300000 0.000000 -0.460000
H -0.500000 0.870000 1.740000
H -0.500000 -0.870000 1.740000
H 1.000000 0.000000 1.740000
H 1.300000 0.000000 -1.530000
H 1.800000 0.870000 -0.100000
H 1.800000 -0.870000 -0.100000
*
Where Mylib.xyz contains definitions for methyl and methoxy groups:
5
CHARGE 0 MULT 1 NAME CH3O
O 0.000000 0.000000 0.000000
C 0.000000 0.000000 1.380000
H 1.008807 0.000000 1.736663
H -0.328435 0.953845 1.736667
H -0.794950 -0.621083 1.736667
4
CHARGE 0 MULT 1 NAME CH3
C 0.000000 0.000000 0.000000
H 0.000000 0.000000 1.070000
H 1.008807 0.000000 -0.356663
H -0.328435 -0.953845 -0.356667
The result is the assignment of atoms 0, 1, 3, 4, and 5 to a CH3O fragment, with the remaining atoms assigned to a CH3 fragment by the Ext_lib procedure.
===================================================
Tfragmentator: Fragmenting by Ext_lib
===================================================
****
**** There are 2 Ref structures found in file Mylib.xyz
****
------------------------------------------
Match: 1, Assigned Fragment: 1
------------------------------------------
Name: CH3O Method: Ext_lib
Natoms: 5 Charge: 0 Mult: 1
0 O 0.000000 0.000000 0.000000
1 C 0.000000 0.000000 1.380000
3 H -0.500000 0.870000 1.740000
4 H -0.500000 -0.870000 1.740000
5 H 1.000000 0.000000 1.740000
------------------------------------------
------------------------------------------
Match: 1, Assigned Fragment: 2
------------------------------------------
Name: CH3 Method: Ext_lib
Natoms: 4 Charge: 0 Mult: 1
2 C 1.300000 0.000000 -0.460000
6 H 1.300000 0.000000 -1.530000
7 H 1.800000 0.870000 -0.100000
8 H 1.800000 -0.870000 -0.100000
------------------------------------------
Note
XZYFRAGLIB allows the inclusion of up to 10 files. Each file may contain multiple fragment definitions; however, each definition must consist of a single, connected molecule. Unconnected structures within a single definition are not allowed.
The format of Mylib.xyz follows the standard XYZ file structure, but optionally supports three identifiers: CHARGE, MULT, and NAME, with NAME expected to appear last. These identifiers are printed when a fragment is recognized, though they are not currently used in any calculations.
Subsequent modifications to fragments by other methods (see Fusebyatoms and Extend below) do not affect the fragment’s CHARGE or MULT values in the identifiers.
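The library file layout described in this note can be read with a small Python sketch, which may help when checking a library before passing it to ORCA. Everything here is an assumption made for illustration: the function names are hypothetical, identifiers are matched case-insensitively, and a missing CHARGE or MULT falls back to 0 and 1. This is not ORCA's reader.

```python
def parse_header(comment):
    """Read the optional CHARGE, MULT and NAME identifiers from line 2."""
    info = {"charge": 0, "mult": 1, "name": ""}  # assumed defaults
    tokens = comment.split()
    upper = [t.upper() for t in tokens]
    for key in ("CHARGE", "MULT"):
        if key in upper:
            info[key.lower()] = int(tokens[upper.index(key) + 1])
    if "NAME" in upper:  # NAME comes last, so it takes the rest of the line
        info["name"] = " ".join(tokens[upper.index("NAME") + 1:])
    return info

def parse_fragment_library(text):
    """Parse concatenated XYZ blocks into a list of fragment dictionaries."""
    lines = text.splitlines()
    fragments, pos = [], 0
    while pos < len(lines):
        if not lines[pos].strip():
            pos += 1  # tolerate blank lines between definitions
            continue
        natoms = int(lines[pos])
        entry = parse_header(lines[pos + 1])
        entry["atoms"] = []
        for line in lines[pos + 2 : pos + 2 + natoms]:
            symbol, x, y, z = line.split()[:4]
            entry["atoms"].append((symbol, float(x), float(y), float(z)))
        fragments.append(entry)
        pos += 2 + natoms
    return fragments

library_text = """5
CHARGE 0 MULT 1 NAME CH3O
O 0.000000 0.000000 0.000000
C 0.000000 0.000000 1.380000
H 1.008807 0.000000 1.736663
H -0.328435 0.953845 1.736667
H -0.794950 -0.621083 1.736667
4
CHARGE 0 MULT 1 NAME CH3
C 0.000000 0.000000 0.000000
H 0.000000 0.000000 1.070000
H 1.008807 0.000000 -0.356663
H -0.328435 -0.953845 -0.356667
"""
library = parse_fragment_library(library_text)
print([(frag["name"], len(frag["atoms"])) for frag in library])
```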
Important
ORCA’s VF2 subgraph isomorphism algorithm is based solely on atomic connectivity; it does not consider stereochemistry or bond orders.
Fragment assignment by FragProc Extlib follows the order of the files listed in XZYFRAGLIB and the order of fragment definitions within each file. Since an atom is excluded from subsequent matching once it has been assigned to a fragment, the order in which fragments are defined in the library is critically important. For example, in the previous case, if CH3 is defined before CH3O in Mylib.xyz, both CH3 groups in the input file will be matched first. This prevents CH3O from being recognized later, even if it is present in the system.
Atom assignment to fragments follows the order in which atoms appear in the geometry. In the previous example, the CH3O fragment could be assigned using either the C(1)-O or C(2)-O bond. However, since C(1) appears first in the geometry, it is selected for the CH3O fragment.
2.18.2.5. Automatic Fragmentation: Extend
Many fragments in the internal libraries represent incomplete molecular structures. Therefore, in some cases, it is necessary to add hydrogen atoms or, in certain situations, a hydroxyl group to complete the system. The FragProc Extend procedure addresses this by identifying oxygen atoms bonded to carbon atoms in previously assigned fragments, as well as hydrogen atoms bonded to carbon, nitrogen, or oxygen. It then extends the corresponding fragments to include these atoms.
Consider the example below, which consists of a zwitterionic glycine molecule fragmented using an external library that defines a C–C–N fragment. In this case, FragProc Extend can be applied to add the missing oxygen and hydrogen atoms, thereby completing the molecular structure.
%frag
PrintLevel 3
FragProc Extlib, Extend
XZYFRAGLIB "Mylib.xyz"
end
*xyz 0 1
N 0.000000 0.000000 0.000000
C 0.000000 0.000000 1.460000
C 1.403962 0.000000 2.015868
O 1.837767 0.962613 2.627089
O 2.161436 -0.947142 1.883370
H 0.423874 -0.764689 -0.525339
H -0.538292 -0.884247 1.831954
H -0.538292 0.884247 1.831954
H -0.970478 0.016940 -0.313504
H 0.499909 0.831989 -0.313504
*
Where Mylib.xyz in this case is:
3
Name CCN
C 0.000000 0.000000 0.000000
C 0.000000 0.000000 1.500000
N 1.376503 0.000000 -0.486661
The result is an initial assignment of the C–C–N fragment, followed by an extension of the fragment to include two oxygen atoms and five hydrogen atoms.
------------------------------------------
Match: 1, Assigned Fragment: 1
------------------------------------------
Name: CCN Method: Ext_lib
Natoms: 3 Charge: 0 Mult: 1
0 N 0.000000 0.000000 0.000000
1 C 0.000000 0.000000 1.460000
2 C 1.403962 0.000000 2.015868
------------------------------------------
===================================================
Tfragmentator: Extending Fragments
===================================================
Extending Fragment 0 with O (3)
Extending Fragment 0 with O (4)
Extending Fragment 0 with H (6)
Extending Fragment 0 with H (7)
Extending Fragment 0 with H (5)
Extending Fragment 0 with H (8)
Extending Fragment 0 with H (9)
Important
FragProc Extend attempts to extend any previously assigned fragment.
FragProc Extend does not modify the Charge and Mult identifiers of fragments.
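The extension rule described in this section can be written as a short Python sketch. It is a reading of the prose, not ORCA's implementation: `extend_fragment` is a hypothetical name, the bond list for glycine is supplied by hand (inferred from the geometry above), and adding oxygens before hydrogens is an assumption made so that a hydroxyl hydrogen would also be picked up.

```python
def extend_fragment(fragment, symbols, bonds, taken=()):
    """Return `fragment` plus attached O and H atoms.

    Pass 1 adds O atoms bonded to a C atom of the fragment.
    Pass 2 adds H atoms bonded to a C, N or O atom of the extended fragment.
    Atoms listed in `taken` (already in other fragments) are never added.
    """
    neighbours = {i: set() for i in range(len(symbols))}
    for i, j in bonds:
        neighbours[i].add(j)
        neighbours[j].add(i)

    blocked = set(taken)
    extended = set(fragment)
    for i in list(extended):
        if symbols[i] == "C":
            extended |= {j for j in neighbours[i]
                         if symbols[j] == "O" and j not in blocked}
    for i in list(extended):
        if symbols[i] in ("C", "N", "O"):
            extended |= {j for j in neighbours[i]
                         if symbols[j] == "H" and j not in blocked}
    return sorted(extended)

# Zwitterionic glycine, atom order as in the input above.
symbols = ["N", "C", "C", "O", "O", "H", "H", "H", "H", "H"]
# Bonds assumed from interatomic distances in that geometry.
bonds = [(0, 1), (1, 2), (2, 3), (2, 4), (0, 5), (1, 6), (1, 7), (0, 8), (0, 9)]

full = extend_fragment([0, 1, 2], symbols, bonds)
print(full)
```

Starting from the three-atom C–C–N match, the sketch collects two oxygen atoms and five hydrogen atoms, the same atoms listed in the log above.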
2.18.2.6. Automatic Fragmentation: Fusebyatoms
When multiple fragmentation procedures are used in combination, it may be necessary to merge two previously assigned fragments into one. This can be achieved using the FragProc Fusebyatoms procedure, which identifies atom pairs that should belong to the same fragment, as specified in FuseAtomPairs.
In the example below, the objective is to fragment a propane molecule into a CH3 group and a CH3–CH2 fragment. One way to achieve this is by first applying FragProc FunctionalGroups, which fragments the molecule into two CH3 groups and one CH2 group. The CH2 group can then be merged with one of the CH3 groups. This is done by specifying the carbon atoms of the CH2 and one CH3 group in the FuseAtomPairs directive: FuseAtomPairs {0 1} end.
%frag
printlevel 3
FragProc FunctionalGroups, Fusebyatoms
FuseAtomPairs {0 1} end
end
*xyz 0 1
C 0.000000 1.270000 -0.260000
C 0.000000 -0.000000 0.580000
C 0.000000 -1.270000 -0.260000
H -0.890000 1.320000 -0.910000
H 0.890000 1.320000 -0.910000
H 0.000000 2.180000 0.370000
H 0.880000 -0.000000 1.250000
H -0.880000 -0.000000 1.250000
H 0.890000 -1.320000 -0.910000
H 0.000000 -2.180000 0.370000
H -0.880000 -1.300000 -0.910000
*
The output from the fragmentator first reports the initial fragmentation performed by the Functional_groups library, followed by a message indicating the subsequent fusion of fragments as specified by the FuseAtomPairs directive.
===================================================
Tfragmentator: Fragmenting by Functional_groups
===================================================
------------------------------------------
Match: 1, Assigned Fragment: 1
------------------------------------------
Name: CH3 Method: Functional_groups
Natoms: 4 Charge: 0 Mult: 1
0 C 0.000000 1.270000 -0.260000
3 H -0.890000 1.320000 -0.910000
4 H 0.890000 1.320000 -0.910000
5 H 0.000000 2.180000 0.370000
------------------------------------------
------------------------------------------
Match: 2, Assigned Fragment: 2
------------------------------------------
Name: CH3 Method: Functional_groups
Natoms: 4 Charge: 0 Mult: 1
2 C 0.000000 -1.270000 -0.260000
8 H 0.890000 -1.320000 -0.910000
9 H 0.000000 -2.180000 0.370000
10 H -0.880000 -1.300000 -0.910000
------------------------------------------
------------------------------------------
Match: 1, Assigned Fragment: 3
------------------------------------------
Name: CH2 Method: Functional_groups
Natoms: 3 Charge: 0 Mult: 1
1 C 0.000000 -0.000000 0.580000
6 H 0.880000 -0.000000 1.250000
7 H -0.880000 -0.000000 1.250000
------------------------------------------
===================================================
Tfragmentator: Fusing Fragments
===================================================
Fusing Fragment 0 (atom 0) with Fragment 2 (atom 1)
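The merging step can be pictured with the following Python sketch. The function `fuse_by_atoms` is hypothetical, and keeping the label of the fragment that owns the first atom of each pair is an arbitrary choice made here; how ORCA numbers the fragments after fusion is not stated in this section.

```python
def fuse_by_atoms(fragments, atom_pairs):
    """Merge the fragments that contain each pair of atoms.

    `fragments` maps a fragment label to a set of atom indices. A new
    mapping is returned and the input is left untouched.
    """
    merged = {label: set(atoms) for label, atoms in fragments.items()}

    def owner(atom):
        for label, atoms in merged.items():
            if atom in atoms:
                return label
        raise ValueError(f"atom {atom} is not in any fragment")

    for first, second in atom_pairs:
        keep, drop = owner(first), owner(second)
        if keep != drop:
            merged[keep] |= merged.pop(drop)
    return merged

# Fragments found by Functional_groups in the propane example above.
initial = {1: {0, 3, 4, 5}, 2: {2, 8, 9, 10}, 3: {1, 6, 7}}
fused = fuse_by_atoms(initial, [(0, 1)])
print(fused)
```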
2.18.2.7. Automatic Fragmentation: Delete and advanced fragmentation workflows
Almost all fragmentation schemes have an associated delete procedure, which removes the fragments generated by that specific scheme. This enables the construction of advanced fragmentation workflows, where a procedure can temporarily “protect” certain atoms from being fragmented by subsequent methods. These protected fragments can later be deleted, allowing the atoms to be reprocessed using a different fragmentation approach.
Consider the system below, which consists of both a phenylalanine molecule and a benzene molecule. The goal is to fragment the phenylalanine into its backbone, one CH2 group, and the phenyl ring, while fragmenting the benzene into six individual CH fragments.
On the one hand, the desired fragmentation of phenylalanine can be achieved by combining three fragmentation procedures: FragProc AABackbone, Extend, AASCFinegrained. The AABackbone and Extend options identify the amino acid backbone, while AASCFinegrained separates the CH2 group and the phenyl ring into distinct fragments.
On the other hand, the benzene molecule can be fragmented into six individual CH fragments using FragProc Extlib, which relies on an external library defining the CH fragment.
However, these two fragmentation schemes are incompatible. When FragProc AASCFinegrained is applied to the entire system, it fragments not only the phenylalanine side chain but also breaks the benzene ring into a phenyl group and a hydrogen atom. Conversely, if FragProc Extlib is applied first to benzene, it may inadvertently fragment the phenylalanine residue, leading to undesired structural splits.
A viable approach to achieve the desired fragmentation involves first identifying the amino acid using FragProc Aminoacids. This procedure assigns all atoms that belong to amino acid fragments, thereby protecting them from reassignment by subsequent fragmentation steps. The benzene molecule can then be fragmented using an external library via FragProc Extlib. Finally, the phenylalanine fragment is removed using FragProc DelAminoacids, allowing it to be re-fragmented as needed.
All these operations are executed by simply concatenating the procedures as follows:
FragProc Aminoacids, Extlib, DelAminoacids, AABackbone, Extend, AASCFinegrained
%frag
PrintLevel 3
FragProc Aminoacids, Extlib, DelAminoacids, AABackbone, Extend, AASCFinegrained
XZYFRAGLIB "Mylib.xyz"
STOREFRAGS true
end
*xyz 0 1
O 1.300780 3.093320 1.571260
C 0.407700 2.826000 0.770390
O -0.101890 3.606620 -0.030470
C -0.115350 1.396690 0.770390
N -1.564350 1.396690 0.770390
H -1.900780 1.872730 1.595200
H 0.242580 0.884560 1.663540
H -1.901490 0.444630 0.770390
H -1.901016 1.873069 -0.054118
C 0.436090 0.696200 -0.461750
H 1.512450 0.863150 -0.502770
H -0.018950 1.150700 -1.341790
C 0.178410 -0.790680 -0.515340
C -0.895760 -1.288470 -1.262580
C -1.134680 -2.667040 -1.312270
C -0.299410 -3.547820 -0.614730
C 0.774760 -3.050040 0.132500
C 1.013680 -1.671470 0.182200
H 1.842330 -1.287460 0.758630
H 1.419110 -3.729500 0.670600
H -0.483720 -4.611290 -0.653070
H -1.963330 -3.051050 -1.888700
H -1.540110 -0.609010 -1.800680
C 1.181325 -3.732146 -2.620584
C 2.016592 -4.612923 -1.923046
C 3.090772 -4.115131 -1.175824
C 3.329684 -2.736563 -1.126139
C 2.494417 -1.855785 -1.823677
C 1.420237 -2.353577 -2.570900
H 0.770518 -1.668458 -3.113485
H 0.422252 -4.288369 -3.129823
H 1.830753 -5.685253 -1.961693
H 3.740491 -4.800250 -0.633239
H 4.165243 -2.349352 -0.544907
H 2.680256 -0.783455 -1.785030
*
Where “Mylib.xyz” in this case is:
2
NAME CH
C 0.00 0.00 0.00
H 1.08 0.00 0.00
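The protect, fragment, delete, and re-fragment logic of this workflow can be mimicked on a toy system in Python. Nothing below is ORCA code: the function names, the method tags, and the eight-atom system with its candidate groups are all invented for illustration, with atoms 0 to 3 standing in for the residue and atoms 4 to 7 for the ring.

```python
def assign(state, method, groups):
    """Assign each candidate group whose atoms are all still free."""
    for group in groups:
        if all(atom not in state for atom in group):
            for atom in group:
                state[atom] = (method, tuple(group))

def delete(state, method):
    """Release every atom assigned by `method` so it can be re-fragmented."""
    for atom in [a for a, (m, _) in state.items() if m == method]:
        del state[atom]

def fragments(state):
    """Collect the distinct fragments currently stored in `state`."""
    return sorted({group for _, group in state.values()})

# Stand-ins for three libraries (hypothetical candidate matches).
whole_residue = [[0, 1, 2, 3]]             # coarse match of the residue
pairs = [[0, 1], [2, 3], [4, 5], [6, 7]]   # would also match in the residue
fine_grained = [[0], [1, 2, 3]]            # intended split of the residue

state = {}
assign(state, "protect", whole_residue)  # 1. protect the residue
assign(state, "pairs", pairs)            # 2. only the ring atoms are free
delete(state, "protect")                 # 3. release the residue again
assign(state, "fine", fine_grained)      # 4. re-fragment it as intended
result = fragments(state)
print(result)
```

Skipping steps 1 and 3 in this toy would let the pair library claim atoms 0 to 3 first, which is the kind of conflict the delete procedures are meant to avoid.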
The complete fragmentation sequence follows the same structure as in previous examples. In this case, the message Deleted 1 fragments is printed after the fragment assignment performed by Ext_lib, indicating that a fragment previously defined by FragProc Aminoacids has been successfully removed.
===================================================
Tfragmentator: Fragmenting by Amino_Acids
===================================================
------------------------------------------
Match: 1, Assigned Fragment: 1
------------------------------------------
Name: CPHE Method: Amino_Acids
Natoms: 21 Charge: -1 Mult: 1
0 O 1.300780 3.093320 1.571260
1 C 0.407700 2.826000 0.770390
2 O -0.101890 3.606620 -0.030470
3 C -0.115350 1.396690 0.770390
4 N -1.564350 1.396690 0.770390
5 H -1.900780 1.872730 1.595200
6 H 0.242580 0.884560 1.663540
9 C 0.436090 0.696200 -0.461750
10 H 1.512450 0.863150 -0.502770
11 H -0.018950 1.150700 -1.341790
12 C 0.178410 -0.790680 -0.515340
13 C -0.895760 -1.288470 -1.262580
14 C -1.134680 -2.667040 -1.312270
15 C -0.299410 -3.547820 -0.614730
16 C 0.774760 -3.050040 0.132500
17 C 1.013680 -1.671470 0.182200
18 H 1.842330 -1.287460 0.758630
19 H 1.419110 -3.729500 0.670600
20 H -0.483720 -4.611290 -0.653070
21 H -1.963330 -3.051050 -1.888700
22 H -1.540110 -0.609010 -1.800680
------------------------------------------
===================================================
Tfragmentator: Fragmenting by Ext_lib
===================================================
****
**** There are 1 Ref structures found in file Mylib.xyz
****
------------------------------------------
Match: 1, Assigned Fragment: 2
------------------------------------------
Name: CH Method: Ext_lib
Natoms: 2 Charge: 0 Mult: 1
23 C 1.181325 -3.732146 -2.620584
30 H 0.422252 -4.288369 -3.129823
------------------------------------------
------------------------------------------
Match: 2, Assigned Fragment: 3
------------------------------------------
Name: CH Method: Ext_lib
Natoms: 2 Charge: 0 Mult: 1
24 C 2.016592 -4.612923 -1.923046
31 H 1.830753 -5.685253 -1.961693
------------------------------------------
...
------------------------------------------
Match: 6, Assigned Fragment: 7
------------------------------------------
Name: CH Method: Ext_lib
Natoms: 2 Charge: 0 Mult: 1
28 C 1.420237 -2.353577 -2.570900
29 H 0.770518 -1.668458 -3.113485
------------------------------------------
===================================================
Tfragmentator: Deleting Fragments (Amino_Acids)
===================================================
Deleted 1 fragments
===================================================
Tfragmentator: Fragmenting by Backbone
===================================================
------------------------------------------
Match: 1, Assigned Fragment: 7
------------------------------------------
Name: CO-NH3-CH Method: AA_Backbone
Natoms: 8 Charge: 0 Mult: 0
0 O 1.300780 3.093320 1.571260
1 C 0.407700 2.826000 0.770390
3 C -0.115350 1.396690 0.770390
4 N -1.564350 1.396690 0.770390
5 H -1.900780 1.872730 1.595200
6 H 0.242580 0.884560 1.663540
7 H -1.901490 0.444630 0.770390
8 H -1.901016 1.873069 -0.054118
------------------------------------------
===================================================
Tfragmentator: Extending Fragments
===================================================
Extending Fragment 6 with O (2)
===================================================
Tfragmentator: Fragmenting by AA_SideChains_FG
===================================================
------------------------------------------
Match: 1, Assigned Fragment: 8
------------------------------------------
Name: Ph Method: AA_SideChains_FG
Natoms: 11 Charge: 0 Mult: 1
12 C 0.178410 -0.790680 -0.515340
13 C -0.895760 -1.288470 -1.262580
14 C -1.134680 -2.667040 -1.312270
15 C -0.299410 -3.547820 -0.614730
16 C 0.774760 -3.050040 0.132500
17 C 1.013680 -1.671470 0.182200
18 H 1.842330 -1.287460 0.758630
19 H 1.419110 -3.729500 0.670600
20 H -0.483720 -4.611290 -0.653070
21 H -1.963330 -3.051050 -1.888700
22 H -1.540110 -0.609010 -1.800680
------------------------------------------
------------------------------------------
Match: 1, Assigned Fragment: 9
------------------------------------------
Name: CH2 Method: AA_SideChains_FG
Natoms: 3 Charge: 0 Mult: 1
9 C 0.436090 0.696200 -0.461750
10 H 1.512450 0.863150 -0.502770
11 H -0.018950 1.150700 -1.341790
------------------------------------------
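When several procedures are chained, it can help to collect the individual Match blocks of this output programmatically. The following is a minimal sketch, assuming the log has exactly the layout printed above (a Match line, a Name/Method line, and atom lines that start with an atom index). The function name parse_fragment_matches and the record layout are hypothetical and not part of ORCA.

```python
import re

# Patterns assume the line layout shown in the output above.
MATCH_RE = re.compile(r"Match:\s*(\d+),\s*Assigned Fragment:\s*(\d+)")
NAME_RE = re.compile(r"Name:\s*(\S+)\s+Method:\s*(\S+)")
ATOM_RE = re.compile(
    r"^\s*(\d+)\s+([A-Z][a-z]?)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s*$"
)


def parse_fragment_matches(log_text):
    """Return one record per 'Match:' block, in the order they appear."""
    records = []
    current = None
    for line in log_text.splitlines():
        m = MATCH_RE.search(line)
        if m:
            current = {
                "match": int(m.group(1)),
                "fragment": int(m.group(2)),
                "name": None,
                "method": None,
                "atoms": [],  # (atom index, element symbol) pairs
            }
            records.append(current)
            continue
        if current is None:
            continue  # text before the first Match block
        m = NAME_RE.search(line)
        if m:
            current["name"], current["method"] = m.group(1), m.group(2)
            continue
        m = ATOM_RE.match(line)
        if m:
            current["atoms"].append((int(m.group(1)), m.group(2)))
    return records


sample = """\
------------------------------------------
Match: 1, Assigned Fragment: 2
------------------------------------------
Name: CH Method: Ext_lib
Natoms: 2 Charge: 0 Mult: 1
23 C 1.181325 -3.732146 -2.620584
30 H 0.422252 -4.288369 -3.129823
------------------------------------------
"""

records = parse_fragment_matches(sample)
for rec in records:
    print(rec["fragment"], rec["name"], rec["method"], rec["atoms"])
```

The records reflect the order of the log, not the final assignment: later steps such as deleting or extending fragments are reported on separate lines that this sketch ignores.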
Table 2.61 lists the delete procedure that corresponds to each fragmentation method.
Fragment detection keyword | Fragment deletion keyword |
---|---|
2.18.3. Options available in the %frag input block¶
Table 2.62 lists the options available in the %frag input block.
| Option | Type | Default | Description |
|---|---|---|---|
| | Integer | | Verbose output control for automated fragmentation. |
| | Boolean | | Stores assigned fragments in a |
| | Boolean | | Automatically detects bonds between fragments for CoVaLED analysis. |
| | String | | Filenames used in |
| | See Table 2.60 and Table 2.61 | | Fragmentation procedures to be applied automatically. |
| | Boolean | | Generates the main geometry graph based on the .prms file. |
| | String | | Topology file name to be used when |
| | Boolean | | Writes a |