(sec:essentialelements.fragmentation)=
# Fragment Specification
Atoms in a calculation can be grouped into specific *fragments*,
which serve multiple purposes.
Fragment definitions can be used to [assign Basis Sets and ECPs](#sec:essentialelements.basisset.fragments),
organize output in the [population analysis section](#sec:spectroscopyproperties.pop.lsa),
and enable features like [fragment-constrained optimization](#sec:structurereactivity.geomopt.constrainedfragment)
and [Rigid Body Optimization](#sec:structurereactivity.optimization.rigidbody).
They are also used in [local energy decomposition](#sec:spectroscopyproperties.led)
and [multi-level calculations](#sec:modelchemistries.mdci.multilevel).
(sec:essentialelements.fragmentation.Inputdef)=
## Fragments Defined in the Input File
There are three ways to assign atoms to fragments using the input file.
The first method is to assign a specific atom to a specific fragment by placing
`(n)` directly after the atomic symbol in the coordinates section.
```orca
*xyz -2 2
Cu(1) 0.00 0.00 0.00
Cl(2) 2.25 0.00 0.00
Cl(2) -2.25 0.00 0.00
Cl(2) 0.00 2.25 0.00
Cl(2) 0.00 -2.25 0.00
*
```
In this example the fragment feature is used to divide the molecule into
a "metal" and a "ligand" fragment and consequently the program will
print the metal and ligand charges and populations.
```orca
----------------------------------------------
CARTESIAN COORDINATES OF FRAGMENTS (ANGSTROEM)
----------------------------------------------
FRAGMENT 1
Cu 0.000000 0.000000 0.000000
FRAGMENT 2
Cl 2.250000 0.000000 0.000000
Cl -2.250000 0.000000 0.000000
Cl 0.000000 2.250000 0.000000
Cl 0.000000 -2.250000 0.000000
...
----------------------------------------------
MULLIKEN FRAGMENT CHARGES AND SPIN POPULATIONS
----------------------------------------------
Fragment 0 : 0.752589 0.842580
Fragment 1 : -2.752589 0.157420
Sum of fragment charges : -2.0000000
Sum of fragment spin populations: 1.0000000
...
--------------------------------------------
LOEWDIN FRAGMENT CHARGES AND SPIN POPULATONS
--------------------------------------------
Fragment 0 : 0.222028 0.851552
Fragment 1 : -2.222028 0.148448
```
Alternatively, the `%coords` block can be used for fragment definitions in the
same way, by placing `(n)` directly after the atomic symbol.
```orca
%coords
CTyp xyz # the type of coordinates xyz or internal
Charge -2 # the total charge of the molecule
Mult 2 # the multiplicity = 2S+1
coords
Cu(1) 0.00 0.00 0.00
Cl(2) 2.25 0.00 0.00
Cl(2) -2.25 0.00 0.00
Cl(2) 0.00 2.25 0.00
Cl(2) 0.00 -2.25 0.00
end
end
```
:::{important}
- In cases where all atoms are explicitly assigned to fragments, the fragment numbering
must start at 1 and use consecutive integers. Non-consecutive or incorrect numbering
may lead to errors.
- If any atom is left unassigned (or explicitly assigned to fragment 0), it is
automatically assigned to a fragment using the procedure described in the
[Automatic Fragmentation](#sec:essentialelements.fragmentation.inputfile) section.
In such cases, where only a subset of atoms is manually assigned to fragments, it is not
necessary to use consecutive fragment numbers. However, the highest fragment number
specified in the input must be less than the total number of fragments generated by the
combination of manual and automatic procedures.
If this condition is not met, ORCA automatically renumbers all fragments in
ascending order, starting from 1.
:::
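The first numbering rule above can be checked before a job is submitted. The following Python sketch is only an illustration and is not part of ORCA: the helper names `parse_fragment_labels` and `classify_numbering` are hypothetical, and the sketch only distinguishes the three situations described in the box (partially assigned, fully assigned with consecutive numbers, fully assigned with gaps).

```python
import re

# Atomic symbol optionally followed by "(n)", e.g. "Cu(1)" or "Cl".
LABEL = re.compile(r"^([A-Za-z]{1,2})(?:\((\d+)\))?$")


def parse_fragment_labels(lines):
    """Return (symbol, fragment) pairs; fragment 0 means 'not assigned'."""
    atoms = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        match = LABEL.match(fields[0])
        if match is None:
            raise ValueError(f"cannot read atom label: {fields[0]!r}")
        atoms.append((match.group(1), int(match.group(2) or 0)))
    return atoms


def classify_numbering(atoms):
    """Classify the manual fragment numbering of a coordinate block."""
    numbers = {fragment for _, fragment in atoms}
    if 0 in numbers:
        return "partial"  # remaining atoms are left to automatic fragmentation
    if numbers == set(range(1, len(numbers) + 1)):
        return "consecutive"
    return "non-consecutive"


coords = [
    "Cu(1)  0.00  0.00 0.00",
    "Cl(2)  2.25  0.00 0.00",
    "Cl(2) -2.25  0.00 0.00",
]
print(classify_numbering(parse_fragment_labels(coords)))  # consecutive
```

A result of `non-consecutive` corresponds to the case that the first rule warns about, while `partial` means the automatic procedure will complete the assignment.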
Finally, a third way to define fragments consists of using a `Definition` inside
the `%frag` block. In this scheme, the fragment number comes first, followed by a
list of atoms (enumerated starting from 0) enclosed in curly brackets `{}` and finishing with `end`.
Consecutive atoms can also be specified using the notation `initial_atom`:`final_atom`.
```orca
*xyz -2 2
Cu 0.00 0.00 0.00
Cl 2.25 0.00 0.00
Cl -2.25 0.00 0.00
Cl 0.00 2.25 0.00
Cl 0.00 -2.25 0.00
*
%frag
Definition
1 {0} end # atom 0 for fragment 1
2 {1:4} end # atoms 1 to 4 for fragment 2
end
end
```
Single atoms and ranges can be combined within the same pair of brackets, which is
useful when the atoms of a fragment are not contiguous in the coordinate list:
```orca
*xyz -2 2
Cl 2.25 0.00 0.00
Cl -2.25 0.00 0.00
Cu 0.00 0.00 0.00
Cl 0.00 2.25 0.00
Cl 0.00 -2.25 0.00
*
%frag
Definition
1 {2} end # atom 2 for fragment 1
2 {0:1 3:4} end # atoms 0, 1, 3, and 4 for fragment 2
end
end
```
:::{Note}
- With the last option (`Definition`) the `%frag` block has to be
written after the coordinate section.
- `%frag Definition` also works with coordinates that are defined via an
external file.
:::
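To make the atom-list notation concrete, the short Python sketch below expands a list such as `{0:1 3:4}` into explicit atom indices. It is an illustration only: the function name `expand_atom_list` is hypothetical, and the only behaviour taken from the examples above is that indices start at 0 and that `initial_atom:final_atom` ranges include both end points.

```python
def expand_atom_list(spec):
    """Expand an atom list like "{0:1 3:4}" into a list of atom indices."""
    body = spec.strip()
    if not (body.startswith("{") and body.endswith("}")):
        raise ValueError("atom list must be enclosed in curly brackets")
    atoms = []
    for token in body[1:-1].split():
        if ":" in token:
            first, last = (int(part) for part in token.split(":"))
            if last < first:
                raise ValueError(f"descending range: {token}")
            atoms.extend(range(first, last + 1))  # both end points included
        else:
            atoms.append(int(token))
    return atoms


print(expand_atom_list("{0:1 3:4}"))  # [0, 1, 3, 4]
print(expand_atom_list("{2}"))        # [2]
```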
(sec:essentialelements.fragmentation.inputfile)=
## Automatic Fragmentation
Starting with ORCA 6.1, a set of automatic fragmentation algorithms has been
introduced to recognize and group atoms into fragments automatically.
The automatic fragmentation procedure is triggered by including a `%frag`
block in the input file or when only a subset of atoms has been manually
assigned to fragments.
Automatic fragmentation is performed using a set of procedures that can be
selected by the user within the `%frag` block. Each procedure attempts to
identify new fragments among atoms that have not yet been assigned. For
example, consider the case below, where the first 11 atoms belong to a propane
molecule and the last three belong to a water molecule. The `Water` procedure
in `FragProc` identifies the last three atoms as a water molecule and assigns
them to the first fragment, while the `FunctionalGroups` procedure detects
and assigns the CH3 and CH2 groups of propane as fragments 2 to 4.
(sec:essentialelements.fragmentation.intlibex)=
```orca
%frag
FragProc Water, FunctionalGroups
end
* xyz 0 1
C 0.000000 1.270000 -0.260000
C 0.000000 -0.000000 0.580000
C 0.000000 -1.270000 -0.260000
H -0.890000 1.320000 -0.910000
H 0.890000 1.320000 -0.910000
H 0.000000 2.180000 0.370000
H 0.880000 -0.000000 1.250000
H -0.880000 -0.000000 1.250000
H 0.890000 -1.320000 -0.910000
H 0.000000 -2.180000 0.370000
H -0.880000 -1.300000 -0.910000
H -0.920000 0.850000 -2.430000
O -1.690000 0.830000 -3.000000
H -1.640000 1.630000 -3.510000
*
```
```orca
----------------------------------------------
CARTESIAN COORDINATES OF FRAGMENTS (ANGSTROEM)
----------------------------------------------
FRAGMENT 1
H -0.920000 0.850000 -2.430000
O -1.690000 0.830000 -3.000000
H -1.640000 1.630000 -3.510000
FRAGMENT 2
C 0.000000 1.270000 -0.260000
H -0.890000 1.320000 -0.910000
H 0.890000 1.320000 -0.910000
H 0.000000 2.180000 0.370000
FRAGMENT 3
C 0.000000 -1.270000 -0.260000
H 0.890000 -1.320000 -0.910000
H 0.000000 -2.180000 0.370000
H -0.880000 -1.300000 -0.910000
FRAGMENT 4
C 0.000000 -0.000000 0.580000
H 0.880000 -0.000000 1.250000
H -0.880000 -0.000000 1.250000
```
This automatic fragmentation yields the same result as the manual fragment definition
shown below, without the need to inspect the geometry and assign fragments manually.
```orca
%frag
Definition
1 { 11:13} end # water
2 { 0 3:5} end # CH3
3 { 2 8:10} end # CH3
4 { 1 6:7} end # CH2
end
end
```
:::{Note}
- Any fragment defined in the [input file](#sec:essentialelements.fragmentation.Inputdef)
takes precedence over automatic assignments.
- A `FragProc` list can contain up to 10 procedures. The available procedures
are listed in {numref}`tab:essentialelements.fragmentationprocedures` and
{numref}`tab:essentialelements.delprocedures`.
- [Constrained Fragments](#sec:structurereactivity.geomopt.constrainedfragment)
do not automatically trigger automatic fragmentation when some atoms are
not assigned to a fragment. However, automatic fragmentation can be activated
by including a `%frag` block in the input file.
:::
ORCA provides over 30 `FragProc` methods, which can be combined in a list to
generate fragments for various purposes.
Below are explanations and examples of the different procedures.
(sec:essentialelements.fragmentation.inputfile.connectivity)=
### Automatic Fragmentation: Connectivity
`FragProc Connectivity` groups atoms that are connected by chemical bonds,
estimated based on atomic radii. In the example below, the first nine atoms
belong to a dimethyl ether molecule, which is automatically detected and
assigned to one fragment, while the last three atoms—belonging to a water
molecule—are assigned to a second fragment.
```orca
%frag
PrintLevel 3
FragProc Connectivity
end
*xyz 0 1
O 0.000000 0.000000 0.000000
C 0.000000 0.000000 1.380000
C 1.300000 0.000000 -0.460000
H -0.500000 0.870000 1.740000
H -0.500000 -0.870000 1.740000
H 1.000000 0.000000 1.740000
H 1.300000 0.000000 -1.530000
H 1.800000 0.870000 -0.100000
H 1.800000 -0.870000 -0.100000
H -1.840000 0.000000 -0.650000
O -2.440000 0.000000 0.080000
H -3.300000 0.000000 -0.300000
*
```
In this case, the output of the automatic fragmentation tool indicates that
two fragments were assigned by the `Connectivity` procedure.
```orca
------------------------------------------
Assigned Fragment: 1
------------------------------------------
Name: Connectivity:0 Method: Connectivity
Natoms: 9 Charge: 0 Mult: 1
0 O 0.000000 0.000000 0.000000
1 C 0.000000 0.000000 1.380000
2 C 1.300000 0.000000 -0.460000
3 H -0.500000 0.870000 1.740000
4 H -0.500000 -0.870000 1.740000
5 H 1.000000 0.000000 1.740000
6 H 1.300000 0.000000 -1.530000
7 H 1.800000 0.870000 -0.100000
8 H 1.800000 -0.870000 -0.100000
------------------------------------------
------------------------------------------
Assigned Fragment: 2
------------------------------------------
Name: Connectivity:1 Method: Connectivity
Natoms: 3 Charge: 0 Mult: 1
9 H -1.840000 0.000000 -0.650000
10 O -2.440000 0.000000 0.080000
11 H -3.300000 0.000000 -0.300000
------------------------------------------
```
Each fragmentation procedure applies only to atoms that have not already been
assigned to a fragment. Consequently, connectivity-based fragmentation will
ignore any bonds to atoms that have been assigned in the input file or by a
previous `FragProc`.
For example, if the ether oxygen atom in the previous example is manually assigned
to fragment 1 in the input file, this manual assignment takes precedence over the
`FragProc Connectivity`. As a result, four fragments are created: one fragment
containing the manually assigned oxygen atom, two CH3 groups, and one
water molecule.
```orca
%frag
PrintLevel 3
FragProc Connectivity
end
*xyz 0 1
O(1) 0.000000 0.000000 0.000000
C 0.000000 0.000000 1.380000
C 1.300000 0.000000 -0.460000
H -0.500000 0.870000 1.740000
H -0.500000 -0.870000 1.740000
H 1.000000 0.000000 1.740000
H 1.300000 0.000000 -1.530000
H 1.800000 0.870000 -0.100000
H 1.800000 -0.870000 -0.100000
H -1.840000 0.000000 -0.650000
O -2.440000 0.000000 0.080000
H -3.300000 0.000000 -0.300000
*
```
The output of the automatic fragmentation tool indicates that the first
fragment was assigned by the `Orca_input` procedure, while the remaining three
fragments were assigned by the `Connectivity` procedure.
```orca
===================================================
Tfragmentator: Fragmenting by Orca_input
===================================================
------------------------------------------
Assigned Fragment: 1
------------------------------------------
Name: User-defined:0 Method: Orca_input
Natoms: 1 Charge: 0 Mult: 1 InputId: 1
0 O 0.000000 0.000000 0.000000
------------------------------------------
===================================================
Tfragmentator: Fragmenting by Connectivity
===================================================
------------------------------------------
Assigned Fragment: 2
------------------------------------------
Name: Connectivity:1 Method: Connectivity
Natoms: 4 Charge: 0 Mult: 1
1 C 0.000000 0.000000 1.380000
3 H -0.500000 0.870000 1.740000
4 H -0.500000 -0.870000 1.740000
5 H 1.000000 0.000000 1.740000
------------------------------------------
------------------------------------------
Assigned Fragment: 3
------------------------------------------
Name: Connectivity:2 Method: Connectivity
Natoms: 4 Charge: 0 Mult: 1
2 C 1.300000 0.000000 -0.460000
6 H 1.300000 0.000000 -1.530000
7 H 1.800000 0.870000 -0.100000
8 H 1.800000 -0.870000 -0.100000
------------------------------------------
------------------------------------------
Assigned Fragment: 4
------------------------------------------
Name: Connectivity:3 Method: Connectivity
Natoms: 3 Charge: 0 Mult: 1
9 H -1.840000 0.000000 -0.650000
10 O -2.440000 0.000000 0.080000
11 H -3.300000 0.000000 -0.300000
------------------------------------------
```
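The behaviour shown in the two examples above can be reproduced with a small Python sketch. It is not ORCA's implementation: the radii in `RADII` are approximate covalent radii chosen for illustration, and the bonding criterion (distance below `scale` times the sum of the radii, with `scale = 1.3`) is an assumption of the sketch. What it does illustrate is that atoms already assigned to a fragment are skipped, so bonds to them are ignored.

```python
from itertools import combinations
from math import dist

# Approximate covalent radii in Angstrom (illustrative, not ORCA's values).
RADII = {"H": 0.31, "C": 0.76, "O": 0.66}

# Dimethyl ether (atoms 0-8) and water (atoms 9-11) from the example above.
ETHER_WATER = [
    ("O", (0.00, 0.00, 0.00)), ("C", (0.00, 0.00, 1.38)),
    ("C", (1.30, 0.00, -0.46)), ("H", (-0.50, 0.87, 1.74)),
    ("H", (-0.50, -0.87, 1.74)), ("H", (1.00, 0.00, 1.74)),
    ("H", (1.30, 0.00, -1.53)), ("H", (1.80, 0.87, -0.10)),
    ("H", (1.80, -0.87, -0.10)), ("H", (-1.84, 0.00, -0.65)),
    ("O", (-2.44, 0.00, 0.08)), ("H", (-3.30, 0.00, -0.30)),
]


def connectivity_fragments(atoms, preassigned=(), scale=1.3):
    """Group the atoms that are not pre-assigned into bonded components."""
    skip = set(preassigned)
    free = [i for i in range(len(atoms)) if i not in skip]
    neighbours = {i: set() for i in free}
    for i, j in combinations(free, 2):
        (sym_i, pos_i), (sym_j, pos_j) = atoms[i], atoms[j]
        if dist(pos_i, pos_j) < scale * (RADII[sym_i] + RADII[sym_j]):
            neighbours[i].add(j)
            neighbours[j].add(i)
    fragments, seen = [], set()
    for start in free:  # depth-first search from every unvisited atom
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            atom = stack.pop()
            component.append(atom)
            for other in neighbours[atom] - seen:
                seen.add(other)
                stack.append(other)
        fragments.append(sorted(component))
    return fragments


print(connectivity_fragments(ETHER_WATER))
print(connectivity_fragments(ETHER_WATER, preassigned=[0]))
```

With no pre-assigned atoms the sketch finds the ether and the water molecule. With atom 0 pre-assigned it finds the two CH3 groups and the water molecule, in line with the fragments 2 to 4 reported above.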
(sec:essentialelements.fragmentation.inputfile.atnoas)=
### Automatic Fragmentation: Atomic and NotAssigned
`FragProc Atomic` and `FragProc NotAssigned` are termination procedures.
`FragProc Atomic` assigns each previously unassigned atom to its own individual fragment,
while `FragProc NotAssigned` assigns all remaining unassigned atoms to a single fragment.
Similar to other `FragProc` methods, the output of the automatic fragmentation tool indicates
that the `Atomic` or `Not_assigned` procedure has been used to generate the corresponding fragments.
```orca
%frag
PrintLevel 3
FragProc Atomic
end
*xyz 0 1
O(1) 0.000000 0.000000 0.000000
C 0.000000 0.000000 1.380000
C 1.300000 0.000000 -0.460000
H -0.500000 0.870000 1.740000
H -0.500000 -0.870000 1.740000
H 1.000000 0.000000 1.740000
H 1.300000 0.000000 -1.530000
H 1.800000 0.870000 -0.100000
H 1.800000 -0.870000 -0.100000
H -1.840000 0.000000 -0.650000
O -2.440000 0.000000 0.080000
H -3.300000 0.000000 -0.300000
*
```
```orca
===================================================
Tfragmentator: Fragmenting by Orca_input
===================================================
------------------------------------------
Assigned Fragment: 1
------------------------------------------
Name: User-defined:0 Method: Orca_input
Natoms: 1 Charge: 0 Mult: 1 InputId: 1
0 O 0.000000 0.000000 0.000000
------------------------------------------
===================================================
Tfragmentator: Fragmenting by Atomic
===================================================
------------------------------------------
Match: 1, Assigned Fragment: 2
------------------------------------------
Name: C Method: Atomic
Natoms: 1 Charge: 0 Mult: 1
1 C 0.000000 0.000000 1.380000
------------------------------------------
------------------------------------------
Match: 2, Assigned Fragment: 3
------------------------------------------
Name: C Method: Atomic
Natoms: 1 Charge: 0 Mult: 1
2 C 1.300000 0.000000 -0.460000
------------------------------------------
...
------------------------------------------
Match: 11, Assigned Fragment: 12
------------------------------------------
Name: H Method: Atomic
Natoms: 1 Charge: 0 Mult: 1
11 H -3.300000 0.000000 -0.300000
------------------------------------------
```
```orca
%frag
PrintLevel 3
FragProc NotAssigned
end
*xyz 0 1
O(1) 0.000000 0.000000 0.000000
C 0.000000 0.000000 1.380000
C 1.300000 0.000000 -0.460000
H -0.500000 0.870000 1.740000
H -0.500000 -0.870000 1.740000
H 1.000000 0.000000 1.740000
H 1.300000 0.000000 -1.530000
H 1.800000 0.870000 -0.100000
H 1.800000 -0.870000 -0.100000
H -1.840000 0.000000 -0.650000
O -2.440000 0.000000 0.080000
H -3.300000 0.000000 -0.300000
*
```
```orca
===================================================
Tfragmentator: Fragmenting by Orca_input
===================================================
------------------------------------------
Assigned Fragment: 1
------------------------------------------
Name: User-defined:0 Method: Orca_input
Natoms: 1 Charge: 0 Mult: 1 InputId: 1
0 O 0.000000 0.000000 0.000000
------------------------------------------
===================================================
Tfragmentator: Setting not assigned atoms to a fragment
===================================================
------------------------------------------
Assigned Fragment: 1
------------------------------------------
Name: Not Assigned Method: Not_assigned
Natoms: 11 Charge: 0 Mult: 1
1 C 0.000000 0.000000 1.380000
2 C 1.300000 0.000000 -0.460000
3 H -0.500000 0.870000 1.740000
4 H -0.500000 -0.870000 1.740000
5 H 1.000000 0.000000 1.740000
6 H 1.300000 0.000000 -1.530000
7 H 1.800000 0.870000 -0.100000
8 H 1.800000 -0.870000 -0.100000
9 H -1.840000 0.000000 -0.650000
10 O -2.440000 0.000000 0.080000
11 H -3.300000 0.000000 -0.300000
------------------------------------------
```
:::{Note}
- `FragProc NotAssigned` is always applied at the end of all `FragProc`
procedures to ensure that no atoms remain without a fragment assignment.
:::
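The way procedures are chained, including the final `NotAssigned` step, can be summarized in a few lines of Python. This is a schematic sketch under stated assumptions, not ORCA code: the function names are hypothetical, each procedure is modelled as a function that receives the set of still-unassigned atom indices and returns groups of atoms, and the sketch returns the groups in the order they are created without claiming anything about ORCA's internal fragment numbering.

```python
def atomic(free):
    """Every unassigned atom becomes its own fragment."""
    return [[atom] for atom in sorted(free)]


def not_assigned(free):
    """All unassigned atoms are collected into a single fragment."""
    return [sorted(free)] if free else []


def run_procedures(n_atoms, user_fragments, procedures):
    """Apply procedures in order; each one only sees unassigned atoms."""
    free = set(range(n_atoms))
    log = []
    for atoms in user_fragments:  # user input takes precedence
        log.append(("Orca_input", sorted(atoms)))
        free -= set(atoms)
    # The not-assigned step always runs last so that no atom is left over.
    for name, procedure in list(procedures) + [("Not_assigned", not_assigned)]:
        for atoms in procedure(free):
            log.append((name, atoms))
            free -= set(atoms)
    return log


# Atom 0 assigned by the user, 11 further atoms left for the procedures.
for method, atoms in run_procedures(12, [[0]], []):
    print(method, atoms)
```

Passing `[("Atomic", atomic)]` as the procedure list instead yields one group per remaining atom, and the final not-assigned step then has nothing left to collect.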
(sec:essentialelements.fragmentation.inputfile.intlib)=
### Automatic Fragmentation: Internal Libraries
ORCA includes a series of internal libraries containing definitions
of many common molecular structures. In all cases, structure recognition
is performed using a VF2 subgraph isomorphism algorithm applied to
molecular graphs constructed from Cartesian coordinates.
The available fragmentation procedures that make use of internal libraries
are listed in {numref}`tab:essentialelements.fragmentationprocedures`.
(tab:essentialelements.fragmentationprocedures)=
:::{table} Simple input keywords for Fragment detection
| Fragment detection Keyword | Description |
|:---------------------------|:---------------------------|
| `FunctionalGroups` | Contains a list of the most common organic functional groups. |
| `Aminoacids` | Contains a list of all amino acids, including all common protonation states, but excluding zwitterionic forms. |
| `AABackbone` | Contains fragment definitions for amino acid backbone detection. |
| `Backbone` | Performs `AABackbone` followed by merging all fragments into a single protein backbone fragment. |
| `SeqBackbone` | Similar to `AABackbone`, but peptide bonds are assigned as separate fragments. |
| `AASidechains` | Contains a list of all amino acid side chains. |
| `AASCFinegrained` | Contains a detailed list of organic functional groups within amino acid side chains. |
| `NABackbone` | Contains fragment definitions for DNA/RNA backbones; the phosphate group is assigned to the 3′ position. |
| `SEQNABackbone` | Contains fragment definitions for DNA/RNA backbones; the phosphate group is assigned to the 5′ position. |
| `NABBFinegrained` | Same as `NABackbone`, but further splits the phosphate group. |
| `Nucleoticacid` | Contains a list of all nucleic acids. |
| `NASidechains` | Contains a list of all nucleic acid side chains. |
| `Solvents` | Contains definitions for common solvents: 1-octanol, n-hexane, cyclohexane, toluene, chlorobenzene, tetrahydrofuran, benzene, N,N-dimethylformamide, pyridine, dimethyl sulfoxide, acetone, ethanol, acetonitrile, methanol, chloroform, carbon tetrachloride, dichloromethane, ammonia, and water. |
| `Water` | Contains the definition of the water molecule; faster than `Solvents` when only water molecules need to be fragmented. |
:::
Similar to other fragmentation procedures, listing multiple libraries in
`FragProc` will apply the procedures consecutively. Using the first example
listed in [Automatic Fragmentation](#sec:essentialelements.fragmentation.inputfile)
([input](#sec:essentialelements.fragmentation.intlibex)), and including
`PrintLevel 3` in the `%frag` block to increase verbosity, the output
will indicate the assignment of fragments by the `Water` and `Functional_groups`
procedures.
```orca
===================================================
Tfragmentator: Fragmenting by Water
===================================================
------------------------------------------
Match: 1, Assigned Fragment: 1
------------------------------------------
Name: WATER Method: Water
Natoms: 3 Charge: 0 Mult: 1
11 H -0.920000 0.850000 -2.430000
12 O -1.690000 0.830000 -3.000000
13 H -1.640000 1.630000 -3.510000
------------------------------------------
===================================================
Tfragmentator: Fragmenting by Functional_groups
===================================================
------------------------------------------
Match: 1, Assigned Fragment: 2
------------------------------------------
Name: CH3 Method: Functional_groups
Natoms: 4 Charge: 0 Mult: 1
0 C 0.000000 1.270000 -0.260000
3 H -0.890000 1.320000 -0.910000
4 H 0.890000 1.320000 -0.910000
5 H 0.000000 2.180000 0.370000
------------------------------------------
------------------------------------------
Match: 2, Assigned Fragment: 3
------------------------------------------
Name: CH3 Method: Functional_groups
Natoms: 4 Charge: 0 Mult: 1
2 C 0.000000 -1.270000 -0.260000
8 H 0.890000 -1.320000 -0.910000
9 H 0.000000 -2.180000 0.370000
10 H -0.880000 -1.300000 -0.910000
------------------------------------------
------------------------------------------
Match: 1, Assigned Fragment: 4
------------------------------------------
Name: CH2 Method: Functional_groups
Natoms: 3 Charge: 0 Mult: 1
1 C 0.000000 -0.000000 0.580000
6 H 0.880000 -0.000000 1.250000
7 H -0.880000 -0.000000 1.250000
------------------------------------------
```
(sec:essentialelements.fragmentation.inputfile.externallib)=
### Automatic Fragmentation: External Libraries
The automated fragmentator also allows users to supply `.xyz` files containing
geometries that should be recognized as fragments. The files are listed with the
`XZYFRAGLIB` keyword in the `%frag` block. The `FragProc Extlib` procedure
converts each provided geometry into a molecular graph and then applies the
VF2 subgraph isomorphism algorithm, just as is done with the internal
libraries.
The following example uses definitions of CH3O and CH3 fragments
from the file `Mylib.xyz` to identify and generate fragments within the
dimethyl ether geometry provided below:
(sec:essentialelements.fragmentation.intextlib)=
```orca
%frag
PrintLevel 3
FragProc Extlib
XZYFRAGLIB "Mylib.xyz"
end
*xyz 0 1
O 0.000000 0.000000 0.000000
C 0.000000 0.000000 1.380000
C 1.300000 0.000000 -0.460000
H -0.500000 0.870000 1.740000
H -0.500000 -0.870000 1.740000
H 1.000000 0.000000 1.740000
H 1.300000 0.000000 -1.530000
H 1.800000 0.870000 -0.100000
H 1.800000 -0.870000 -0.100000
*
```
Here, `Mylib.xyz` contains definitions for methoxy and methyl groups:
```orca
5
CHARGE 0 MULT 1 NAME CH3O
O 0.000000 0.000000 0.000000
C 0.000000 0.000000 1.380000
H 1.008807 0.000000 1.736663
H -0.328435 0.953845 1.736667
H -0.794950 -0.621083 1.736667
4
CHARGE 0 MULT 1 NAME CH3
C 0.000000 0.000000 0.000000
H 0.000000 0.000000 1.070000
H 1.008807 0.000000 -0.356663
H -0.328435 -0.953845 -0.356667
```
As a result, atoms 0, 1, 3, 4, and 5 are assigned to a
CH3O fragment and the remaining atoms to a
CH3 fragment, both by the `Ext_lib` procedure.
```orca
===================================================
Tfragmentator: Fragmenting by Ext_lib
===================================================
****
**** There are 2 Ref structures found in file Mylib.xyz
****
------------------------------------------
Match: 1, Assigned Fragment: 1
------------------------------------------
Name: CH3O Method: Ext_lib
Natoms: 5 Charge: 0 Mult: 1
0 O 0.000000 0.000000 0.000000
1 C 0.000000 0.000000 1.380000
3 H -0.500000 0.870000 1.740000
4 H -0.500000 -0.870000 1.740000
5 H 1.000000 0.000000 1.740000
------------------------------------------
------------------------------------------
Match: 1, Assigned Fragment: 2
------------------------------------------
Name: CH3 Method: Ext_lib
Natoms: 4 Charge: 0 Mult: 1
2 C 1.300000 0.000000 -0.460000
6 H 1.300000 0.000000 -1.530000
7 H 1.800000 0.870000 -0.100000
8 H 1.800000 -0.870000 -0.100000
------------------------------------------
```
:::{Note}
- `XZYFRAGLIB` allows the inclusion of up to 10 files. Each file may
contain multiple fragment definitions; however, each definition must
consist of a single, connected molecule—unconnected structures within
a single definition are not allowed.
- The format of `Mylib.xyz` follows the standard XYZ file structure,
but optionally supports three identifiers: `CHARGE`, `MULT`, and `NAME`, with
`NAME` expected to appear last. These identifiers are printed when a
fragment is recognized, though they are not currently used in any calculations.
- Subsequent modifications to fragments by other methods (see
[Fusebyatoms](#sec:essentialelements.fragmentation.inputfile.fuse)
and [Extend](#sec:essentialelements.fragmentation.inputfile.extend) below)
do not affect the fragment's CHARGE or MULT values in the identifiers.
:::
:::{important}
- ORCA’s VF2 subgraph isomorphism algorithm is based solely on atomic
connectivity; it does not consider stereochemistry or bond orders.
- Fragment assignment by `FragProc Extlib` follows the order of the
files listed in `XZYFRAGLIB` and the order of fragment definitions within
each file. Since an atom is excluded from subsequent matching once it has
been assigned to a fragment, the order in which fragments are defined in
the library is critically important. For example, in the previous
[case](#sec:essentialelements.fragmentation.intextlib), if CH3
is defined before CH3O in `Mylib.xyz`, both
CH3 groups in the [input file](#sec:essentialelements.fragmentation.intextlib)
will be matched first. This prevents CH3O from being recognized
later, even if it is present in the system.
- Atom assignment to fragments follows the order in which atoms appear in the
geometry. In the [previous example](#sec:essentialelements.fragmentation.intextlib),
the CH3O fragment could be assigned using either the C(1)-O or C(2)-O bond.
However, since C(1) appears first in the geometry, it is selected for the CH3O
fragment.
:::
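The order dependence described above can be demonstrated with a toy matcher in Python. The sketch makes several simplifying assumptions that should not be read as ORCA's implementation: it uses a plain backtracking search instead of VF2, the bond lists of the molecule and of the library entries are written by hand rather than derived from coordinates, and all names (`find_match`, `fragment_by_library`, and so on) are hypothetical. Like the procedure described above, it matches on element and connectivity only and removes matched atoms from further consideration.

```python
def adjacency(n_atoms, bonds):
    adj = [set() for _ in range(n_atoms)]
    for a, b in bonds:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def find_match(pat_elems, pat_bonds, elems, bonds, free):
    """Map pattern atoms onto free atoms, trying low atom indices first."""
    pat_adj = adjacency(len(pat_elems), pat_bonds)
    adj = adjacency(len(elems), bonds)

    def extend(mapping):
        k = len(mapping)
        if k == len(pat_elems):
            return mapping
        for cand in sorted(free):
            if cand in mapping or elems[cand] != pat_elems[k]:
                continue
            # every pattern bond to an already mapped atom must exist
            if all(mapping[j] in adj[cand] for j in pat_adj[k] if j < k):
                result = extend(mapping + [cand])
                if result is not None:
                    return result
        return None

    return extend([])


def fragment_by_library(library, elems, bonds):
    """Assign fragments following the order of the library entries."""
    free = set(range(len(elems)))
    fragments = []
    for name, pat_elems, pat_bonds in library:
        while True:
            match = find_match(pat_elems, pat_bonds, elems, bonds, free)
            if match is None:
                break
            fragments.append((name, sorted(match)))
            free -= set(match)  # matched atoms are no longer available
    return fragments


# Dimethyl ether with the atom order of the example above.
ELEMS = ["O", "C", "C", "H", "H", "H", "H", "H", "H"]
BONDS = [(0, 1), (0, 2), (1, 3), (1, 4), (1, 5), (2, 6), (2, 7), (2, 8)]
CH3O = ("CH3O", ["O", "C", "H", "H", "H"], [(0, 1), (1, 2), (1, 3), (1, 4)])
CH3 = ("CH3", ["C", "H", "H", "H"], [(0, 1), (0, 2), (0, 3)])

print(fragment_by_library([CH3O, CH3], ELEMS, BONDS))
print(fragment_by_library([CH3, CH3O], ELEMS, BONDS))
```

With CH3O listed first, the sketch assigns atoms 0, 1, 3, 4, and 5 to CH3O and the rest to CH3. With CH3 listed first, both methyl groups are consumed and no CH3O match remains, so the oxygen atom is left unassigned.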
(sec:essentialelements.fragmentation.inputfile.extend)=
### Automatic Fragmentation: Extend
Many fragments in the internal libraries represent incomplete molecular structures.
Therefore, in some cases, it is necessary to add hydrogen atoms or, in
certain situations, a hydroxyl group to complete the system. The `FragProc Extend`
procedure addresses this by identifying oxygen atoms bonded to carbon atoms in previously
assigned fragments, as well as hydrogen atoms bonded to carbon, nitrogen, or oxygen.
It then extends the corresponding fragments to include these atoms.
Consider the example below, which consists of a zwitterionic glycine molecule fragmented
using an external library that defines a C–C–N fragment. In this case, `FragProc Extend`
can be applied to add the missing oxygen and hydrogen atoms, thereby completing the
molecular structure.
```orca
%frag
PrintLevel 3
FragProc Extlib, Extend
XZYFRAGLIB "Mylib.xyz"
end
*xyz 0 1
N 0.000000 0.000000 0.000000
C 0.000000 0.000000 1.460000
C 1.403962 0.000000 2.015868
O 1.837767 0.962613 2.627089
O 2.161436 -0.947142 1.883370
H 0.423874 -0.764689 -0.525339
H -0.538292 -0.884247 1.831954
H -0.538292 0.884247 1.831954
H -0.970478 0.016940 -0.313504
H 0.499909 0.831989 -0.313504
*
```
In this case, `Mylib.xyz` contains:
```orca
3
Name CCN
C 0.000000 0.000000 0.000000
C 0.000000 0.000000 1.500000
N 1.376503 0.000000 -0.486661
```
The result is an initial assignment of the C–C–N fragment, followed by an extension
of the fragment to include two oxygen atoms and five hydrogen atoms.
```orca
------------------------------------------
Match: 1, Assigned Fragment: 1
------------------------------------------
Name: CCN Method: Ext_lib
Natoms: 3 Charge: 0 Mult: 1
0 N 0.000000 0.000000 0.000000
1 C 0.000000 0.000000 1.460000
2 C 1.403962 0.000000 2.015868
------------------------------------------
===================================================
Tfragmentator: Extending Fragments
===================================================
Extending Fragment 0 with O (3)
Extending Fragment 0 with O (4)
Extending Fragment 0 with H (6)
Extending Fragment 0 with H (7)
Extending Fragment 0 with H (5)
Extending Fragment 0 with H (8)
Extending Fragment 0 with H (9)
```
:::{important}
- `FragProc Extend` attempts to extend any previously assigned fragment.
- `FragProc Extend` does not modify the `Charge` and `Mult` identifiers of fragments.
:::
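The extension rule described in this section can be written down as a short Python sketch. It is an illustration under assumptions rather than ORCA's algorithm: the bond list of glycine is typed by hand, the function name `extend_fragment` is hypothetical, and the two-pass order (oxygen atoms first, then hydrogen atoms, so that the hydrogen of an added hydroxyl group is also picked up) is a choice made for the sketch.

```python
def extend_fragment(fragment, elems, bonds, free):
    """Add free O atoms bonded to C, then free H atoms bonded to C, N, or O."""
    adj = [set() for _ in elems]
    for a, b in bonds:
        adj[a].add(b)
        adj[b].add(a)
    extended = set(fragment)
    # Pass 1: oxygen atoms bonded to a carbon atom of the fragment.
    for atom in sorted(fragment):
        if elems[atom] == "C":
            extended |= {n for n in adj[atom] if n in free and elems[n] == "O"}
    # Pass 2: hydrogen atoms bonded to C, N, or O, including added oxygens.
    for atom in sorted(extended):
        if elems[atom] in ("C", "N", "O"):
            extended |= {n for n in adj[atom] if n in free and elems[n] == "H"}
    return sorted(extended)


# Zwitterionic glycine with the atom order of the example above.
GLY_ELEMS = ["N", "C", "C", "O", "O", "H", "H", "H", "H", "H"]
GLY_BONDS = [(0, 1), (1, 2), (2, 3), (2, 4),
             (0, 5), (1, 6), (1, 7), (0, 8), (0, 9)]

ccn = [0, 1, 2]  # atoms matched by the C-C-N library entry
print(extend_fragment(ccn, GLY_ELEMS, GLY_BONDS, free=set(range(3, 10))))
```

For glycine the extended fragment contains all ten atoms, that is, the three matched atoms plus two oxygen and five hydrogen atoms.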
(sec:essentialelements.fragmentation.inputfile.fuse)=
### Automatic Fragmentation: Fusebyatoms
When multiple fragmentation procedures are used in combination, it may be
necessary to merge two previously assigned fragments into one. This can be
achieved using the `FragProc Fusebyatoms` procedure, which identifies atom
pairs that should belong to the same fragment, as specified in `FuseAtomPairs`.
In the example below, the objective is to fragment a propane molecule into a
CH3 group and a CH3–CH2 fragment. One way to
achieve this is by first applying `FragProc FunctionalGroups`, which fragments
the molecule into two CH3 groups and one CH2 group.
The CH2 group can then be merged with one of the CH3 groups.
This is done by specifying the carbon atoms of the CH2 and one
CH3 group in the `FuseAtomPairs` directive: `FuseAtomPairs {0 1} end`.
```orca
%frag
PrintLevel 3
FragProc FunctionalGroups, Fusebyatoms
FuseAtomPairs {0 1} end
end
*xyz 0 1
C 0.000000 1.270000 -0.260000
C 0.000000 -0.000000 0.580000
C 0.000000 -1.270000 -0.260000
H -0.890000 1.320000 -0.910000
H 0.890000 1.320000 -0.910000
H 0.000000 2.180000 0.370000
H 0.880000 -0.000000 1.250000
H -0.880000 -0.000000 1.250000
H 0.890000 -1.320000 -0.910000
H 0.000000 -2.180000 0.370000
H -0.880000 -1.300000 -0.910000
*
```
The output from the fragmentator first reports the initial fragmentation
performed by the `Functional_groups` library, followed by a message
indicating the subsequent fusion of fragments as specified by the
`FuseAtomPairs` directive.
```orca
===================================================
Tfragmentator: Fragmenting by Functional_groups
===================================================
------------------------------------------
Match: 1, Assigned Fragment: 1
------------------------------------------
Name: CH3 Method: Functional_groups
Natoms: 4 Charge: 0 Mult: 1
0 C 0.000000 1.270000 -0.260000
3 H -0.890000 1.320000 -0.910000
4 H 0.890000 1.320000 -0.910000
5 H 0.000000 2.180000 0.370000
------------------------------------------
------------------------------------------
Match: 2, Assigned Fragment: 2
------------------------------------------
Name: CH3 Method: Functional_groups
Natoms: 4 Charge: 0 Mult: 1
2 C 0.000000 -1.270000 -0.260000
8 H 0.890000 -1.320000 -0.910000
9 H 0.000000 -2.180000 0.370000
10 H -0.880000 -1.300000 -0.910000
------------------------------------------
------------------------------------------
Match: 1, Assigned Fragment: 3
------------------------------------------
Name: CH2 Method: Functional_groups
Natoms: 3 Charge: 0 Mult: 1
1 C 0.000000 -0.000000 0.580000
6 H 0.880000 -0.000000 1.250000
7 H -0.880000 -0.000000 1.250000
------------------------------------------
===================================================
Tfragmentator: Fusing Fragments
===================================================
Fusing Fragment 0 (atom 0) with Fragment 2 (atom 1)
```
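The effect of `FuseAtomPairs {0 1} end` on the propane example can be mimicked with a small Python sketch. It is only an illustration: the function name `fuse_by_atoms` is hypothetical, the merged fragment simply keeps the number of the fragment containing the first atom of the pair, and no renumbering is attempted, so the final numbering reported by ORCA may differ.

```python
def fuse_by_atoms(fragment_of, pairs):
    """Merge the fragments that contain the two atoms of each pair."""
    result = dict(fragment_of)
    for first, second in pairs:
        keep, drop = result[first], result[second]
        if keep == drop:
            continue  # the two atoms already share a fragment
        result = {atom: keep if frag == drop else frag
                  for atom, frag in result.items()}
    return result


# Propane after FunctionalGroups: CH3 (1), CH3 (2), and CH2 (3).
propane = {0: 1, 3: 1, 4: 1, 5: 1,
           2: 2, 8: 2, 9: 2, 10: 2,
           1: 3, 6: 3, 7: 3}

fused = fuse_by_atoms(propane, [(0, 1)])
print(sorted(atom for atom, frag in fused.items() if frag == 1))
```

After fusing, the fragment containing atom 0 holds the CH3 and CH2 atoms (0, 1, 3, 4, 5, 6, 7), while the second CH3 group is unchanged.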
(sec:essentialelements.fragmentation.inputfile.delete)=
### Automatic Fragmentation: Delete and advanced fragmentation workflows
Almost all fragmentation schemes have an associated delete procedure, which
removes the fragments generated by that specific scheme. This enables the construction
of advanced fragmentation workflows, where a procedure can temporarily "protect" certain
atoms from being fragmented by subsequent methods. These protected fragments can later be
deleted, allowing the atoms to be reprocessed using a different fragmentation approach.
Consider the system [below](#sec:essentialelements.fragmentation.intdelaa), which
consists of both a phenylalanine molecule and a benzene molecule. The goal is to
fragment the phenylalanine into its backbone, one CH2 group, and the phenyl
ring, while fragmenting the benzene into six individual CH fragments.
On the one hand, the desired fragmentation of phenylalanine can be achieved by combining
three fragmentation procedures: `FragProc AABackbone, Extend, AASCFinegrained`.
The `AABackbone` and `Extend` procedures identify the amino acid backbone, while
`AASCFinegrained` separates the CH2 group and the phenyl ring into distinct fragments.
Meanwhile, the benzene molecule can be fragmented into six individual CH fragments using
`FragProc Extlib`, which relies on an external library defining the CH fragment.
However, these two fragmentation schemes are incompatible. When `FragProc AASCFinegrained`
is applied to the entire system, it fragments not only the phenylalanine side chain but
also breaks the benzene ring into a phenyl group and a hydrogen atom. Conversely, if
`FragProc Extlib` is applied first to benzene, it may inadvertently fragment the phenylalanine
residue, leading to undesired structural splits.
A viable approach to achieve the desired fragmentation involves first identifying the amino acid
using `FragProc Aminoacids`. This procedure assigns all atoms that belong to amino acid fragments,
thereby protecting them from reassignment by subsequent fragmentation steps. The benzene molecule
can then be fragmented using an external library via `FragProc Extlib`. Finally, the phenylalanine
fragment is removed using `FragProc DelAminoacids`, allowing it to be re-fragmented as needed.
All these operations are executed by simply concatenating the procedures as follows:
`FragProc Aminoacids, Extlib, DelAminoacids, AABackbone, Extend, AASCFinegrained`
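The protect-then-delete idea can be illustrated with a small Python toy model. This is only a sketch of the bookkeeping described above, not ORCA's implementation: the class `ToyFragmenter`, the atom indices, and the rule that a match is skipped as soon as one of its atoms is already assigned are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Tuple


class ToyFragmenter:
    """Toy bookkeeping model of sequential fragmentation (hypothetical, not ORCA code)."""

    def __init__(self) -> None:
        # atom index -> (fragment id, name of the procedure that created the fragment)
        self.owner: Dict[int, Tuple[int, str]] = {}
        self.next_id = 1

    def assign(self, method: str, groups: Iterable[Iterable[int]]) -> int:
        """Turn each candidate group into a fragment unless it touches an assigned atom."""
        created = 0
        for group in groups:
            atoms = list(group)
            if any(a in self.owner for a in atoms):
                continue  # assumption: already assigned atoms are protected
            for a in atoms:
                self.owner[a] = (self.next_id, method)
            self.next_id += 1
            created += 1
        return created

    def delete(self, method: str) -> int:
        """Unassign every fragment created by `method`; return how many were removed."""
        removed = {fid for fid, m in self.owner.values() if m == method}
        self.owner = {a: v for a, v in self.owner.items() if v[1] != method}
        return len(removed)

    def fragments(self) -> List[List[int]]:
        """Return the atom lists of all current fragments, ordered by fragment id."""
        by_id: Dict[int, List[int]] = {}
        for atom, (fid, _) in sorted(self.owner.items()):
            by_id.setdefault(fid, []).append(atom)
        return [by_id[fid] for fid in sorted(by_id)]


# Atoms 0-3 stand in for the amino acid, atoms 4-5 for one CH unit of benzene.
# The hypothetical library pattern would also match (2, 3) inside the amino acid.
library_matches = [(2, 3), (4, 5)]

frag = ToyFragmenter()
frag.assign("Aminoacids", [(0, 1, 2, 3)])       # protect the residue
n_lib = frag.assign("Extlib", library_matches)  # only (4, 5) is still free
n_del = frag.delete("Aminoacids")               # release the residue again
frag.assign("AABackbone", [(0, 1)])             # re-fragment the released atoms
frag.assign("AASCFinegrained", [(2,), (3,)])

print(n_lib, n_del, frag.fragments())
```

In this toy run the library procedure creates only one fragment because the pair `(2, 3)` is protected, and the final fine-grained steps can act on the residue only after the protecting fragment has been deleted.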
(sec:essentialelements.fragmentation.intdelaa)=
```orca
%frag
PrintLevel 3
FragProc Aminoacids, Extlib, DelAminoacids, AABackbone, Extend, AASCFinegrained
XZYFRAGLIB "Mylib.xyz"
STOREFRAGS true
end
*xyz 0 1
O 1.300780 3.093320 1.571260
C 0.407700 2.826000 0.770390
O -0.101890 3.606620 -0.030470
C -0.115350 1.396690 0.770390
N -1.564350 1.396690 0.770390
H -1.900780 1.872730 1.595200
H 0.242580 0.884560 1.663540
H -1.901490 0.444630 0.770390
H -1.901016 1.873069 -0.054118
C 0.436090 0.696200 -0.461750
H 1.512450 0.863150 -0.502770
H -0.018950 1.150700 -1.341790
C 0.178410 -0.790680 -0.515340
C -0.895760 -1.288470 -1.262580
C -1.134680 -2.667040 -1.312270
C -0.299410 -3.547820 -0.614730
C 0.774760 -3.050040 0.132500
C 1.013680 -1.671470 0.182200
H 1.842330 -1.287460 0.758630
H 1.419110 -3.729500 0.670600
H -0.483720 -4.611290 -0.653070
H -1.963330 -3.051050 -1.888700
H -1.540110 -0.609010 -1.800680
C 1.181325 -3.732146 -2.620584
C 2.016592 -4.612923 -1.923046
C 3.090772 -4.115131 -1.175824
C 3.329684 -2.736563 -1.126139
C 2.494417 -1.855785 -1.823677
C 1.420237 -2.353577 -2.570900
H 0.770518 -1.668458 -3.113485
H 0.422252 -4.288369 -3.129823
H 1.830753 -5.685253 -1.961693
H 3.740491 -4.800250 -0.633239
H 4.165243 -2.349352 -0.544907
H 2.680256 -0.783455 -1.785030
*
```
Here, the file `Mylib.xyz` contains:
```orca
2
NAME CH
C 0.00 0.00 0.00
H 1.08 0.00 0.00
```
The complete fragmentation sequence follows the same structure as in previous examples.
In this case, the message `Deleted 1 fragments` is printed after the fragment assignment
performed by `Ext_lib`, indicating that a fragment previously defined by
`FragProc Aminoacids` has been successfully removed.
```orca
===================================================
Tfragmentator: Fragmenting by Amino_Acids
===================================================
------------------------------------------
Match: 1, Assigned Fragment: 1
------------------------------------------
Name: CPHE Method: Amino_Acids
Natoms: 21 Charge: -1 Mult: 1
0 O 1.300780 3.093320 1.571260
1 C 0.407700 2.826000 0.770390
2 O -0.101890 3.606620 -0.030470
3 C -0.115350 1.396690 0.770390
4 N -1.564350 1.396690 0.770390
5 H -1.900780 1.872730 1.595200
6 H 0.242580 0.884560 1.663540
9 C 0.436090 0.696200 -0.461750
10 H 1.512450 0.863150 -0.502770
11 H -0.018950 1.150700 -1.341790
12 C 0.178410 -0.790680 -0.515340
13 C -0.895760 -1.288470 -1.262580
14 C -1.134680 -2.667040 -1.312270
15 C -0.299410 -3.547820 -0.614730
16 C 0.774760 -3.050040 0.132500
17 C 1.013680 -1.671470 0.182200
18 H 1.842330 -1.287460 0.758630
19 H 1.419110 -3.729500 0.670600
20 H -0.483720 -4.611290 -0.653070
21 H -1.963330 -3.051050 -1.888700
22 H -1.540110 -0.609010 -1.800680
------------------------------------------
===================================================
Tfragmentator: Fragmenting by Ext_lib
===================================================
****
**** There are 1 Ref structures found in file Mylib.xyz
****
------------------------------------------
Match: 1, Assigned Fragment: 2
------------------------------------------
Name: CH Method: Ext_lib
Natoms: 2 Charge: 0 Mult: 1
23 C 1.181325 -3.732146 -2.620584
30 H 0.422252 -4.288369 -3.129823
------------------------------------------
------------------------------------------
Match: 2, Assigned Fragment: 3
------------------------------------------
Name: CH Method: Ext_lib
Natoms: 2 Charge: 0 Mult: 1
24 C 2.016592 -4.612923 -1.923046
31 H 1.830753 -5.685253 -1.961693
------------------------------------------
...
------------------------------------------
Match: 6, Assigned Fragment: 7
------------------------------------------
Name: CH Method: Ext_lib
Natoms: 2 Charge: 0 Mult: 1
28 C 1.420237 -2.353577 -2.570900
29 H 0.770518 -1.668458 -3.113485
------------------------------------------
===================================================
Tfragmentator: Deleting Fragments (Amino_Acids)
===================================================
Deleted 1 fragments
===================================================
Tfragmentator: Fragmenting by Backbone
===================================================
------------------------------------------
Match: 1, Assigned Fragment: 7
------------------------------------------
Name: CO-NH3-CH Method: AA_Backbone
Natoms: 8 Charge: 0 Mult: 0
0 O 1.300780 3.093320 1.571260
1 C 0.407700 2.826000 0.770390
3 C -0.115350 1.396690 0.770390
4 N -1.564350 1.396690 0.770390
5 H -1.900780 1.872730 1.595200
6 H 0.242580 0.884560 1.663540
7 H -1.901490 0.444630 0.770390
8 H -1.901016 1.873069 -0.054118
------------------------------------------
===================================================
Tfragmentator: Extending Fragments
===================================================
Extending Fragment 6 with O (2)
===================================================
Tfragmentator: Fragmenting by AA_SideChains_FG
===================================================
------------------------------------------
Match: 1, Assigned Fragment: 8
------------------------------------------
Name: Ph Method: AA_SideChains_FG
Natoms: 11 Charge: 0 Mult: 1
12 C 0.178410 -0.790680 -0.515340
13 C -0.895760 -1.288470 -1.262580
14 C -1.134680 -2.667040 -1.312270
15 C -0.299410 -3.547820 -0.614730
16 C 0.774760 -3.050040 0.132500
17 C 1.013680 -1.671470 0.182200
18 H 1.842330 -1.287460 0.758630
19 H 1.419110 -3.729500 0.670600
20 H -0.483720 -4.611290 -0.653070
21 H -1.963330 -3.051050 -1.888700
22 H -1.540110 -0.609010 -1.800680
------------------------------------------
------------------------------------------
Match: 1, Assigned Fragment: 9
------------------------------------------
Name: CH2 Method: AA_SideChains_FG
Natoms: 3 Charge: 0 Mult: 1
9 C 0.436090 0.696200 -0.461750
10 H 1.512450 0.863150 -0.502770
11 H -0.018950 1.150700 -1.341790
------------------------------------------
```
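For long fragmentation logs it can help to tabulate the assignments with a small script. The Python sketch below is not part of ORCA; it assumes the `Match: ..., Assigned Fragment: ...`, `Name: ... Method: ...`, and `Deleted N fragments` line formats exactly as they appear in the output above, and the function name `summarize_fragment_log` is hypothetical.

```python
import re
from typing import List, Tuple

# Patterns inferred from the printed output in this section (assumption).
MATCH_RE = re.compile(r"Match:\s*(\d+),\s*Assigned Fragment:\s*(\d+)")
NAME_RE = re.compile(r"Name:\s*(\S+)\s+Method:\s*(\S+)")
DELETED_RE = re.compile(r"Deleted\s+(\d+)\s+fragments")


def summarize_fragment_log(text: str) -> Tuple[List[Tuple[int, str, str]], int]:
    """Return ([(fragment id, name, method), ...], total number of deleted fragments)."""
    assigned: List[Tuple[int, str, str]] = []
    deleted = 0
    pending = None  # fragment id from the most recent 'Match:' line
    for line in text.splitlines():
        m = MATCH_RE.search(line)
        if m:
            pending = int(m.group(2))
            continue
        m = NAME_RE.search(line)
        if m and pending is not None:
            assigned.append((pending, m.group(1), m.group(2)))
            pending = None
            continue
        m = DELETED_RE.search(line)
        if m:
            deleted += int(m.group(1))
    return assigned, deleted


# Short excerpt of the output shown above.
sample = """\
Match: 1, Assigned Fragment: 1
Name: CPHE Method: Amino_Acids
Natoms: 21 Charge: -1 Mult: 1
Match: 1, Assigned Fragment: 2
Name: CH Method: Ext_lib
Deleted 1 fragments
Match: 1, Assigned Fragment: 7
Name: CO-NH3-CH Method: AA_Backbone
"""

assigned, deleted = summarize_fragment_log(sample)
for frag_id, name, method in assigned:
    print(frag_id, name, method)
print("deleted:", deleted)
```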
Table {numref}`tab:essentialelements.delprocedures` lists the delete
procedure associated with each fragmentation method. Procedures with an empty
second column (`NotAssigned`, `Extend`, and `FuseByAtoms`) have no delete counterpart.
(tab:essentialelements.delprocedures)=
:::{table} Simple input keywords for Fragment detection and their corresponding deletion procedure
| Fragment detection Keyword | Fragment deletion Keyword |
|:---------------------------|:---------------------------|
| `Extlib` | `DELExtlib` |
| `Connectivity` | `DELConnectivity` |
| `Atomic` | `DELAtomic` |
| `FunctionalGroups` | `DELFunctionalGroups` |
| `NotAssigned` | |
| `Backbone` | `DELBackbone` |
| `SeqBackbone` | `DELSeqBackbone` |
| `AABackbone` | `DELAABackbone` |
| `Aminoacids` | `DELAminoacids` |
| `AASideChains` | `DELAASideChains` |
| `AASCFineGrained` | `DELAASCFineGrained` |
| `NABackbone` | `DELNABackbone` |
| `NABBFineGrained` | `DELNABBFineGrained` |
| `SEQNABackbone` | `DELSEQNABackbone` |
| `NucleoticAcid` | `DELNucleoticAcid` |
| `NASideChains` | `DELNASideChains` |
| `Solvents` | `DELSolvents` |
| `Water` | `DELWater` |
| `Extend` | |
| `FuseByAtoms` | |
:::
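When `FragProc` sequences are assembled by a script, the pairing in the table above can be encoded directly. The following Python sketch is a convenience helper under stated assumptions, not an ORCA feature: the function names `delete_keyword` and `protect_workflow` are hypothetical, keywords are spelled as in the table, and the lookup ignores case.

```python
from typing import Sequence

# Detection keywords that have a delete counterpart, spelled as in the table above.
WITH_DELETE = [
    "Extlib", "Connectivity", "Atomic", "FunctionalGroups", "Backbone",
    "SeqBackbone", "AABackbone", "Aminoacids", "AASideChains", "AASCFineGrained",
    "NABackbone", "NABBFineGrained", "SEQNABackbone", "NucleoticAcid",
    "NASideChains", "Solvents", "Water",
]
# Procedures listed in the table without a delete counterpart.
WITHOUT_DELETE = ["NotAssigned", "Extend", "FuseByAtoms"]

_CANONICAL = {k.lower(): k for k in WITH_DELETE}
_NO_DELETE = {k.lower() for k in WITHOUT_DELETE}


def delete_keyword(detection: str) -> str:
    """Return the delete keyword paired with a detection keyword in the table."""
    key = detection.lower()
    if key in _NO_DELETE:
        raise ValueError(f"{detection} has no delete procedure in the table")
    if key not in _CANONICAL:
        raise KeyError(f"{detection} is not listed in the table")
    return "DEL" + _CANONICAL[key]


def protect_workflow(protect: str, middle: Sequence[str], final: Sequence[str]) -> str:
    """Build 'protect, middle..., delete(protect), final...' as a FragProc value."""
    return ", ".join([protect, *middle, delete_keyword(protect), *final])


# Reproduces the sequence used in the phenylalanine/benzene example.
print("FragProc", protect_workflow(
    "Aminoacids", ["Extlib"], ["AABackbone", "Extend", "AASCFinegrained"]))
```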
(sec:essentialelements.fragmentation.inputfile.keywordlist)=
## Options available in the `%frag` input block
Table {numref}`tab:essentialelements.keywordlist` contains a list of the options available in the `%frag` input block.
:::{tabularcolumns} \Y{0.2}\Y{0.1}\Y{0.15}\Y{0.55}
:::
(tab:essentialelements.keywordlist)=
:::{list-table} List of options in the `%frag` input block
:header-rows: 1
:class: longtable
* - Option
- Type
- Default
- Description
* - `Printlevel`
- Integer
- `1`
- Controls the verbosity of the automated fragmentation output.
* - `STOREFRAGS`
- Boolean
- `False`
- Stores assigned fragments in a `.fragments.xyz` file.
* - `DoInterFragBonds`
- Boolean
- `False`
- Automatically detects bonds between fragments for CoVaLED analysis.
* - `XZYFRAGLIB`
- String
- `None`
- Name of the fragment library file used by `FragProc Extlib`.
* - `FragProc`
- See {numref}`tab:essentialelements.fragmentationprocedures` and {numref}`tab:essentialelements.delprocedures`
- `ExtLib, Connectivity`
- Fragmentation procedures to be applied automatically.
* - `Usetopology`
- Boolean
- `False`
- Generates the main geometry graph from a `.prms` file.
* - `TopolFile`
- String
- `""`
- Name of the topology file to be used when `Usetopology True`.
* - `PrintInputFlags`
- Boolean
- `True`
- Writes a `%frag` block equivalent to the fragments of the current calculation.
:::