(sec:essentialelements.fragmentation)=
# Fragment Specification

Atoms in a calculation can be grouped into specific *fragments*, which serve multiple purposes. Fragment definitions can be used to [assign Basis Sets and ECPs](#sec:essentialelements.basisset.fragments), organize output in the [population analysis section](#sec:spectroscopyproperties.pop.lsa), and enable features like [constrained fragment optimization](#sec:structurereactivity.geomopt.constrainedfragment) and [Rigid Body Optimization](#sec:structurereactivity.optimization.rigidbody). They are also used in [local energy decomposition](#sec:spectroscopyproperties.led) and [multi-level calculations](#sec:modelchemistries.mdci.multilevel).

(sec:essentialelements.fragmentation.Inputdef)=
## Fragments Defined in the Input File

There are three ways to assign atoms to fragments using the input file. The first method is to assign a specific atom to a specific fragment by placing `(n)` directly after the atomic symbol in the coordinates section.

```orca
*xyz -2 2
Cu(1)   0.00   0.00   0.00
Cl(2)   2.25   0.00   0.00
Cl(2)  -2.25   0.00   0.00
Cl(2)   0.00   2.25   0.00
Cl(2)   0.00  -2.25   0.00
*
```

In this example the fragment feature is used to divide the molecule into a "metal" and a "ligand" fragment, and consequently the program will print the metal and ligand charges and populations.

```orca
----------------------------------------------
CARTESIAN COORDINATES OF FRAGMENTS (ANGSTROEM)
----------------------------------------------
 FRAGMENT 1
   Cu     0.000000    0.000000    0.000000
 FRAGMENT 2
   Cl     2.250000    0.000000    0.000000
   Cl    -2.250000    0.000000    0.000000
   Cl     0.000000    2.250000    0.000000
   Cl     0.000000   -2.250000    0.000000
...
----------------------------------------------
MULLIKEN FRAGMENT CHARGES AND SPIN POPULATIONS
----------------------------------------------
Fragment   0 :   0.752589   0.842580
Fragment   1 :  -2.752589   0.157420
Sum of fragment charges         :   -2.0000000
Sum of fragment spin populations:    1.0000000
...
--------------------------------------------
LOEWDIN FRAGMENT CHARGES AND SPIN POPULATONS
--------------------------------------------
Fragment   0 :   0.222028   0.851552
Fragment   1 :  -2.222028   0.148448
```

Alternatively, the `%coords` block can be used for fragment definitions in the same way, by placing `(n)` directly after the atomic symbol.

```orca
%coords
  CTyp   xyz   # the type of coordinates xyz or internal
  Charge -2    # the total charge of the molecule
  Mult   2     # the multiplicity = 2S+1
  coords
    Cu(1)   0.00   0.00   0.00
    Cl(2)   2.25   0.00   0.00
    Cl(2)  -2.25   0.00   0.00
    Cl(2)   0.00   2.25   0.00
    Cl(2)   0.00  -2.25   0.00
  end
end
```

:::{important}
- In cases where all atoms are explicitly assigned to fragments, the fragment numbering must start at 1 and use consecutive integers. Non-consecutive or incorrect numbering may lead to errors.
- If any atom is left unassigned (or explicitly assigned to fragment 0), it will be automatically assigned to a fragment using the fragmentation procedure described in the [Automatic Fragmentation](#sec:essentialelements.fragmentation.inputfile) section. In such cases, where only a subset of atoms is manually assigned to fragments, it is not necessary to use consecutive fragment numbers. However, the highest fragment number specified in the input must be less than the total number of fragments generated by the combination of manual and automatic procedures. If this condition is not met, ORCA will automatically reorder all fragment numbers in ascending order, starting from 1.
:::

Finally, a third way to define fragments consists of using a `Definition` inside the `%frag` block. In this scheme, the fragment number comes first, followed by a list of atoms (enumerated starting from 0) enclosed in curly brackets `{}` and finishing with `end`. Consecutive atoms can also be specified using the notation `initial_atom:final_atom`.

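
The atom-list notation can be checked with a few lines of Python before writing the `%frag` block. The helper below is only an illustrative sketch and not part of ORCA: the names `expand_atom_list` and `expand_definition` are hypothetical, and `a:b` ranges are assumed to include both end points, which is how the inline comments in this section's inputs read them.

```python
def expand_atom_list(spec: str) -> list[int]:
    """Expand the contents of a `{...}` atom list, e.g. "0:1 3:4", into indices."""
    atoms: list[int] = []
    for token in spec.split():
        if ":" in token:
            first, last = (int(part) for part in token.split(":"))
            if last < first:
                raise ValueError(f"descending range: {token}")
            atoms.extend(range(first, last + 1))  # assumption: inclusive range
        else:
            atoms.append(int(token))
    return atoms


def expand_definition(definition: dict[int, str], natoms: int) -> dict[int, list[int]]:
    """Expand every fragment and reject out-of-range or doubly assigned atoms."""
    seen: set[int] = set()
    expanded: dict[int, list[int]] = {}
    for fragment, spec in definition.items():
        atoms = expand_atom_list(spec)
        for atom in atoms:
            if not 0 <= atom < natoms:
                raise ValueError(f"atom {atom} is outside 0..{natoms - 1}")
            if atom in seen:
                raise ValueError(f"atom {atom} is assigned twice")
            seen.add(atom)
        expanded[fragment] = atoms
    return expanded


if __name__ == "__main__":
    # Fragment 1 holds atom 2, fragment 2 holds atoms 0, 1, 3 and 4.
    print(expand_definition({1: "2", 2: "0:1 3:4"}, natoms=5))
```

For the `{0:1 3:4}` definition used in this section, the helper places atom 2 in fragment 1 and atoms 0, 1, 3, and 4 in fragment 2. Rejecting duplicates is a choice made for this sketch; how ORCA itself reacts to an atom listed twice is not stated here.
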
```orca
*xyz -2 2
Cu   0.00   0.00   0.00
Cl   2.25   0.00   0.00
Cl  -2.25   0.00   0.00
Cl   0.00   2.25   0.00
Cl   0.00  -2.25   0.00
*
%frag
  Definition
    1 {0} end     # atom 0 for fragment 1
    2 {1:4} end   # atoms 1 to 4 for fragment 2
  end
end
```

```orca
*xyz -2 2
Cl   2.25   0.00   0.00
Cl  -2.25   0.00   0.00
Cu   0.00   0.00   0.00
Cl   0.00   2.25   0.00
Cl   0.00  -2.25   0.00
*
%frag
  Definition
    1 {2} end         # atom 2 for fragment 1
    2 {0:1 3:4} end   # atoms 0, 1, 3, and 4 for fragment 2
  end
end
```

:::{Note}
- With the last option (`Definition`) the `%frag` block has to be written after the coordinate section.
- `%frag Definition` also works with coordinates that are defined via an external file.
:::

(sec:essentialelements.fragmentation.inputfile)=
## Automatic Fragmentation

Starting with ORCA 6.1, a set of automatic fragmentation algorithms has been introduced to recognize and group atoms into fragments automatically. The automatic fragmentation procedure is triggered by including a `%frag` block in the input file or when only a subset of atoms has been manually assigned to fragments.

Automatic fragmentation is performed using a set of procedures that can be selected by the user within the `%frag` block. Each procedure attempts to identify new fragments among atoms that have not yet been assigned. For example, consider the case below where the first 11 atoms belong to a propane molecule and the last three belong to a water molecule. The `Water` procedure in `FragProc` identifies the last three atoms as a water molecule and assigns them to the first fragment, while the `FunctionalGroups` procedure detects and assigns the CH3 and CH2 groups of propane as fragments 2 to 4.

(sec:essentialelements.fragmentation.intlibex)=
```orca
%frag
  FragProc Water, FunctionalGroups
end
* xyz 0 1
C    0.000000   1.270000  -0.260000
C    0.000000  -0.000000   0.580000
C    0.000000  -1.270000  -0.260000
H   -0.890000   1.320000  -0.910000
H    0.890000   1.320000  -0.910000
H    0.000000   2.180000   0.370000
H    0.880000  -0.000000   1.250000
H   -0.880000  -0.000000   1.250000
H    0.890000  -1.320000  -0.910000
H    0.000000  -2.180000   0.370000
H   -0.880000  -1.300000  -0.910000
H   -0.920000   0.850000  -2.430000
O   -1.690000   0.830000  -3.000000
H   -1.640000   1.630000  -3.510000
*
```

```orca
----------------------------------------------
CARTESIAN COORDINATES OF FRAGMENTS (ANGSTROEM)
----------------------------------------------
 FRAGMENT 1
   H   -0.920000   0.850000  -2.430000
   O   -1.690000   0.830000  -3.000000
   H   -1.640000   1.630000  -3.510000
 FRAGMENT 2
   C    0.000000   1.270000  -0.260000
   H   -0.890000   1.320000  -0.910000
   H    0.890000   1.320000  -0.910000
   H    0.000000   2.180000   0.370000
 FRAGMENT 3
   C    0.000000  -1.270000  -0.260000
   H    0.890000  -1.320000  -0.910000
   H    0.000000  -2.180000   0.370000
   H   -0.880000  -1.300000  -0.910000
 FRAGMENT 4
   C    0.000000  -0.000000   0.580000
   H    0.880000  -0.000000   1.250000
   H   -0.880000  -0.000000   1.250000
```

This automatic fragmentation yields the same result as the manual fragment definition shown below, without the need to inspect the geometry and assign fragments manually.

```orca
%frag
  Definition
    1 { 11:13} end   # water
    2 { 0 3:5} end   # CH3
    3 { 2 8:10} end  # CH3
    4 { 1 6:7} end   # CH2
  end
end
```

:::{Note}
- Any fragment defined in the [input file](#sec:essentialelements.fragmentation.Inputdef) takes precedence over automatic assignments.
- `FragProc` accepts up to 10 procedures; the full list of available procedures is provided in {numref}`tab:essentialelements.fragmentationprocedures` and {numref}`tab:essentialelements.delprocedures`.
- [Constrained Fragments](#sec:structurereactivity.geomopt.constrainedfragment) do not automatically enable automatic fragmentation when atoms are left unassigned to fragments. However, automatic fragmentation can be activated by including a `%frag` block in the input file.
:::

ORCA provides over 30 `FragProc` methods, which can be combined in a list to generate fragments for various purposes. Below are explanations and examples of the different procedures.

(sec:essentialelements.fragmentation.inputfile.connectivity)=
### Automatic Fragmentation: Connectivity

`FragProc Connectivity` groups atoms that are connected by chemical bonds, estimated based on atomic radii. In the example below, the first nine atoms belong to a dimethyl ether molecule, which is automatically detected and assigned to one fragment, while the last three atoms, belonging to a water molecule, are assigned to a second fragment.

```orca
%frag
  PrintLevel 3
  FragProc Connectivity
end
*xyz 0 1
O    0.000000   0.000000   0.000000
C    0.000000   0.000000   1.380000
C    1.300000   0.000000  -0.460000
H   -0.500000   0.870000   1.740000
H   -0.500000  -0.870000   1.740000
H    1.000000   0.000000   1.740000
H    1.300000   0.000000  -1.530000
H    1.800000   0.870000  -0.100000
H    1.800000  -0.870000  -0.100000
H   -1.840000   0.000000  -0.650000
O   -2.440000   0.000000   0.080000
H   -3.300000   0.000000  -0.300000
*
```

In this case, the output of the automatic fragmentation tool indicates that two fragments were assigned by the `Connectivity` procedure.

```orca
------------------------------------------
Assigned Fragment: 1
------------------------------------------
Name: Connectivity:0
Method: Connectivity
Natoms: 9
Charge: 0
Mult: 1
   0 O    0.000000   0.000000   0.000000
   1 C    0.000000   0.000000   1.380000
   2 C    1.300000   0.000000  -0.460000
   3 H   -0.500000   0.870000   1.740000
   4 H   -0.500000  -0.870000   1.740000
   5 H    1.000000   0.000000   1.740000
   6 H    1.300000   0.000000  -1.530000
   7 H    1.800000   0.870000  -0.100000
   8 H    1.800000  -0.870000  -0.100000
------------------------------------------
------------------------------------------
Assigned Fragment: 2
------------------------------------------
Name: Connectivity:1
Method: Connectivity
Natoms: 3
Charge: 0
Mult: 1
   9 H   -1.840000   0.000000  -0.650000
  10 O   -2.440000   0.000000   0.080000
  11 H   -3.300000   0.000000  -0.300000
------------------------------------------
```

Each fragmentation procedure applies only to atoms that have not already been assigned to a fragment. Consequently, connectivity-based fragmentation will ignore any bonds to atoms that have been assigned in the input file or by a previous `FragProc`. For example, if the oxygen atom of the previous example is manually assigned to fragment 1 in the input file, this manual assignment takes precedence over `FragProc Connectivity`. As a result, four fragments are created: one fragment containing the manually assigned oxygen atom, two CH3 groups, and one water molecule.

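
The effect of a pre-assigned atom on connectivity-based grouping can be reproduced with a short Python sketch. This is not ORCA's implementation: the bond criterion (distance below 1.3 times the sum of approximate covalent radii) and the names `bonded` and `connectivity_fragments` are assumptions made for illustration only. The geometry is the dimethyl ether and water system of this section.

```python
import math

# Approximate covalent radii in Angstrom; the 1.3 scale factor is an assumption
# of this sketch, not ORCA's documented bond criterion.
RADII = {"H": 0.31, "C": 0.76, "O": 0.66}
SCALE = 1.3

ATOMS = [
    ("O", (0.00, 0.00, 0.00)), ("C", (0.00, 0.00, 1.38)), ("C", (1.30, 0.00, -0.46)),
    ("H", (-0.50, 0.87, 1.74)), ("H", (-0.50, -0.87, 1.74)), ("H", (1.00, 0.00, 1.74)),
    ("H", (1.30, 0.00, -1.53)), ("H", (1.80, 0.87, -0.10)), ("H", (1.80, -0.87, -0.10)),
    ("H", (-1.84, 0.00, -0.65)), ("O", (-2.44, 0.00, 0.08)), ("H", (-3.30, 0.00, -0.30)),
]


def bonded(a, b):
    """Return True if two (symbol, position) atoms are within bonding distance."""
    (sym_a, pos_a), (sym_b, pos_b) = a, b
    return math.dist(pos_a, pos_b) < SCALE * (RADII[sym_a] + RADII[sym_b])


def connectivity_fragments(atoms, preassigned=()):
    """Group unassigned atoms into bonded components, ignoring preassigned atoms."""
    free = [i for i in range(len(atoms)) if i not in preassigned]
    seen, fragments = set(), []
    for start in free:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            i = stack.pop()
            component.append(i)
            for j in free:  # bonds to preassigned atoms are never followed
                if j not in seen and bonded(atoms[i], atoms[j]):
                    seen.add(j)
                    stack.append(j)
        fragments.append(sorted(component))
    return fragments


if __name__ == "__main__":
    print(connectivity_fragments(ATOMS))                   # ether and water
    print(connectivity_fragments(ATOMS, preassigned={0}))  # two CH3 groups and water
```

Because the two carbon atoms are linked only through the oxygen, removing atom 0 from the search splits the ether into two CH3 components, which mirrors the four-fragment result described above.
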
```orca
%frag
  PrintLevel 3
  FragProc Connectivity
end
*xyz 0 1
O(1)    0.000000   0.000000   0.000000
C       0.000000   0.000000   1.380000
C       1.300000   0.000000  -0.460000
H      -0.500000   0.870000   1.740000
H      -0.500000  -0.870000   1.740000
H       1.000000   0.000000   1.740000
H       1.300000   0.000000  -1.530000
H       1.800000   0.870000  -0.100000
H       1.800000  -0.870000  -0.100000
H      -1.840000   0.000000  -0.650000
O      -2.440000   0.000000   0.080000
H      -3.300000   0.000000  -0.300000
*
```

The output of the automatic fragmentation tool indicates that the first fragment was assigned by the `Orca_input` procedure, while the remaining three fragments were assigned by the `Connectivity` procedure.

```orca
===================================================
Tfragmentator: Fragmenting by Orca_input
===================================================
------------------------------------------
Assigned Fragment: 1
------------------------------------------
Name: User-defined:0
Method: Orca_input
Natoms: 1
Charge: 0
Mult: 1
InputId: 1
   0 O    0.000000   0.000000   0.000000
------------------------------------------
===================================================
Tfragmentator: Fragmenting by Connectivity
===================================================
------------------------------------------
Assigned Fragment: 2
------------------------------------------
Name: Connectivity:1
Method: Connectivity
Natoms: 4
Charge: 0
Mult: 1
   1 C    0.000000   0.000000   1.380000
   3 H   -0.500000   0.870000   1.740000
   4 H   -0.500000  -0.870000   1.740000
   5 H    1.000000   0.000000   1.740000
------------------------------------------
------------------------------------------
Assigned Fragment: 3
------------------------------------------
Name: Connectivity:2
Method: Connectivity
Natoms: 4
Charge: 0
Mult: 1
   2 C    1.300000   0.000000  -0.460000
   6 H    1.300000   0.000000  -1.530000
   7 H    1.800000   0.870000  -0.100000
   8 H    1.800000  -0.870000  -0.100000
------------------------------------------
------------------------------------------
Assigned Fragment: 4
------------------------------------------
Name: Connectivity:3
Method: Connectivity
Natoms: 3
Charge: 0
Mult: 1
   9 H   -1.840000   0.000000  -0.650000
  10 O   -2.440000   0.000000   0.080000
  11 H   -3.300000   0.000000  -0.300000
------------------------------------------
```

(sec:essentialelements.fragmentation.inputfile.atnoas)=
### Automatic Fragmentation: Atomic and NotAssigned

`FragProc Atomic` and `FragProc NotAssigned` are termination procedures. `FragProc Atomic` assigns each previously unassigned atom to its own individual fragment, while `FragProc NotAssigned` assigns all remaining unassigned atoms to a single fragment. Similar to other `FragProc` methods, the output of the automatic fragmentation tool indicates that the `Atomic` or `Not_assigned` procedure has been used to generate the corresponding fragments.

```orca
%frag
  PrintLevel 3
  FragProc Atomic
end
*xyz 0 1
O(1)    0.000000   0.000000   0.000000
C       0.000000   0.000000   1.380000
C       1.300000   0.000000  -0.460000
H      -0.500000   0.870000   1.740000
H      -0.500000  -0.870000   1.740000
H       1.000000   0.000000   1.740000
H       1.300000   0.000000  -1.530000
H       1.800000   0.870000  -0.100000
H       1.800000  -0.870000  -0.100000
H      -1.840000   0.000000  -0.650000
O      -2.440000   0.000000   0.080000
H      -3.300000   0.000000  -0.300000
*
```

```orca
===================================================
Tfragmentator: Fragmenting by Orca_input
===================================================
------------------------------------------
Assigned Fragment: 1
------------------------------------------
Name: User-defined:0
Method: Orca_input
Natoms: 1
Charge: 0
Mult: 1
InputId: 1
   0 O    0.000000   0.000000   0.000000
------------------------------------------
===================================================
Tfragmentator: Fragmenting by Atomic
===================================================
------------------------------------------
Match: 1, Assigned Fragment: 2
------------------------------------------
Name: C
Method: Atomic
Natoms: 1
Charge: 0
Mult: 1
   1 C    0.000000   0.000000   1.380000
------------------------------------------
------------------------------------------
Match: 2, Assigned Fragment: 3
------------------------------------------
Name: C
Method: Atomic
Natoms: 1
Charge: 0
Mult: 1
   2 C    1.300000   0.000000  -0.460000
------------------------------------------
...
------------------------------------------
Match: 11, Assigned Fragment: 12
------------------------------------------
Name: H
Method: Atomic
Natoms: 1
Charge: 0
Mult: 1
  11 H   -3.300000   0.000000  -0.300000
------------------------------------------
```

```orca
%frag
  PrintLevel 3
  FragProc NotAssigned
end
*xyz 0 1
O(1)    0.000000   0.000000   0.000000
C       0.000000   0.000000   1.380000
C       1.300000   0.000000  -0.460000
H      -0.500000   0.870000   1.740000
H      -0.500000  -0.870000   1.740000
H       1.000000   0.000000   1.740000
H       1.300000   0.000000  -1.530000
H       1.800000   0.870000  -0.100000
H       1.800000  -0.870000  -0.100000
H      -1.840000   0.000000  -0.650000
O      -2.440000   0.000000   0.080000
H      -3.300000   0.000000  -0.300000
*
```

```orca
===================================================
Tfragmentator: Fragmenting by Orca_input
===================================================
------------------------------------------
Assigned Fragment: 1
------------------------------------------
Name: User-defined:0
Method: Orca_input
Natoms: 1
Charge: 0
Mult: 1
InputId: 1
   0 O    0.000000   0.000000   0.000000
------------------------------------------
===================================================
Tfragmentator: Setting not assigned atoms to a fragment
===================================================
------------------------------------------
Assigned Fragment: 1
------------------------------------------
Name: Not Assigned
Method: Not_assigned
Natoms: 11
Charge: 0
Mult: 1
   1 C    0.000000   0.000000   1.380000
   2 C    1.300000   0.000000  -0.460000
   3 H   -0.500000   0.870000   1.740000
   4 H   -0.500000  -0.870000   1.740000
   5 H    1.000000   0.000000   1.740000
   6 H    1.300000   0.000000  -1.530000
   7 H    1.800000   0.870000  -0.100000
   8 H    1.800000  -0.870000  -0.100000
   9 H   -1.840000   0.000000  -0.650000
  10 O   -2.440000   0.000000   0.080000
  11 H   -3.300000   0.000000  -0.300000
------------------------------------------
```

:::{Note}
- `FragProc NotAssigned` is always applied at the end of all `FragProc` procedures to ensure that no atoms remain without a fragment assignment.
:::

(sec:essentialelements.fragmentation.inputfile.intlib)=
### Automatic Fragmentation: Internal Libraries

ORCA includes a series of internal libraries containing definitions of many common molecular structures. In all cases, structure recognition is performed using a VF2 subgraph isomorphism algorithm applied to molecular graphs constructed from Cartesian coordinates. The available fragmentation procedures that make use of internal libraries are listed in {numref}`tab:essentialelements.fragmentationprocedures`.

(tab:essentialelements.fragmentationprocedures)=
:::{table} Simple input keywords for Fragment detection
| Fragment detection Keyword | Description |
|:---------------------------|:---------------------------|
| `FunctionalGroups` | Contains a list of the most common organic functional groups. |
| `Aminoacids` | Contains a list of all amino acids, including all common protonation states, but excluding zwitterionic forms. |
| `AABackbone` | Contains fragment definitions for amino acid backbone detection. |
| `Backbone` | Performs `AABackbone` followed by merging all fragments into a single protein backbone fragment. |
| `SeqBackbone` | Similar to `AABackbone`, but peptide bonds are assigned as separate fragments. |
| `AASidechains` | Contains a list of all amino acid side chains. |
| `AASCFinegrained` | Contains a detailed list of organic functional groups within amino acid side chains. |
| `NABackbone` | Contains fragment definitions for DNA/RNA backbones; the phosphate group is assigned to the 3′ position. |
| `SEQNABackbone` | Contains fragment definitions for DNA/RNA backbones; the phosphate group is assigned to the 5′ position. |
| `NABBFinegrained` | Same as `NABackbone`, but further splits the phosphate group. |
| `Nucleoticacid` | Contains a list of all nucleic acids. |
| `NASidechains` | Contains a list of all nucleic acid side chains. |
| `Solvents` | Contains definitions for common solvents: 1-octanol, n-hexane, cyclohexane, toluene, chlorobenzene, tetrahydrofuran, benzene, N,N-dimethylformamide, pyridine, dimethyl sulfoxide, acetone, ethanol, acetonitrile, methanol, chloroform, carbon tetrachloride, dichloromethane, ammonia, and water. |
| `Water` | Contains definitions of water molecules; faster than `Solvents` when only water molecules need to be identified. |
:::

Similar to other fragmentation procedures, listing multiple libraries in `FragProc` will apply the procedures consecutively. Using the first example listed in [Automatic Fragmentation](#sec:essentialelements.fragmentation.inputfile) ([input](#sec:essentialelements.fragmentation.intlibex)), and including `PrintLevel 3` in the `%frag` block to increase verbosity, the output will indicate the assignment of fragments by the `Water` and `Functional_groups` procedures.

```orca
===================================================
Tfragmentator: Fragmenting by Water
===================================================
------------------------------------------
Match: 1, Assigned Fragment: 1
------------------------------------------
Name: WATER
Method: Water
Natoms: 3
Charge: 0
Mult: 1
  11 H   -0.920000   0.850000  -2.430000
  12 O   -1.690000   0.830000  -3.000000
  13 H   -1.640000   1.630000  -3.510000
------------------------------------------
===================================================
Tfragmentator: Fragmenting by Functional_groups
===================================================
------------------------------------------
Match: 1, Assigned Fragment: 2
------------------------------------------
Name: CH3
Method: Functional_groups
Natoms: 4
Charge: 0
Mult: 1
   0 C    0.000000   1.270000  -0.260000
   3 H   -0.890000   1.320000  -0.910000
   4 H    0.890000   1.320000  -0.910000
   5 H    0.000000   2.180000   0.370000
------------------------------------------
------------------------------------------
Match: 2, Assigned Fragment: 3
------------------------------------------
Name: CH3
Method: Functional_groups
Natoms: 4
Charge: 0
Mult: 1
   2 C    0.000000  -1.270000  -0.260000
   8 H    0.890000  -1.320000  -0.910000
   9 H    0.000000  -2.180000   0.370000
  10 H   -0.880000  -1.300000  -0.910000
------------------------------------------
------------------------------------------
Match: 1, Assigned Fragment: 4
------------------------------------------
Name: CH2
Method: Functional_groups
Natoms: 3
Charge: 0
Mult: 1
   1 C    0.000000  -0.000000   0.580000
   6 H    0.880000  -0.000000   1.250000
   7 H   -0.880000  -0.000000   1.250000
------------------------------------------
```

(sec:essentialelements.fragmentation.inputfile.externallib)=
### Automatic Fragmentation: External Libraries

The automated fragmentator also allows users to supply `.xyz` files via the `XZYFRAGLIB` variable in the `%frag` block, containing geometries that should be recognized as fragments. The `FragProc Extlib` procedure automatically converts each provided geometry into a molecular graph and then applies a VF2 subgraph isomorphism algorithm, just as is done with the internal libraries. The following example uses definitions of CH3O and CH3 fragments from the file `Mylib.xyz` to identify and generate fragments within the dimethyl ether geometry provided below:

(sec:essentialelements.fragmentation.intextlib)=
```orca
%frag
  PrintLevel 3
  FragProc Extlib
  XZYFRAGLIB "Mylib.xyz"
end
*xyz 0 1
O    0.000000   0.000000   0.000000
C    0.000000   0.000000   1.380000
C    1.300000   0.000000  -0.460000
H   -0.500000   0.870000   1.740000
H   -0.500000  -0.870000   1.740000
H    1.000000   0.000000   1.740000
H    1.300000   0.000000  -1.530000
H    1.800000   0.870000  -0.100000
H    1.800000  -0.870000  -0.100000
*
```

Where `Mylib.xyz` contains definitions for methyl and methoxy groups.

```orca
5
CHARGE 0 MULT 1 NAME CH3O
O    0.000000   0.000000   0.000000
C    0.000000   0.000000   1.380000
H    1.008807   0.000000   1.736663
H   -0.328435   0.953845   1.736667
H   -0.794950  -0.621083   1.736667
4
CHARGE 0 MULT 1 NAME CH3
C    0.000000   0.000000   0.000000
H    0.000000   0.000000   1.070000
H    1.008807   0.000000  -0.356663
H   -0.328435  -0.953845  -0.356667
```

The result is the assignment of atoms 0, 1, 3, 4, and 5 to a CH3O fragment, with the remaining atoms assigned to a CH3 fragment by the `Ext_lib` procedure.

```orca
===================================================
Tfragmentator: Fragmenting by Ext_lib
===================================================
****
**** There are 2 Ref structures found in file Mylib.xyz
****
------------------------------------------
Match: 1, Assigned Fragment: 1
------------------------------------------
Name: CH3O
Method: Ext_lib
Natoms: 5
Charge: 0
Mult: 1
   0 O    0.000000   0.000000   0.000000
   1 C    0.000000   0.000000   1.380000
   3 H   -0.500000   0.870000   1.740000
   4 H   -0.500000  -0.870000   1.740000
   5 H    1.000000   0.000000   1.740000
------------------------------------------
------------------------------------------
Match: 1, Assigned Fragment: 2
------------------------------------------
Name: CH3
Method: Ext_lib
Natoms: 4
Charge: 0
Mult: 1
   2 C    1.300000   0.000000  -0.460000
   6 H    1.300000   0.000000  -1.530000
   7 H    1.800000   0.870000  -0.100000
   8 H    1.800000  -0.870000  -0.100000
------------------------------------------
```

:::{Note}
- `XZYFRAGLIB` allows the inclusion of up to 10 files. Each file may contain multiple fragment definitions; however, each definition must consist of a single, connected molecule. Unconnected structures within a single definition are not allowed.
- The format of `Mylib.xyz` follows the standard XYZ file structure, but optionally supports three identifiers: `CHARGE`, `MULT`, and `NAME`, with `NAME` expected to appear last. These identifiers are printed when a fragment is recognized, though they are not currently used in any calculations.
- Subsequent modifications to fragments by other methods (see [Fusebyatoms](#sec:essentialelements.fragmentation.inputfile.fuse) and [Extend](#sec:essentialelements.fragmentation.inputfile.extend) below) do not affect the fragment's `CHARGE` or `MULT` values in the identifiers.
:::

:::{important}
- ORCA’s VF2 subgraph isomorphism algorithm is based solely on atomic connectivity; it does not consider stereochemistry or bond orders.
- Fragment assignment by `FragProc Extlib` follows the order of the files listed in `XZYFRAGLIB` and the order of fragment definitions within each file. Since an atom is excluded from subsequent matching once it has been assigned to a fragment, the order in which fragments are defined in the library is critically important. For example, in the previous [case](#sec:essentialelements.fragmentation.intextlib), if CH3 is defined before CH3O in `Mylib.xyz`, both CH3 groups in the [input file](#sec:essentialelements.fragmentation.intextlib) will be matched first. This prevents CH3O from being recognized later, even if it is present in the system.
- Atom assignment to fragments follows the order in which atoms appear in the geometry. In the [previous example](#sec:essentialelements.fragmentation.intextlib), the CH3O fragment could be assigned using either the C(1)-O or C(2)-O bond. However, since C(1) appears first in the geometry, it is selected for the CH3O fragment.
:::

(sec:essentialelements.fragmentation.inputfile.extend)=
### Automatic Fragmentation: Extend

Many fragments in the internal libraries represent incomplete molecular structures. Therefore, in some cases, it is necessary to add hydrogen atoms or, in certain situations, a hydroxyl group to complete the system. The `FragProc Extend` procedure addresses this by identifying oxygen atoms bonded to carbon atoms in previously assigned fragments, as well as hydrogen atoms bonded to carbon, nitrogen, or oxygen. It then extends the corresponding fragments to include these atoms.

Consider the example below, which consists of a zwitterionic glycine molecule fragmented using an external library that defines a C–C–N fragment. In this case, `FragProc Extend` can be applied to add the missing oxygen and hydrogen atoms, thereby completing the molecular structure.

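
Before turning to the ORCA input, the extension rule itself can be sketched in Python for the same glycine geometry. This is one possible reading of the rule stated above, not ORCA's code: the two-pass order (oxygen first, then hydrogen), the bond criterion (1.3 times the sum of approximate covalent radii), and the names `bonded` and `extend_fragment` are all assumptions of this sketch.

```python
import math

# Approximate covalent radii in Angstrom; the scale factor is an assumption.
RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}
SCALE = 1.3

GLYCINE = [
    ("N", (0.000000, 0.000000, 0.000000)), ("C", (0.000000, 0.000000, 1.460000)),
    ("C", (1.403962, 0.000000, 2.015868)), ("O", (1.837767, 0.962613, 2.627089)),
    ("O", (2.161436, -0.947142, 1.883370)), ("H", (0.423874, -0.764689, -0.525339)),
    ("H", (-0.538292, -0.884247, 1.831954)), ("H", (-0.538292, 0.884247, 1.831954)),
    ("H", (-0.970478, 0.016940, -0.313504)), ("H", (0.499909, 0.831989, -0.313504)),
]


def bonded(a, b):
    """Return True if two (symbol, position) atoms are within bonding distance."""
    (sym_a, pos_a), (sym_b, pos_b) = a, b
    return math.dist(pos_a, pos_b) < SCALE * (RADII[sym_a] + RADII[sym_b])


def extend_fragment(atoms, fragment):
    """Add O bonded to fragment carbons, then H bonded to C, N or O of the fragment."""
    extended = set(fragment)
    # Pass 1: oxygen atoms attached to a carbon of the original fragment.
    for i, (symbol, _) in enumerate(atoms):
        if symbol == "O" and i not in extended and any(
            atoms[j][0] == "C" and bonded(atoms[i], atoms[j]) for j in fragment
        ):
            extended.add(i)
    # Pass 2: hydrogens attached to any C, N or O now in the fragment
    # (including oxygens added in pass 1, which covers a hydroxyl group).
    heavy = [j for j in extended if atoms[j][0] in ("C", "N", "O")]
    for i, (symbol, _) in enumerate(atoms):
        if symbol == "H" and i not in extended and any(
            bonded(atoms[i], atoms[j]) for j in heavy
        ):
            extended.add(i)
    return sorted(extended)


if __name__ == "__main__":
    print(extend_fragment(GLYCINE, fragment=[0, 1, 2]))  # start from the N, C, C atoms
```

Starting from the three heavy atoms matched by the library, the sketch collects the two carboxylate oxygens and the five hydrogens, so the whole molecule ends up in one fragment.
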
```orca
%frag
  PrintLevel 3
  FragProc Extlib, Extend
  XZYFRAGLIB "Mylib.xyz"
end
*xyz 0 1
N    0.000000   0.000000   0.000000
C    0.000000   0.000000   1.460000
C    1.403962   0.000000   2.015868
O    1.837767   0.962613   2.627089
O    2.161436  -0.947142   1.883370
H    0.423874  -0.764689  -0.525339
H   -0.538292  -0.884247   1.831954
H   -0.538292   0.884247   1.831954
H   -0.970478   0.016940  -0.313504
H    0.499909   0.831989  -0.313504
*
```

Where `Mylib.xyz` in this case is:

```orca
3
Name CCN
C    0.000000   0.000000   0.000000
C    0.000000   0.000000   1.500000
N    1.376503   0.000000  -0.486661
```

The result is an initial assignment of the C–C–N fragment, followed by an extension of the fragment to include two oxygen atoms and five hydrogen atoms.

```orca
------------------------------------------
Match: 1, Assigned Fragment: 1
------------------------------------------
Name: CCN
Method: Ext_lib
Natoms: 3
Charge: 0
Mult: 1
   0 N    0.000000   0.000000   0.000000
   1 C    0.000000   0.000000   1.460000
   2 C    1.403962   0.000000   2.015868
------------------------------------------
===================================================
Tfragmentator: Extending Fragments
===================================================
Extending Fragment 0 with O (3)
Extending Fragment 0 with O (4)
Extending Fragment 0 with H (6)
Extending Fragment 0 with H (7)
Extending Fragment 0 with H (5)
Extending Fragment 0 with H (8)
Extending Fragment 0 with H (9)
```

:::{important}
- `FragProc Extend` attempts to extend any previously assigned fragment.
- `FragProc Extend` does not modify the `Charge` and `Mult` identifiers of fragments.
:::

(sec:essentialelements.fragmentation.inputfile.fuse)=
### Automatic Fragmentation: Fusebyatoms

When multiple fragmentation procedures are used in combination, it may be necessary to merge two previously assigned fragments into one. This can be achieved using the `FragProc Fusebyatoms` procedure, which identifies atom pairs that should belong to the same fragment, as specified in `FuseAtomPairs`.
In the example below, the objective is to fragment a propane molecule into a CH3 group and a CH3–CH2 fragment. One way to achieve this is by first applying `FragProc FunctionalGroups`, which fragments the molecule into two CH3 groups and one CH2 group. The CH2 group can then be merged with one of the CH3 groups. This is done by specifying the carbon atoms of the CH2 and one CH3 group in the `FuseAtomPairs` directive: `FuseAtomPairs {0 1} end`.

```orca
%frag
  printlevel 3
  FragProc FunctionalGroups, Fusebyatoms
  FuseAtomPairs {0 1} end
end
*xyz 0 1
C    0.000000   1.270000  -0.260000
C    0.000000  -0.000000   0.580000
C    0.000000  -1.270000  -0.260000
H   -0.890000   1.320000  -0.910000
H    0.890000   1.320000  -0.910000
H    0.000000   2.180000   0.370000
H    0.880000  -0.000000   1.250000
H   -0.880000  -0.000000   1.250000
H    0.890000  -1.320000  -0.910000
H    0.000000  -2.180000   0.370000
H   -0.880000  -1.300000  -0.910000
*
```

The output from the fragmentator first reports the initial fragmentation performed by the `Functional_groups` library, followed by a message indicating the subsequent fusion of fragments as specified by the `FuseAtomPairs` directive.

```orca
===================================================
Tfragmentator: Fragmenting by Functional_groups
===================================================
------------------------------------------
Match: 1, Assigned Fragment: 1
------------------------------------------
Name: CH3
Method: Functional_groups
Natoms: 4
Charge: 0
Mult: 1
   0 C    0.000000   1.270000  -0.260000
   3 H   -0.890000   1.320000  -0.910000
   4 H    0.890000   1.320000  -0.910000
   5 H    0.000000   2.180000   0.370000
------------------------------------------
------------------------------------------
Match: 2, Assigned Fragment: 2
------------------------------------------
Name: CH3
Method: Functional_groups
Natoms: 4
Charge: 0
Mult: 1
   2 C    0.000000  -1.270000  -0.260000
   8 H    0.890000  -1.320000  -0.910000
   9 H    0.000000  -2.180000   0.370000
  10 H   -0.880000  -1.300000  -0.910000
------------------------------------------
------------------------------------------
Match: 1, Assigned Fragment: 3
------------------------------------------
Name: CH2
Method: Functional_groups
Natoms: 3
Charge: 0
Mult: 1
   1 C    0.000000  -0.000000   0.580000
   6 H    0.880000  -0.000000   1.250000
   7 H   -0.880000  -0.000000   1.250000
------------------------------------------
===================================================
Tfragmentator: Fusing Fragments
===================================================
Fusing Fragment 0 (atom 0) with Fragment 2 (atom 1)
```

(sec:essentialelements.fragmentation.inputfile.delete)=
### Automatic Fragmentation: Delete and advanced fragmentation workflows

Almost all fragmentation schemes have an associated delete procedure, which removes the fragments generated by that specific scheme. This enables the construction of advanced fragmentation workflows, where a procedure can temporarily "protect" certain atoms from being fragmented by subsequent methods. These protected fragments can later be deleted, allowing the atoms to be reprocessed using a different fragmentation approach.

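
The protect-then-delete idea can be modelled with a toy Python bookkeeping scheme. Nothing here is ORCA code: `claim` and `delete` are hypothetical helpers, the atom indices are invented, and the only behaviour modelled is the one stated in this section, namely that a procedure only takes atoms that are still unassigned and that a delete procedure frees the atoms assigned by one specific scheme.

```python
def claim(owner, atoms, method):
    """Assign every still-free atom in `atoms` to `method`; return the atoms taken."""
    taken = [atom for atom in atoms if owner[atom] is None]
    for atom in taken:
        owner[atom] = method
    return taken


def delete(owner, method):
    """Free all atoms currently assigned by `method`; return the atoms released."""
    freed = [atom for atom, current in owner.items() if current == method]
    for atom in freed:
        owner[atom] = None
    return freed


if __name__ == "__main__":
    # Toy system: atoms 0-3 stand for an amino acid, atoms 4-5 for a second molecule.
    owner = {atom: None for atom in range(6)}
    amino_acid = [0, 1, 2, 3]
    library_matches = [2, 3, 4, 5]  # the library pattern would also match atoms 2-3

    claim(owner, amino_acid, "Aminoacids")      # protect the amino acid
    claim(owner, library_matches, "Extlib")     # only atoms 4-5 are still free
    delete(owner, "Aminoacids")                 # release the protected atoms
    claim(owner, amino_acid, "AASCFinegrained") # re-fragment them differently
    print(owner)
```

In this toy run the library step cannot touch atoms 2 and 3 because they are protected at that point, which is the behaviour the workflow in this section relies on.
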
Consider the system [below](#sec:essentialelements.fragmentation.intdelaa), which consists of both a phenylalanine molecule and a benzene molecule. The goal is to fragment the phenylalanine into its backbone, one CH2 group, and the phenyl ring, while fragmenting the benzene into six individual CH fragments.

On one hand, by combining three fragmentation procedures (`FragProc AABackbone, Extend, AASCFinegrained`), the desired fragmentation of phenylalanine can be achieved. The `AABackbone` and `Extend` options identify the amino acid backbone, while `AASCFinegrained` separates the CH2 group and the phenyl ring into distinct fragments. Meanwhile, the benzene molecule can be fragmented into six individual CH fragments using `FragProc Extlib`, which relies on an external library defining the CH fragment.

However, these two fragmentation schemes are incompatible. When `FragProc AASCFinegrained` is applied to the entire system, it fragments not only the phenylalanine side chain but also breaks the benzene ring into a phenyl group and a hydrogen atom. Conversely, if `FragProc Extlib` is applied first to benzene, it may inadvertently fragment the phenylalanine residue, leading to undesired structural splits.

A viable approach to achieve the desired fragmentation involves first identifying the amino acid using `FragProc Aminoacids`. This procedure assigns all atoms that belong to amino acid fragments, thereby protecting them from reassignment by subsequent fragmentation steps. The benzene molecule can then be fragmented using an external library via `FragProc Extlib`. Finally, the phenylalanine fragment is removed using `FragProc DelAminoacids`, allowing it to be re-fragmented as needed.
All these operations are executed by simply concatenating the procedures as follows: `FragProc Aminoacids, Extlib, DelAminoacids, AABackbone, Extend, AASCFinegrained`

(sec:essentialelements.fragmentation.intdelaa)=
```orca
%frag
  PrintLevel 3
  FragProc Aminoacids, Extlib, DelAminoacids, AABackbone, Extend, AASCFinegrained
  XZYFRAGLIB "Mylib.xyz"
  STOREFRAGS true
end

*xyz 0 1
O    1.300780   3.093320   1.571260
C    0.407700   2.826000   0.770390
O   -0.101890   3.606620  -0.030470
C   -0.115350   1.396690   0.770390
N   -1.564350   1.396690   0.770390
H   -1.900780   1.872730   1.595200
H    0.242580   0.884560   1.663540
H   -1.901490   0.444630   0.770390
H   -1.901016   1.873069  -0.054118
C    0.436090   0.696200  -0.461750
H    1.512450   0.863150  -0.502770
H   -0.018950   1.150700  -1.341790
C    0.178410  -0.790680  -0.515340
C   -0.895760  -1.288470  -1.262580
C   -1.134680  -2.667040  -1.312270
C   -0.299410  -3.547820  -0.614730
C    0.774760  -3.050040   0.132500
C    1.013680  -1.671470   0.182200
H    1.842330  -1.287460   0.758630
H    1.419110  -3.729500   0.670600
H   -0.483720  -4.611290  -0.653070
H   -1.963330  -3.051050  -1.888700
H   -1.540110  -0.609010  -1.800680
C    1.181325  -3.732146  -2.620584
C    2.016592  -4.612923  -1.923046
C    3.090772  -4.115131  -1.175824
C    3.329684  -2.736563  -1.126139
C    2.494417  -1.855785  -1.823677
C    1.420237  -2.353577  -2.570900
H    0.770518  -1.668458  -3.113485
H    0.422252  -4.288369  -3.129823
H    1.830753  -5.685253  -1.961693
H    3.740491  -4.800250  -0.633239
H    4.165243  -2.349352  -0.544907
H    2.680256  -0.783455  -1.785030
*
```

Where "Mylib.xyz" in this case is:

```orca
2
NAME CH
C 0.00 0.00 0.00
H 1.08 0.00 0.00
```

The complete fragmentation sequence follows the same structure as in previous examples. In this case, the message `Deleted 1 fragments` is printed after the fragment assignment performed by `Ext_lib`, indicating that a fragment previously defined by `FragProc Aminoacids` has been successfully removed.
```orca
===================================================
Tfragmentator: Fragmenting by Amino_Acids
===================================================
------------------------------------------
Match: 1, Assigned Fragment: 1
------------------------------------------
Name: CPHE
Method: Amino_Acids
Natoms: 21
Charge: -1
Mult: 1
   0 O    1.300780   3.093320   1.571260
   1 C    0.407700   2.826000   0.770390
   2 O   -0.101890   3.606620  -0.030470
   3 C   -0.115350   1.396690   0.770390
   4 N   -1.564350   1.396690   0.770390
   5 H   -1.900780   1.872730   1.595200
   6 H    0.242580   0.884560   1.663540
   9 C    0.436090   0.696200  -0.461750
  10 H    1.512450   0.863150  -0.502770
  11 H   -0.018950   1.150700  -1.341790
  12 C    0.178410  -0.790680  -0.515340
  13 C   -0.895760  -1.288470  -1.262580
  14 C   -1.134680  -2.667040  -1.312270
  15 C   -0.299410  -3.547820  -0.614730
  16 C    0.774760  -3.050040   0.132500
  17 C    1.013680  -1.671470   0.182200
  18 H    1.842330  -1.287460   0.758630
  19 H    1.419110  -3.729500   0.670600
  20 H   -0.483720  -4.611290  -0.653070
  21 H   -1.963330  -3.051050  -1.888700
  22 H   -1.540110  -0.609010  -1.800680
------------------------------------------
===================================================
Tfragmentator: Fragmenting by Ext_lib
===================================================
****
**** There are 1 Ref structures found in file Mylib.xyz
****
------------------------------------------
Match: 1, Assigned Fragment: 2
------------------------------------------
Name: CH
Method: Ext_lib
Natoms: 2
Charge: 0
Mult: 1
  23 C    1.181325  -3.732146  -2.620584
  30 H    0.422252  -4.288369  -3.129823
------------------------------------------
------------------------------------------
Match: 2, Assigned Fragment: 3
------------------------------------------
Name: CH
Method: Ext_lib
Natoms: 2
Charge: 0
Mult: 1
  24 C    2.016592  -4.612923  -1.923046
  31 H    1.830753  -5.685253  -1.961693
------------------------------------------
...
------------------------------------------
Match: 6, Assigned Fragment: 7
------------------------------------------
Name: CH
Method: Ext_lib
Natoms: 2
Charge: 0
Mult: 1
  28 C    1.420237  -2.353577  -2.570900
  29 H    0.770518  -1.668458  -3.113485
------------------------------------------
===================================================
Tfragmentator: Deleting Fragments (Amino_Acids)
===================================================
Deleted 1 fragments
===================================================
Tfragmentator: Fragmenting by Backbone
===================================================
------------------------------------------
Match: 1, Assigned Fragment: 7
------------------------------------------
Name: CO-NH3-CH
Method: AA_Backbone
Natoms: 8
Charge: 0
Mult: 0
   0 O    1.300780   3.093320   1.571260
   1 C    0.407700   2.826000   0.770390
   3 C   -0.115350   1.396690   0.770390
   4 N   -1.564350   1.396690   0.770390
   5 H   -1.900780   1.872730   1.595200
   6 H    0.242580   0.884560   1.663540
   7 H   -1.901490   0.444630   0.770390
   8 H   -1.901016   1.873069  -0.054118
------------------------------------------
===================================================
Tfragmentator: Extending Fragments
===================================================
Extending Fragment 6 with O (2)
===================================================
Tfragmentator: Fragmenting by AA_SideChains_FG
===================================================
------------------------------------------
Match: 1, Assigned Fragment: 8
------------------------------------------
Name: Ph
Method: AA_SideChains_FG
Natoms: 11
Charge: 0
Mult: 1
  12 C    0.178410  -0.790680  -0.515340
  13 C   -0.895760  -1.288470  -1.262580
  14 C   -1.134680  -2.667040  -1.312270
  15 C   -0.299410  -3.547820  -0.614730
  16 C    0.774760  -3.050040   0.132500
  17 C    1.013680  -1.671470   0.182200
  18 H    1.842330  -1.287460   0.758630
  19 H    1.419110  -3.729500   0.670600
  20 H   -0.483720  -4.611290  -0.653070
  21 H   -1.963330  -3.051050  -1.888700
  22 H   -1.540110  -0.609010  -1.800680
------------------------------------------
------------------------------------------
Match: 1, Assigned Fragment: 9
------------------------------------------
Name: CH2
Method: AA_SideChains_FG
Natoms: 3
Charge: 0
Mult: 1
   9 C    0.436090   0.696200  -0.461750
  10 H    1.512450   0.863150  -0.502770
  11 H   -0.018950   1.150700  -1.341790
------------------------------------------
```

Table {numref}`tab:essentialelements.delprocedures` lists the corresponding delete procedure associated with each fragmentation method.

(tab:essentialelements.delprocedures)=
:::{table} Simple input keywords for Fragment detection and their corresponding deletion procedure

| Fragment detection Keyword | Fragment deletion Keyword  |
|:---------------------------|:---------------------------|
| `Extlib`                   | `DELExtlib`                |
| `Connectivity`             | `DELConnectivity`          |
| `Atomic`                   | `DELAtomic`                |
| `FunctionalGroups`         | `DELFunctionalGroups`      |
| `NotAssigned`              |                            |
| `Backbone`                 | `DELBackbone`              |
| `SeqBackbone`              | `DELSeqBackbone`           |
| `AABackbone`               | `DELAABackbone`            |
| `Aminoacids`               | `DELAminoacids`            |
| `AASideChains`             | `DELAASideChains`          |
| `AASCFineGrained`          | `DELAASCFineGrained`       |
| `NABackbone`               | `DELNABackbone`            |
| `NABBFineGrained`          | `DELNABBFineGrained`       |
| `SEQNABackbone`            | `DELSEQNABackbone`         |
| `NucleoticAcid`            | `DELNucleoticAcid`         |
| `NASideChains`             | `DELNASideChains`          |
| `Solvents`                 | `DELSolvents`              |
| `Water`                    | `DELWater`                 |
| `Extend`                   |                            |
| `FuseByAtoms`              |                            |
:::

(sec:essentialelements.fragmentation.inputfile.keywordlist)=
## Options available in the `%frag` input block

Table {numref}`tab:essentialelements.keywordlist` contains a list of the options available in the `%frag` input block.

:::{tabularcolumns} \Y{0.2}\Y{0.1}\Y{0.15}\Y{0.55}
:::

(tab:essentialelements.keywordlist)=
:::{list-table} List of options in the `%frag` input block
:header-rows: 1
:class: longtable

* - Option
  - Type
  - Default
  - Description
* - `Printlevel`
  - Integer
  - `1`
  - Verbose output control for automated fragmentation.
* - `STOREFRAGS`
  - Boolean
  - `False`
  - Stores assigned fragments in a `.fragments.xyz` file.
* - `DoInterFragBonds`
  - Boolean
  - `False`
  - Automatically detects bonds between fragments for CoVaLED analysis.
* - `XZYFRAGLIB`
  - String
  - `None`
  - Filenames used in `FragProc Extlib`.
* - `FragProc`
  - See {numref}`tab:essentialelements.fragmentationprocedures` and {numref}`tab:essentialelements.delprocedures`
  - `ExtLib, Connectivity`
  - Fragmentation procedures to be applied automatically.
* - `Usetopology`
  - Boolean
  - `False`
  - Generates the main geometry graph based on a .prms file.
* - `TopolFile`
  - String
  - `""`
  - Topology file name to be used when `Usetopology True`.
* - `PrintInputFlags`
  - Boolean
  - `True`
  - Writes a `%frag` block equivalent to the fragments of the current calculation.
:::
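
As an illustration of how several of these options might be combined, the following `%frag` block is a minimal sketch rather than a recommended setup. It uses only keywords listed in the tables of this section; the particular combination of `FragProc` procedures and the chosen values are assumptions made for the example.

```orca
%frag
  PrintLevel       3       # more verbose fragmentation output (default: 1)
  # assumption: an illustrative choice of detection procedures, applied in order
  FragProc         FunctionalGroups, Connectivity
  STOREFRAGS       true    # store the assigned fragments in a .fragments.xyz file
  DoInterFragBonds true    # detect bonds between fragments
  PrintInputFlags  true    # write a %frag block equivalent to the resulting fragments
end
```

Because `PrintInputFlags` writes a `%frag` block equivalent to the resulting fragments, that output could serve as a starting point for an explicit fragment definition in a later input.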