opi.input.structures

Modules that hold Python objects representing chemical structures (i.e. atom types and coordinates) and structure files supported by ORCA.

Classes

Atom

Class to model a single atom in a structure.

EmbeddingPotential

Class to model an embedding potential.

GhostAtom

Class to model ghost atom.

PointCharge

Class to model point charge.

Coordinates

Coordinates of an atom in Cartesian space.

Properties

Class to represent structure properties (e.g., total or relative energies).

Structure

Class to model internal structure for ORCA calculations.

BaseStructureFile

Class to model structure file.

GzmtFile

Class to model .gzmt structure file.

PdbFile

Class to model .pdb structure file.

XyzFile

Class to model .xyz structure file.

Package Contents

class opi.input.structures.Atom(element, *args, **kwargs)

Bases: _CoordLineWithElementBase

Class to model a single atom in a structure.

Parameters:
element

Specify what element the atom is

Type:

Element

Return type:

opi.utils.element.Element

coordinates

Define coordinates in Cartesian space

Type:

Coordinates

Return type:

opi.input.structures.coordinates.Coordinates

fragment_id

Define id of fragment

Type:

int

Return type:

int | None

nuclear_charge

Define nuclear charge of atom

Type:

float | int

Return type:

float | None

mass

Define mass of atom

Type:

float | int

Return type:

float | None

append_str

Append an arbitrary string to coordinate line if needed

Type:

str

_element: opi.utils.element.Element
property element: opi.utils.element.Element
Return type:

opi.utils.element.Element

class opi.input.structures.EmbeddingPotential(charge, *args, **kwargs)

Bases: _CoordLineWithSymbolAndChargeBase

Class to model an embedding potential.

Parameters:
  • charge (float | int)

  • args (Any)

  • kwargs (Any)

_fmt_element()
Return type:

str

class opi.input.structures.GhostAtom(element, *args, **kwargs)

Bases: Atom

Class to model ghost atom.

_fmt_element()
Return type:

str

class opi.input.structures.PointCharge(*args, **kwargs)

Bases: _CoordLineWithSymbolAndChargeBase

Class to model point charge.

Parameters:
  • args (Any)

  • kwargs (Any)

_element = 'Q'

class opi.input.structures.Coordinates(coordinates)

Coordinates of an atom in Cartesian space.

Can be initialized using a tuple, numpy arrays or another instance of Coordinates.

Parameters:

coordinates (Coordinates | tuple[int | float, int | float, int | float] | npt.NDArray[np.float64])

coordinates

Column vector with three rows.

Type:

npt.NDArray[np.float64]

Return type:

numpy.typing.NDArray[numpy.float64]

_coordinates: numpy.typing.NDArray[numpy.float64]
property coordinates: numpy.typing.NDArray[numpy.float64]
Return type:

numpy.typing.NDArray[numpy.float64]

to_list()

Returns the coordinates as a list.

Return type:

list[float]

property x: numpy.float64

x-coordinate

Return type:

numpy.float64

property y: numpy.float64

y-coordinate

Return type:

numpy.float64

property z: numpy.float64

z-coordinate

Return type:

numpy.float64

class opi.input.structures.Properties(structure_id=None, energy_total=None, energy_relative=None)

Class to represent structure properties (e.g., total or relative energies). Currently, properties can only be read from XYZ files created with GOAT or DOCKER.

Parameters:
  • structure_id (int | None)

  • energy_total (float | None)

  • energy_relative (float | None)

structure_id

Number of the structure within the XYZ file from which the properties were read.

Type:

int | None, default = None

energy_total

Energy of a structure.

Type:

float | None, default = None

energy_relative

Relative energy of a structure (relative to an arbitrary reference).

Type:

float | None, default = None

structure_id: int | None = None
energy_total: float | None = None
energy_relative: float | None = None
classmethod from_xyz(xyz_file, mode='goat')

Function for reading properties from the comment line of a single structure from a (multi-)XYZ file and returning a Properties object.

Parameters:
  • xyz_file (Path | str | PathLike[str]) -- Name or path to XYZ file.

  • mode (Literal["goat", "docker"], default = "goat") -- Defines how the comment line is processed, i.e., whether it comes from a DOCKER or a GOAT run.

Returns:

Properties object extracted from file or None if nothing could be extracted.

Return type:

Properties | None

Raises:
  • FileNotFoundError -- If the XYZ file cannot be found.

  • ValueError -- If there is a problem with parsing the XYZ file.

  • EOFError -- If the file is empty.

classmethod from_trj_xyz(trj_file, /, *, mode='goat', comment_symbols=None, n_struc_limit=None)

Function for reading multi-XYZ file and returning a list of Properties.

Parameters:
  • trj_file (Path | str | PathLike[str]) -- Name or path to XYZ file with one or multiple structure(s).

  • mode (Literal["goat", "docker"], default = "goat") -- Defines how the comment line is processed, i.e., whether it comes from a DOCKER or a GOAT run.

  • comment_symbols (str | Sequence[str] | None, default: None) -- List of symbols that indicate user comments in the XYZ data. User comments have to start with the given symbol, fill a whole line, and come before the actual XYZ data.

  • n_struc_limit (int | None, default: None) -- If >0, only read the first n structures.

Returns:

List of Properties objects extracted from the file.

Return type:

list[Properties]

Raises:
  • FileNotFoundError -- If the XYZ file cannot be found.

  • ValueError -- If there is a problem with parsing the XYZ file.

  • EOFError -- If the file is empty.

classmethod from_xyz_block(xyz_string, mode='goat')

Function for reading a single XYZ file from a string and returning a Properties object.

Parameters:
  • xyz_string (str) -- String that contains XYZ file data.

  • mode (Literal["goat", "docker"], default = "goat") -- Defines how the comment line is processed, i.e., whether it comes from a DOCKER or a GOAT run.

Returns:

The Properties object extracted from string.

Return type:

Properties

Raises:
  • ValueError -- If there is a problem with parsing the XYZ string.

  • EOFError -- If the string is empty.

classmethod from_trj_xyz_block(trj_string, /, *, mode='goat', comment_symbols=None, n_struc_limit=None)

Function for reading trajectory data from string and returning a list of Properties.

Parameters:
  • trj_string (Path | str | PathLike[str]) -- String that contains one or multiple XYZ blocks (trajectory data).

  • mode (Literal["goat", "docker"], default = "goat") -- Defines how the comment line is processed, i.e., whether it comes from a DOCKER or a GOAT run.

  • comment_symbols (str | Sequence[str] | None, default: None) -- List of symbols that indicate user comments in the XYZ data. User comments have to start with the given symbol, fill a whole line, and come before the actual XYZ data.

  • n_struc_limit (int | None, default: None) -- If >0, only read the first n structures.

Returns:

List of Properties extracted from the string.

Return type:

list[Properties]

Raises:
  • ValueError -- If there is a problem with parsing the XYZ data or if n_struc_limit is <= 0.

  • EOFError -- If the string is empty.

classmethod from_xyz_buffer(xyz_lines, /, *, comment_symbols=None, mode='goat')

Function for reading the comment line of an XYZ file from a buffer and converting it to a Properties object.

Parameters:
  • xyz_lines (TrackingTextIO) -- A buffer that contains XYZ file data.

  • comment_symbols (str | Sequence[str] | None, default: None) -- List of symbols that indicate user comments in the XYZ data. User comments have to start with the given symbol, fill a whole line, and come before the actual XYZ data.

  • mode (Literal["goat", "docker"], default = "goat") -- Defines how the comment line is processed, i.e., whether it comes from a DOCKER or a GOAT run.

Returns:

The Properties object extracted from the buffer.

Return type:

Properties

Raises:
  • ValueError -- When no valid properties can be read from the input buffer or the corresponding structure is incomplete.

  • EOFError -- When no data is in the buffer.

classmethod docker_energies(line)

Function for reading DOCKER energies from the comment line of a DOCKER XYZ file and returning them in a Properties object.

Parameters:

line (str)

Return type:

Properties

classmethod goat_energies(line)

Function for reading GOAT energies from the comment line of a GOAT XYZ file and returning them in a Properties object.

Parameters:

line (str)

Return type:

Properties

classmethod _iter_xyz_structures(tracked, comment_symbols, mode, n_struc_limit)

Yield properties from the buffer until exhausted or the limit is reached.

Parameters:
  • tracked (TrackingTextIO) -- A buffer that contains XYZ file data.

  • comment_symbols (str | Sequence[str] | None, default: None) -- List of symbols that indicate user comments in the XYZ data. User comments have to start with the given symbol, fill a whole line, and come before the actual XYZ data.

  • mode (Literal["goat", "docker"], default = "goat") -- Defines how the comment line is processed, i.e., whether it comes from a DOCKER or a GOAT run.

  • n_struc_limit (int | None, default: None) -- If >0, only read the first n structures.

Returns:

Iterator over the Properties objects extracted from the buffer.

Return type:

Iterator["Properties"]

Raises:

ValueError -- If n_struc_limit is negative or zero.

class opi.input.structures.Structure(atoms, charge=0, multiplicity=1, origin=None)

Class to model internal structure for ORCA calculations.

Parameters:
atoms

Atoms in the molecule

Type:

list[Atom | EmbeddingPotential | GhostAtom | PointCharge]

Return type:

list[opi.input.structures.atom.Atom | opi.input.structures.atom.EmbeddingPotential | opi.input.structures.atom.GhostAtom | opi.input.structures.atom.PointCharge]

charge

Charge of structure

Type:

int

Return type:

int

multiplicity

Multiplicity of structure

Type:

int

Return type:

int

origin

Origin of the molecule, usually path to a file or some identifier

Type:

Path | str

_atoms: list[opi.input.structures.atom.Atom | opi.input.structures.atom.EmbeddingPotential | opi.input.structures.atom.GhostAtom | opi.input.structures.atom.PointCharge] = []
property atoms: list[opi.input.structures.atom.Atom | opi.input.structures.atom.EmbeddingPotential | opi.input.structures.atom.GhostAtom | opi.input.structures.atom.PointCharge]
Return type:

list[opi.input.structures.atom.Atom | opi.input.structures.atom.EmbeddingPotential | opi.input.structures.atom.GhostAtom | opi.input.structures.atom.PointCharge]

_charge: int
property charge: int
Return type:

int

_multiplicity: int
property multiplicity: int
Return type:

int

origin: Any | None = None
property nelectrons: int

Returns the number of electrons based on the atomic numbers of the atoms in the structure and the overall molecular charge. Note that the returned number of electrons can be negative and should be checked!

Returns:

nelectrons -- Returns the number of electrons for the structure. Can be negative!

Return type:

int

property nelec_is_odd: bool

Returns a boolean indicating if the number of electrons is odd. Does not check for negative electrons.

Return type:

bool

property nelec_is_even: bool

Returns a boolean indicating if the number of electrons is even. Does not check for negative electrons.

Return type:

bool

property multiplicity_is_odd: bool

Returns a boolean indicating if the multiplicity is odd.

Return type:

bool

property multiplicity_is_even: bool

Returns a boolean indicating if the multiplicity is even.

Return type:

bool

property nelec_and_multiplicity_even: bool

Returns a boolean indicating if the number of electrons and the multiplicity are even.

Return type:

bool

property nelec_and_multiplicity_odd: bool

Returns a boolean indicating if the number of electrons and the multiplicity are odd.

Return type:

bool

property multiplicity_is_possible: bool

Returns a boolean indicating if the multiplicity can be realized with the number of electrons.

Return type:

bool

set_ls_multiplicity()

Sets multiplicity to the lowest possible multiplicity based on the number of electrons (multiplicity will either be set to 1 or 2).

Return type:

None
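
The electron-count and multiplicity checks above can be illustrated with a small standalone sketch. It does not import OPI: the helper names (`count_electrons`, `multiplicity_is_possible`, `lowest_multiplicity`) and the tiny atomic-number table are assumptions made for illustration, and the feasibility rule shown is the textbook spin rule, which may differ in detail from the check OPI performs.

```python
# Standalone sketch; none of these helpers are part of OPI.
ATOMIC_NUMBERS = {"H": 1, "C": 6, "N": 7, "O": 8}  # illustrative subset


def count_electrons(symbols: list, charge: int) -> int:
    """Sum of atomic numbers minus the total charge (can be negative)."""
    return sum(ATOMIC_NUMBERS[s] for s in symbols) - charge


def multiplicity_is_possible(nelectrons: int, multiplicity: int) -> bool:
    """Textbook rule: multiplicity 2S+1 needs 2S unpaired electrons."""
    if nelectrons < 0 or multiplicity < 1:
        return False
    unpaired = multiplicity - 1
    # Unpaired electrons cannot exceed the total, and the rest must pair up.
    return unpaired <= nelectrons and (nelectrons - unpaired) % 2 == 0


def lowest_multiplicity(nelectrons: int) -> int:
    """Singlet for an even electron count, doublet for an odd one."""
    return 1 if nelectrons % 2 == 0 else 2


water = count_electrons(["O", "H", "H"], charge=0)
hydroxyl = count_electrons(["O", "H"], charge=0)
overcharged = count_electrons(["H"], charge=2)  # negative electron count
```

The last line shows why a negative electron count has to be checked explicitly: the parity tests alone would still return a value for it.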

classmethod combine_molecules(structure1, structure2)

Function to combine two Structure objects.

Parameters:
  • structure1 (Structure) -- Define first structure to be combined

  • structure2 (Structure) -- Define second structure to be combined

Returns:

Combined structure

Return type:

Structure

format_orca()

Returns the string representation of the Structure. Iteratively calls Atom.format_orca() and compiles the results into one string.

Returns:

String representation of the Structure

Return type:

str

add_atom(new_atom, position=None)

Adds an Atom object at the specified position. If position is None, the Atom object is appended to the end of the list.

Parameters:
  • new_atom (Atom) -- Atom object to be added to self.atoms

  • position (int | None, default = None) -- Position at which the Atom is to be added

Raises:

ValueError -- If position is an invalid value

Return type:

None
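
The position handling described for add_atom can be sketched on a plain Python list. The helper `insert_at` is hypothetical, and the rule that valid positions run from 0 to the current length is an assumption; the reference above only states that an invalid value raises ValueError.

```python
from typing import Optional


def insert_at(items: list, new_item, position: Optional[int] = None) -> None:
    """Insert in place; append when position is None (hypothetical helper)."""
    if position is None:
        items.append(new_item)
        return
    # Assumption: valid positions are 0..len(items); anything else is rejected.
    if not 0 <= position <= len(items):
        raise ValueError(f"invalid position {position} for {len(items)} items")
    items.insert(position, new_item)


atoms = ["O", "H"]
insert_at(atoms, "H")        # appended at the end
insert_at(atoms, "C", 0)     # inserted at the front
```

An explicit range check is used because `list.insert` silently clamps out-of-range indexes instead of raising.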

delete_atom(index)

Deletes the Atom at the specified index.

Parameters:

index (int) -- Index of the Atom to be deleted

Raises:

ValueError -- If index is an invalid value

Return type:

None

replace_atom(new_atom, index)

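Replaces the Atom at index with a new Atom object.

Parameters:
  • new_atom -- Atom object that replaces the existing one

  • index -- Index of the Atom to be replaced

Raises:

ValueError -- If index is an invalid value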

Return type:

None

extract_substructure(indexes)

Returns a new Structure object that is the substructure specified by indexes.

Parameters:

indexes (list[int]) -- Indexes of the Atom objects to be extracted

Returns:

New Structure object

Return type:

Structure

update_coordinates(array)

First validates the dimensions of array, then replaces the coordinates of all atoms in the Structure. Calls Atom.update_coordinates() iteratively, replacing each Atom.coordinates with the corresponding row of array.

Parameters:

array (npt.NDArray[np.float64]) -- new coordinates

Raises:

ValueError -- in the case of wrong dimensions

Return type:

None

to_xyz_block()

Function to generate XYZ block

Return type:

str
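
For orientation, the conventional XYZ layout is an atom count, a comment line, and one line per atom with the element symbol and Cartesian coordinates. The sketch below renders that layout in plain Python. The function `xyz_block` and its column widths are assumptions for illustration; this reference does not state whether to_xyz_block() emits the two header lines or which number format it uses.

```python
def xyz_block(atoms: list, comment: str = "") -> str:
    """Render (symbol, x, y, z) tuples in the conventional XYZ layout."""
    lines = [str(len(atoms)), comment]  # header: atom count, then comment
    for symbol, x, y, z in atoms:
        lines.append(f"{symbol:<2} {x:14.8f} {y:14.8f} {z:14.8f}")
    return "\n".join(lines) + "\n"


block = xyz_block(
    [("O", 0.0, 0.0, 0.0), ("H", 0.0, 0.0, 0.96)],
    comment="illustrative fragment",
)
```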

classmethod from_xyz(xyzfile, /, *, charge=0, multiplicity=1)

Function for reading an XYZ file and converting it to a molecular Structure.

Parameters:
  • xyzfile (Path | str | PathLike[str]) -- Name or path to xyz file

  • charge (int, default: 0) -- Charge of the molecule

  • multiplicity (int, default: 1) -- Electron spin multiplicity of the molecule

Raises:
  • FileNotFoundError -- If the XYZ file cannot be found

  • ValueError -- If there is a problem with parsing the XYZ file

Returns:

Structure object extracted from the file

Return type:

Structure

classmethod from_trj_xyz(trj_file, /, *, charge=0, multiplicity=1, comment_symbols=None, n_struc_limit=None)

Function for reading an XYZ trajectory file and converting it to a list of molecular Structure objects.

Parameters:
  • trj_file (Path | str | PathLike[str]) -- Name or path to xyz file with one or multiple structure(s)

  • charge (int, default: 0) -- Charge of the molecule

  • multiplicity (int, default: 1) -- Electron spin multiplicity of the molecule

  • comment_symbols (str | Sequence[str] | None, default: None) -- List of symbols that indicate user comments in the xyz file. User comments are skipped before the actual xyz data starts. By default, no user comments are used. White-space only comments are not allowed and are silently ignored.

  • n_struc_limit (int | None, default: None) -- If >0, only read the first n structures.

Raises:
  • FileNotFoundError -- If the XYZ file cannot be found.

  • ValueError -- If there is a problem with parsing the XYZ file.

  • EOFError -- If the file is empty.

Returns:

Molecular structure objects extracted from the xyz file.

Return type:

list[Structure]

classmethod from_xyz_block(xyz_string, /, *, charge=0, multiplicity=1)

Function for reading an XYZ file from a string and converting it to a molecular Structure.

Parameters:
  • xyz_string (str) -- String that contains xyz file data

  • charge (int, default: 0) -- Charge of the molecule

  • multiplicity (int, default: 1) -- Electron spin multiplicity of the molecule

Raises:

ValueError -- If there is a problem with parsing the XYZ string

Returns:

The Structure object extracted from the string

Return type:

Structure

classmethod from_trj_xyz_block(trj_string, /, *, charge=0, multiplicity=1, comment_symbols=None, n_struc_limit=None)

Function for reading XYZ trajectory data from a string and converting it to a list of molecular Structure objects.

Parameters:
  • trj_string (Path | str | PathLike[str]) -- String that contains multiple xyz blocks (trajectory data)

  • charge (int, default: 0) -- Charge of the molecule

  • multiplicity (int, default: 1) -- Electron spin multiplicity of the molecule

  • comment_symbols (str | Sequence[str] | None, default: None) -- List of symbols that indicate user comments in the xyz file. User comments are skipped before the actual xyz data starts. By default, no user comments are used. White-space only comments are not allowed and are silently ignored.

  • n_struc_limit (int | None, default: None) -- If >0, only read the first n structures.

Returns:

Structure objects extracted from the string

Return type:

list[Structure]

Raises:

EOFError -- If the string is empty
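
How trajectory data, user comments, and n_struc_limit interact can be sketched with a standalone reader for concatenated XYZ blocks. The generator `iter_xyz_blocks` is a hypothetical helper written for this illustration and is not OPI's parser; in particular, it skips whole-line user comments before every block, which is an assumption about the intended behaviour.

```python
from collections.abc import Iterator, Sequence
from typing import Optional


def iter_xyz_blocks(
    text: str,
    comment_symbols: Sequence[str] = (),
    n_struc_limit: Optional[int] = None,
) -> Iterator[tuple]:
    """Yield (comment, atoms) per XYZ block; atoms are (symbol, x, y, z)."""
    if n_struc_limit is not None and n_struc_limit <= 0:
        raise ValueError("n_struc_limit must be positive")
    lines = text.splitlines()
    if not any(line.strip() for line in lines):
        raise EOFError("no XYZ data found")
    pos, count = 0, 0
    while pos < len(lines):
        if n_struc_limit is not None and count >= n_struc_limit:
            return
        head = lines[pos].strip()
        # Skip blank lines and whole-line user comments before a block.
        if not head or any(head.startswith(sym) for sym in comment_symbols):
            pos += 1
            continue
        n_atoms = int(head)  # raises ValueError on a malformed count line
        body = lines[pos + 1 : pos + 2 + n_atoms]
        if len(body) < n_atoms + 1:
            raise ValueError("incomplete XYZ block")
        atoms = []
        for row in body[1:]:
            symbol, x, y, z = row.split()[:4]
            atoms.append((symbol, float(x), float(y), float(z)))
        yield body[0], atoms
        pos += n_atoms + 2
        count += 1


trajectory = """# written by a hypothetical workflow
2
first frame
H 0.0 0.0 0.0
H 0.0 0.0 0.74
2
second frame
H 0.0 0.0 0.0
H 0.0 0.0 0.80
"""
frames = list(iter_xyz_blocks(trajectory, comment_symbols=["#"]))
first_only = list(
    iter_xyz_blocks(trajectory, comment_symbols=["#"], n_struc_limit=1)
)
```

Because the sketch is a generator, the n_struc_limit check only runs once iteration starts.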

classmethod from_xyz_buffer(xyz_lines, *, charge=0, multiplicity=1, comment_symbols=None)

Function for reading a xyz file from a buffer and converting it to a molecular Structure.

Parameters:
  • xyz_lines (TrackingTextIO) -- A buffer that contains xyz file data

  • charge (int, default: 0) -- Molecular charge of the structure.

  • multiplicity (int, default: 1) -- Electron spin multiplicity of the structure.

  • comment_symbols (str | Sequence[str] | None, default: None) -- List of symbols that indicate user comments in the xyz file. User comments are skipped before the actual xyz data starts. By default, no user comments are used. White-space only comments are not allowed and are silently ignored.

Returns:

The Structure object extracted from the buffer.

Return type:

Structure

Raises:
  • ValueError -- When no valid structure can be read from the input buffer.

  • EOFError -- When no data is in the buffer.

classmethod from_smiles(smiles, /, *, charge=None, multiplicity=None)

Function to read a SMILES string, convert it to a 3D structure, and return the result as a Structure object.

Parameters:
  • smiles (str) -- SMILES string to be converted

  • charge (int | None) -- Charge of the molecule, will overwrite charge obtained from SMILES string

  • multiplicity (int | None) -- Electron spin multiplicity of the molecule, will overwrite multiplicity obtained from SMILES (which is always 1 by default)

Returns:

Structure object generated from the SMILES string

Return type:

Structure

Raises:

RuntimeError -- If EmbedMolecule() is unsuccessful

classmethod from_rdkitmol(mol, /, *, charge=None, multiplicity=None)

Function to convert a RDKit Mol object to Structure object

Parameters:
  • mol (RdkitMol) -- RDKit Mol object to be converted

  • charge (int | None) -- Charge of the molecule, will overwrite charge obtained from RDKit Mol

  • multiplicity (int | None) -- Electron spin multiplicity of the molecule, will overwrite multiplicity obtained from RDKit Mol

Returns:

Structure object created from information given by RDKit Mol object

Return type:

Structure

to_rdkitmol(structure, /)

Function to convert a Structure object to an RDKit Mol object. The Structure is converted into XYZ file format, which is then read by RDKit.

Parameters:

structure (Structure) -- Structure object to be converted

Returns:

RDKit Mol object generated from the Structure object

Return type:

RdkitMol

__len__()
Return type:

int

classmethod from_ase(ase_atoms, *, charge=None, multiplicity=None)

Function to generate a Structure from an Atoms object of the Atomic Simulation Environment (ASE). Since ORCA and OPI do not support structures with periodic boundary conditions, these are ignored.

Parameters:
  • ase_atoms (AseAtoms) -- The Atoms object from ASE

  • charge (int | None, default: None) -- Optional charge of the molecule, will overwrite charge from ase.

  • multiplicity (int | None, default: None) -- Optional multiplicity of the molecule, will overwrite multiplicity from ase.

Returns:

The Structure object generated from AseAtoms object

Return type:

Structure

Raises:

ValueError -- If the ASE object does not include a usable structure.

classmethod from_lists(symbols, coordinates, charge=0, multiplicity=1)

Function for generating a Structure object from lists of symbols and positions. Both lists must have the same length and match the expected types.

Parameters:
  • symbols (list[str | int]) -- List of symbols for elements, either as string or as atomic number

  • coordinates (list[tuple[float, float, float]]) -- List of tuples containing coordinates

  • charge (int, default: 0) -- Optional charge for the structure

  • multiplicity (int, default: 1) -- Optional multiplicity for the structure

Returns:

The Structure object initialized from given lists.

Return type:

Structure

classmethod _iter_xyz_structures(tracked, charge=0, multiplicity=1, comment_symbols=None, n_struc_limit=None)

Yield structures from the buffer until exhausted or the limit is reached.

Parameters:
  • tracked (TrackingTextIO) -- A buffer that contains XYZ file data.

  • charge (int, default: 0) -- Optional charge for the structure

  • multiplicity (int, default: 1) -- Optional multiplicity for the structure

  • comment_symbols (str | Sequence[str] | None, default: None) -- List of symbols that indicate user comments in the XYZ data. User comments have to start with the given symbol, fill a whole line, and come before the actual XYZ data.

  • n_struc_limit (int | None, default: None) -- If >0, only read the first n structures.

Returns:

Iterator over the Structure objects extracted from the buffer.

Return type:

Iterator["Structure"]

Raises:

ValueError -- If n_struc_limit is negative or zero.

class opi.input.structures.BaseStructureFile(file, /, *, charge=0, multiplicity=1, strict=True)

Bases: abc.ABC

Class to model structure file. The structure file is directly passed to ORCA. This interface does not read or modify the contents of the file.

Parameters:
  • file (pathlib.Path | str | os.PathLike[str])

  • charge (int)

  • multiplicity (int)

  • strict (bool)

_type

Type of the structure file. Used as prefix in the ORCA input: *<_type>file

Type:

str

_type: str
_strict = True
_file: pathlib.Path
property file: pathlib.Path
Return type:

pathlib.Path

charge = 0
multiplicity = 1
format_orca(working_dir, /, *, original_path=False)

Format the respective line in the ORCA input. If the file lies within the working directory, a relative path is used. If no working directory is set, only the filename is used.

Parameters:
  • working_dir (Path | None) -- Path to the working directory.

  • original_path (bool, default False) -- If True, the original path will not be altered.

Raises:

ValueError -- If relative path is requested and cannot be resolved.

Return type:

str
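
The path rules described for format_orca can be sketched with pathlib alone. The function `format_structure_line` and the exact layout of the returned line are assumptions for illustration; only the `*<_type>file` prefix and the three path cases come from the description above.

```python
from pathlib import Path
from typing import Optional


def format_structure_line(
    file: Path,
    file_type: str,
    charge: int,
    multiplicity: int,
    working_dir: Optional[Path],
    original_path: bool = False,
) -> str:
    """Build a '*<type>file' geometry line (illustrative, not OPI code)."""
    if original_path:
        shown = file  # keep the path exactly as given
    elif working_dir is None:
        shown = Path(file.name)  # no working directory: filename only
    else:
        # Path.relative_to raises ValueError if file is outside working_dir.
        shown = file.relative_to(working_dir)
    return f"*{file_type}file {charge} {multiplicity} {shown.as_posix()}"


geometry = Path("/work/job/geom/mol.xyz")
relative = format_structure_line(geometry, "xyz", 0, 1, Path("/work/job"))
name_only = format_structure_line(geometry, "xyz", 0, 1, None)
unchanged = format_structure_line(geometry, "xyz", 0, 1, None, original_path=True)
```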

copy_to(dest, /)

Copy the structure file to the destination.

Parameters:

dest (Path) -- Destination to which self.file is copied. dest can point to a file or a folder. For details, see the documentation of shutil.copy().

Raises:

OSError -- If structure file cannot be copied.

Returns:

True if the file was copied, False otherwise.

Return type:

bool
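
Since copy_to refers to shutil.copy(), the file-or-folder behaviour of dest can be tried out with the standard library. The helper `copy_structure_file` is a hypothetical stand-in, and the condition under which it returns False (source and destination are the same file) is an assumption; the reference above does not say when copy_to returns False.

```python
import shutil
import tempfile
from pathlib import Path


def copy_structure_file(src: Path, dest: Path) -> bool:
    """Copy src to dest (file or folder); hypothetical stand-in for copy_to."""
    try:
        shutil.copy(src, dest)
    except shutil.SameFileError:
        # Assumption: copying a file onto itself counts as "not copied".
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "mol.xyz"
    source.write_text("1\n\nHe 0.0 0.0 0.0\n")
    target_dir = Path(tmp) / "job"
    target_dir.mkdir()
    copied = copy_structure_file(source, target_dir)  # folder destination
    same = copy_structure_file(source, source)        # same file
    target_exists = (target_dir / "mol.xyz").is_file()
```

When dest is a folder, shutil.copy() keeps the source filename inside it, which is why the copy appears as job/mol.xyz.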

class opi.input.structures.GzmtFile(file, /, *, charge=0, multiplicity=1, strict=True)

Bases: BaseStructureFile

Class to model .gzmt structure file.

Parameters:
  • file (pathlib.Path | str | os.PathLike[str])

  • charge (int)

  • multiplicity (int)

  • strict (bool)

_type = 'gzmt'

class opi.input.structures.PdbFile(file, /, *, charge=0, multiplicity=1, strict=True)

Bases: BaseStructureFile

Class to model .pdb structure file.

Parameters:
  • file (pathlib.Path | str | os.PathLike[str])

  • charge (int)

  • multiplicity (int)

  • strict (bool)

_type = 'pdb'

class opi.input.structures.XyzFile(file, /, *, charge=0, multiplicity=1, strict=True)

Bases: BaseStructureFile

Class to model .xyz structure file.

Parameters:
  • file (pathlib.Path | str | os.PathLike[str])

  • charge (int)

  • multiplicity (int)

  • strict (bool)

_type = 'xyz'