Input File Documentation

This page contains a guide to CREST input files that can be used with program versions >3.0.

CREST 3.0

Specifying CREST program instructions through the various command line arguments can become quite lengthy and tedious. Therefore, input files are available following version 3.0 of CREST. Currently, the input files are based on the TOML format and are parsed using TOML-F.

CREST input files can be loaded with the --input argument

crest struc.xyz --input input.toml

or simply be given as the first argument (the file extension .toml is mandatory)

crest input.toml

where the input.toml would look something like this

# CREST 3 input file
input = "struc.xyz"
runtype = "ancopt"
threads = 9

[calculation]
elog = "energies.log"

[[calculation.level]]
method = "gfn2"
uhf = 0
chrg = 0

As can be seen from this example, the file is hierarchically structured. At the top level, things like the input coord file name, runtype, and parallelization are specified. The calculation group (defined by [ ]) includes some settings about the internal calculation settings and printouts, while its level subgroup (defined by [[ ]]) provides the actual method and calculation information.

More input file examples can be found here:

Go to Example Input Files

The blocks and keywords are documented in the following sections.

Note: Command line arguments that can be found in the Keyword Documentation will override the settings read from CREST input files.

Hierarchical structure of CREST input files

General settings
[calculation] block
1. [[calculation.level]] sub-blocks
2. [[calculation.constraints]] sub-blocks
[dynamics] block
1. [[dynamics.meta]] sub-blocks
[cregen] block

Important: The following lists are not exhaustive and will be expanded over time.

General settings

These settings are not part of any block and can be specified at the beginning of an input file.

Key	Values / Description
`input`	Specify the atomic input coordinate file as a string
`input_ensemble`	Specify an ensemble input file as a string
`threads`	Specify the number of CPU threads to be used as an integer
`bin`, `binary`	Specify an `xtb` binary as a string. Used for legacy runtypes of CREST. This is equivalent to the `--xnam` command line argument. For new integrations use the `binary` option within the `[[calculation.level]]` block.
`runtype`	Select the CREST runtype, specified as a string. The possible values are: `none` - do nothing; `singlepoint`, `sp` - perform a single calculation for the input structure; `ancopt`, `optimize` - optimize the input structure; `numhess` - numerical calculation of second derivatives; `ancopt_ensemble`, `optimize_ensemble`, `mdopt` - optimize the input ensemble, similar to the `--mdopt` function; `screen_ensemble`, `screen` - optimize the input ensemble in a multistep procedure and sort, similar to the `--screen` function; `md`, `mtd`, `dynamics`, `metadynamics` - perform a (meta)dynamics simulation; `imtd-gc` - standard conformational sampling algorithm based on metadynamics; `nci-mtd`, `nci` - perform sampling with a wall potential (NCI_MTD workflow); `entropy`, `imtd-smtd` - perform extensive sampling targeting configurational entropy.
`constraints`	Specify a file with the `xtb`-style structure constraints to be included in the calculation.
`preopt`	Activate/Deactivate pre-optimization. Specify as boolean (`true`/`false`)
`topo`	Activate/Deactivate topology checks. Specify as boolean (`true`/`false`)
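
As a minimal sketch of how the general settings combine, the following file header sets up an ensemble optimization. The file names `struc.xyz` and `ensemble.xyz` are hypothetical, the thread count is a placeholder, and it is an assumption that this runtype is given both `input` and `input_ensemble`.

```toml
# General settings must come before the first [block] header
input = "struc.xyz"               # hypothetical reference structure file
input_ensemble = "ensemble.xyz"   # hypothetical ensemble file
runtype = "ancopt_ensemble"       # optimize every structure of the ensemble
threads = 4                       # placeholder; adjust to the machine
topo = true                       # keep topology checks switched on

[[calculation.level]]
method = "gfn2"
```

Because TOML requires top-level keys to appear before any table header, keeping the general settings at the very top of the file avoids them being read as part of a block.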

`[calculation]` block

The [calculation] block contains information on how to get energies and gradients for all other interfaces, i.e., specification on which programs to run and how to process the input/output data from a given list of [[calculation.level]] objects (see below). This block also contains settings for optimizations.

Key	Values / Description
`type`	Instruction on how to process energies and gradients. Can be specified as a string or an integer. Possible values are: any integer > 0 - select the respective `[[calculation.level]]` block (see below) to be used if multiple have been defined (by default the first one is taken); `mecp` - take the first two specified levels and average energy and gradients.
`elog`	Specify a file as a string to which energies are logged, e.g., in each optimization step.
`eprint`	Activate/Deactivate the energy printout via `elog`. Specify as boolean (`true`/`false`)
`opt_engine`	Select the geometry optimization algorithm as a string: `ancopt` - use the ANCOPT algorithm (RFO with internal coordinates); `rfo` - use a rational function algorithm (Cartesian coordinates); `gd` - use a simple gradient descent algorithm (Cartesian coordinates).
`hess_update`	Select the Hessian update method for ANCOPT as a string. Note that for regular optimizations with ANCOPT only BFGS works well. `bfgs` - use the default BFGS update; `powell` - use the Powell update method; `sr1` - use the symmetric rank one (SR1) update method; `bofill` - use the Bofill type update; `schlegel` - use the Farkas-Schlegel type update.
`maxcycle`	Specify the maximum number of optimization cycles as an integer
`optlev`	Specify default settings/convergence conditions in geometry optimization (see Tab. IV of https://doi.org/10.1063/5.0197592). Pre-defined levels are `crude`, `vloose`, `loose`, `normal`, `tight`, `vtight`, `extreme` and must be provided as a string
`converge_e`	Specify the energy convergence criterion for geometry optimization as a real in Hartree
`converge_g`	Specify the gradient norm/RMS force convergence criterion for geometry optimization as a real in Hartree/Bohr
`freeze`	Provide a list of atoms which shall be entirely frozen in geometry optimization and MD simulations. The atom list should be given as a string in the atom list format
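
The optimization-related keys above might be combined as in the following sketch. The keys and string values are taken from the table; the numerical value of `maxcycle` and the file names are placeholders, not recommendations.

```toml
input = "struc.xyz"        # hypothetical structure file
runtype = "ancopt"

[calculation]
opt_engine = "ancopt"      # RFO with internal coordinates
hess_update = "bfgs"       # the update noted to work well for regular optimizations
optlev = "tight"           # one of the pre-defined convergence levels
maxcycle = 200             # placeholder cycle limit
elog = "energies.log"      # energies are logged to this file
eprint = true

[[calculation.level]]
method = "gfn2"
chrg = 0
uhf = 0
```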

`[[calculation.level]]` sub-blocks

The [[calculation.level]] sub-blocks contain actual information about employed levels of theory, the used programs, and system specific data such as the molecular charge or number of α and β electrons.

Key	Values / Description
`method`	Specify the method or type of theory to be used in this calculation as a string. This will instruct CREST on the format of energies and gradients that shall be read. Possible values are: `tblite` - select `tblite` as calculation backend, should be used in combination with the `tblite_level` argument; `gfn2` - quick selection of GFN2-xTB via `tblite`; `gfn1` - quick selection of GFN1-xTB via `tblite`; `gfnff` - select GFN-FF via the gfnff-submodule project; `gfn0` - select GFN0-xTB via the gfn0-submodule project; `gfn0*` - select a special GFN0-xTB calculator used for MECP calculations (see https://doi.org/10.1021/acs.jpclett.3c00494); `xtb`, `gfn`, `gfn-xtb` - select GFNn-xTB method calculations performed via the `xtb` program, should be used in combination with the `binary` option within this block (this setting is not generally recommended because it will be much slower than the `tblite` backend); `orca` - ORCA subprocesses, requires the arguments `orca_cmd` and `orca_template` in addition to this argument; `generic` - call a generic script. The script should process the coordinates that CREST writes into a file `genericinp.xyz`, and you must know how to obtain the gradient (see options `gradtype` and `gradfile` below).
`bin`, `binary`	Select the program/binary/script name to be executed by CREST in order to generate energies and gradients. Can be a full path. Specify as a string. Should not be confused with the `bin` command in the main block, nor with the `--xnam` functionality via the command line settings. If addressing `xtb` via this option, include all command line arguments (such as `-alpb`) in this string, as you would when calling the binary on its own.
`dir`, `calcspace`	Specify the directory in which CREST shall perform this calculation as a string. Note that this can be a relative OR absolute path to the directory.
`chrg`, `charge`	Specify the molecular charge as an integer.
`uhf`	Specify multiplicity information as an integer. For `xtb` calculations this number must be Δn = N_α - N_β electrons.
`rdwbo`	Activate/Deactivate reading of bond orders for each singlepoint at the chosen level. Specify as boolean (`true`/`false`)
`rddip`	Activate/Deactivate reading of molecular dipole moments for each singlepoint at the chosen level. Specify as boolean (`true`/`false`)
`dipgrad`	Activate/Deactivate reading of the Cartesian gradient of the molecular dipole moments for each singlepoint at the chosen level. Specify as boolean (`true`/`false`)
`gradfile`	Name the file from which each singlepoint in the `generic` method interface obtains the energy and gradient information. Specify as string
`gradtype`	Name the gradient file format for each singlepoint in the `generic` method interface. Specify as string. Available options are: `engrad` - the .engrad format used by e.g. xtb and ORCA.
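
To illustrate how several levels interact with the `type` key of the `[calculation]` block, here is a minimal sketch that defines two levels and selects the second one. The charge and `uhf` values are placeholders for a neutral closed-shell system, and the file name is hypothetical.

```toml
input = "struc.xyz"   # hypothetical structure file
runtype = "sp"

[calculation]
type = 2              # use the second [[calculation.level]] block below

[[calculation.level]] # level 1: defined but not selected here
method = "gfnff"
chrg = 0
uhf = 0

[[calculation.level]] # level 2: selected through type = 2
method = "gfn2"
chrg = 0
uhf = 0
rdwbo = true          # additionally read bond orders at this level
```

Each `[[calculation.level]]` header appends one entry to a TOML array of tables, so the order in which the sub-blocks appear in the file determines their index.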

`[[calculation.constraints]]` sub-blocks

The [[calculation.constraints]] sub-blocks are used to introduce constraints. Constraints are calculated by CREST and added to the energies and gradients.

Key	Values / Description
`bond`, `bonds`	Introduce automatic bond constraints either as a string keyword, or with a mixed-type list. Available values are: `all` - put a constraint on all (automatically identified) bonds
`sphere`	Define a spherical wall potential around the system. The argument is a list of reals of the format `[ a, b, c]`, where a is the potential prefactor, b is the exponent, and c is the radius (in atomic units, i.e., Bohr).
`sphere_logfermi`	Define a spherical logfermi-type wall potential around the system. The argument is a list of reals of the format `[ a, b, c]`, where a is the logfermi temperature in K, b is the exponent factor, and c is the sphere radius (in atomic units, i.e., Bohr).
`gapfiff`	Introduce a simple constraint to the gap between two potentials (`[[calculation.level]]` objects) in the MECP mode. The argument is a list of reals of the format `[ σ, α]`, where σ is a potential prefactor and α is a confinement parameter.
`mecp`, `gapfiff2`	Introduce a modified constraint to the gap between two potentials (`[[calculation.level]]` objects) in the MECP mode. The argument is a list of reals of the format `[ σ, α, c]`, where σ is a potential prefactor, α is a confinement parameter, and c is a shift in the exponential scaling function.
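
A constraint sub-block could look like the following sketch, which adds a logfermi-type spherical wall. The three numbers follow the documented order (temperature in K, exponent factor, radius in Bohr) but are placeholders chosen only for illustration.

```toml
input = "struc.xyz"   # hypothetical structure file
runtype = "ancopt"

[[calculation.level]]
method = "gfn2"

[[calculation.constraints]]
# [ logfermi temperature / K, exponent factor, sphere radius / Bohr ]
sphere_logfermi = [300.0, 6.0, 30.0]   # placeholder values
```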

`[dynamics]` block

The [dynamics] block is used to define basic settings for CREST's standalone molecular dynamics and metadynamics module. Note that a [calculation] block must have been defined.

Key	Values / Description
`length`	Set the simulation length in ps. The argument is specified as a real.
`tstep`	Set the time step in fs. The argument is specified as a real.
`dump`	Set the trajectory snapshot dump frequency in fs. The argument is specified as a real.
`hmass`	Set the hydrogen mass in amu. The argument is specified as a real. Increasing the hydrogen mass helps the simulation run more stably.
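
A plain MD run might then be set up as in this sketch. All numerical values are placeholders that only demonstrate the documented units, and the file name is hypothetical.

```toml
input = "struc.xyz"   # hypothetical structure file
runtype = "md"

[[calculation.level]]
method = "gfnff"

[dynamics]
length = 20.0   # simulation length in ps
tstep = 1.0     # time step in fs
dump = 100.0    # trajectory snapshot frequency in fs
hmass = 2.0     # hydrogen mass in amu
```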

`[[dynamics.meta]]` sub-blocks

The [[dynamics.meta]] sub-block is used to define metadynamics parameters for a MD simulation in CREST. Multiple metadynamics potentials can be defined (as separate [[dynamics.meta]] sub-blocks) and added to the same MD ([dynamics] block).

Key	Values / Description
`type`	Set the metadynamics type with regard to the employed collective variable. Specify the argument as a string. Available types are: `rmsd` - use the Cartesian RMSD between the snapshot and a reference structure list as collective variables.
`alpha`	Set the exponent of the Gaussian metadynamics potential. Specify as a real.
`kpush`	Set the Gaussian metadynamics potential prefactor in E_h. Specify as a real.
`dump`,`dump_fs`, `dump_ps`	Specify the reference structure dump frequency for RMSD-based metadynamics in fs (or ps for `dump_ps`) as a real.
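
As a sketch of how a metadynamics potential is attached to an MD run, the following input adds one RMSD-based bias to a `[dynamics]` block. The values of `alpha`, `kpush`, and the dump frequencies are placeholders and not recommended parameters.

```toml
input = "struc.xyz"   # hypothetical structure file
runtype = "mtd"

[[calculation.level]]
method = "gfn2"

[dynamics]
length = 20.0   # ps
tstep = 1.0     # fs
dump = 100.0    # fs

[[dynamics.meta]]
type = "rmsd"   # Cartesian RMSD collective variable
alpha = 1.0     # placeholder Gaussian exponent
kpush = 0.02    # placeholder prefactor in E_h
dump_ps = 1.0   # reference structure dump frequency in ps
```

Further potentials would be added by repeating the `[[dynamics.meta]]` header with its own set of keys.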

`[cregen]` block

The [cregen] block is used for defining global options related to the ensemble sorting procedures. For more information on the CREGEN procedure see our recent publication in J Chem Phys.

Key	Values / Description
`ewin`	Set the total energy window for CREGEN relative to the lowest energy of structures in the ensemble. Specify as real in kcal/mol. Default: 6.0
`rthr`	Set the Cartesian RMSD threshold for distinguishing two conformers. Specify as real in Angstroem. Default: 0.125
`ethr`	Set an energy threshold for distinguishing two conformers. Specify as real in kcal/mol. Default: 0.05
`bthr`	Set a rotational constant threshold for distinguishing two conformers. Specify as real in MHz. Default: 0.01
`eqv`, `nmr`	Try to determine nuclear equivalencies from the ensemble, e.g. for NMR applications. Specify as boolean
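
For reference, a `[cregen]` block that simply restates the defaults listed in the table would read as follows; setting `eqv` to `false` is an assumption about its default.

```toml
[cregen]
ewin = 6.0     # energy window, kcal/mol
rthr = 0.125   # RMSD threshold, Angstroem
ethr = 0.05    # energy threshold between conformers
bthr = 0.01    # rotational constant threshold
eqv = false    # assumed default: no nuclear equivalence analysis
```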

Example Input Files - Examples for CREST input files in the TOML format.
