ML Atomic Potentials¶

Using MatBench Discovery Models¶

Matbench Discovery is an interactive leaderboard which ranks popular ML models for inter-atomic potentials. Definition files for the top 20 models [1] can be found in the Images directory. These are also all setup with corresponding config files in Container_Configs/MatBench_Discovery such that you can simply refer to them by name.

To use a model, for example the current number two model, eSEN-30M-OAM, we first need to load the model with:

ml-toolkit load eSEN-30M-OAM

This will download some files and create a Container image eSEN-30M-OAM.sif in the Images directory. Note: this only needs to done the first time you use the model.

Once this is complete we can run something in the container using:

ml-toolkit run eSEN-30M-OAM $COMMAND

Where $COMMAND is the command you would like to run inside the container.

A brief note on using Models from MetaAi:¶

Unfortunately, due to MetaAi’s licencing terms you need both a Huggingface account and an API key to be able to download model checkpoints for use with ml-toolkit. Furthermore they have gone out of there way to prevent automation.

Thus if you attempt to build/use any of the models tagged MetaAi. that is any of the following:

eqV2-L-DeNS
eqV2-M-DeNS
eqV2-S-DeNS
eqV2-S
eSEN-30M-MP
eSEN-30M-OAM
eqV2-S-OAM
eqV2-M-OAM
eqV2_M
eqV2-L-OAM
eSEN-30M-OMat
eqV2-S-OMat
eqV2-M-OMat
eqV2-L-OMat
omat24
uma-s-1
uma-s-1p1
uma-m-1p1
uma

you will likely be greeted with an error similar to the following:

********************************************************************************
************************** Loading Model Config Files***************************
********************************************************************************
                        All config files look good
********************************************************************************
*** You have asked for a model that requires a HuggingFace API key to build.****
**** This needs to be provided in: ${ML_TOOLKIT_HOME}/API_Keys/HF_AUTH.key  ****
************************* See the docs for more details*************************

In which case you will need to do some manual setup by following the instructions under Using Models from MetaAi

Using the Atomic Simulation Environment¶

In this next example we will predict the atomic potential of a simple hydrogen molecule.

The vast majority of Models listed on Matbench Discovery can be used through python via the Atomic Simulation Environment (ASE).

In principle to get the predicted atomic potentials we simply need to setup an ASE calculator then use the get_potential_energy() method. The problem, however,is every model uses it’s own calculator with there own setup steps.

To make things easier in the Scripts directory we have provided a python function initialise_model() that takes in a string that is the name of the model you wish to use and returns an appropriate ASE calculator for that particular ML model.

As such the following is an example of how to predict and extract the potential of a hydrogen molecule and using ASE and a model from MatterSim:

# Example script to get potential of H2 molecule from with MatterSim through ASE
# This should be run inside the MatterSim container.

from Matbench_Models import Get_ASE_Calculator
from ase import Atoms
from ase.optimize import BFGS
from ase.calculators.nwchem import NWChem
from ase.io import write

# Setup the system with ASE, in this case a simple H2 molecule
h2 = Atoms('H2', positions=[[0, 0, 0],[0, 0, 0.7]])

# Tell ASE to use MatterSim as a Calculator
h2.calc = Get_ASE_Calculator('MatterSim_V1_5M')

# Do the calculations
opt = BFGS(h2)
opt.run(fmax=0.02)
write('H2.xyz', h2)
h2.get_potential_energy()

This should be run with the ml-toolkit as:

# change directory to to wherever ml-toolkit is installed
cd $ML_TOOLKIT_HOME
# This only needs to be done the first time
ml-toolkit build MatterSim_V1_5M

ml-toolkit run MatterSim_V1_5M python3 $ML_TOOLKIT_HOME/Scripts/Examples/H2_MatterSim.py

From here you could do some further analysis with ASE or convert the data for use with another tool. In our case we will move on to calculating charge density using CASTEP.

Interfacing with CASTEP¶

CASTEP can use an external routine to perform energy, force and stress calculations using ML models in two distinct ways:

The “File” method
The “Server” method

At this stage users should note that both of these methods are still considered experimental. As such they use a developer keyword. The current plan is to release them as an official keyword in CASTEP 27.

The File Method¶

Conceptually, this is the simplest approach. Whenever CASTEP need to update the energy, forces, and stresses it writes the current geometry to a .cell file. It then calls an external process. This process reads in the .cell file evaluate the energy, force and stress, writing the results to a .geom file. CASTEP then reads in this .geom file and carries on it merry way.

We note that Whilst it’s simple, this approach has a non-trivial overhead and likely will not scale well to more than a few mpi-processes. As such it is only really recommended for small scale testing. It is however the only option available if you are using CASTEP version 25 or older.

The Server Method¶

The method is new to CASTEP V26 (the latest version at the time of writing). It involves running a stand-alone application which both: evaluates the energy, force and stress; and acts as a network server. CASTEP can then talk to this application over the network and request updated values for energy, force and stress as needed.

This is more a bit more complex to setup but has three major advantages:

You can run multiple servers at once allowing you to easily scale over many mpi mpi-processes.
I uses the network for communication so the overhead will be minimal.
It’s more robust, the server and CASTEP are completely independent. As such if one or the other encounters a problem and crashes they can easily be restarted and start back from an earlier point in time.

CASTEP Examples¶

Please note this is not an extensive tutorial for CASTEP. We assume users already have some familiarity with CASTEP. Thus we will not be covering the basics of how to use it, or any of the theory behind these calculations.

If you need this CASTEP already has extensive documentation and we will adapting several of the tutorial examples to work with ML potentials. For users on Bede CASTEP is licenced on a project by project basis.

If your project has a licence there will be a folder available under /projects/{PROJECT_CODE}/castep26 containing copy CASTEP 26 complied for grace-hopper (if not you will need to ask the bede admin team for access). For anyone else you will need to install a recent version of CASTEP (preferably Version 26).

Charge density of Si lattice (file method)¶

We will start with a simple calculation used in the CASTEP charge density tutorial. It consists of a Si lattice made from a 3 atom unit cell. The CASTEP .cell file should be as follows:

%block lattice_abc
3.8 3.8 3.8
60 60 60
%endblock lattice_abc
!
! Atomic co-ordinates for each species.
! These are in fractional co-ordinates wrt to the cell.
!
%block positions_frac
Si 0.00 0.00 0.00
Si 0.25 0.25 0.25
%endblock positions_frac
!
! Analyse structure to determine symmetry
!
symmetry_generate
!
! Specify M-P grid dimensions for electron wave-vectors (K-points)
!
kpoint_mp_grid 4 4 4

and the .param file:

xc_functional : LDA
cutoff_energy : 1500 eV
#this next line causes the charge to be written in a den_fmt file
write_formatted_density : T
spin_polarised : false

Before using any ML models we recommend running this as is to get the silicon.den_fmt to compare against. We also recommand renaming this to file to make comparisons easier.

Users on bede will want to create a slurm script test_castep.sh to run on the compute nodes using the following template:

#!/bin/bash
# Example SLURM script for CASTEP
# with the ml-toolkit
##########################################
#SBATCH --account CHANGE_ME              # charge job to specified account
#SBATCH --cpus-per-task 1                # number of cpus required per task
#SBATCH --job-name castep_test           # name of job
#SBATCH --ntasks 1                       # number of processes required
#SBATCH -o castep_test.out               # File to redirect program output
#SBATCH -e castep_test.err               # File to redirect any errors
#SBATCH --time 10                        # time limit (in mins)
#SBATCH --partition ghtest               # Choose either gh or ghtest
#SBATCH --gres gpu:1                     # Number of GPUs required for the job
##########################################
# source the python virtual environment
source ~/.venv/ml-toolkit/bin/activate
module load castep
# change directory to where Si.param and Si.cell are located
cd /location/of/si.param
castep.serial Si

you can then submit the batch script with

sbatch castep_test.sh

Users on there own machines can just run CASTEP as you normally i.e.

# for serial
castep.serial Si
# or mpi parallel
mpirun -n 4 castep.mpi Si

Modification to work with ML Potentials¶

To modify this to work with an ML potential we you first we need to ensure you have activated your python virtual env if you have not already done so. [2] In the Scripts directory for ML_toolkit you will find two files: predict.sh and ML_file.py these will be used by CASTEP to call ml-toolkit at the appropriate time and tell it which model to use.

Next we need to modify the param file by adding the following devel_code block to the end of the Si.param file.

%block devel_code

PP=T

PP:

    # Run external script method
    EXT=T

    # Don't delete extra files generated for communication with external script
    EXT_CLEANUP=F

    # Command used to call the external script used to predict energy/stress/forces
    EXTCALL: ${ML_TOOLKIT_HOME}/Scripts/predict.sh MatterSim :ENDEXTCALL

ENDPP:

%endblock devel_code

This uses the pair-potential (PP) devel_code to tell CASTEP to get energy, stress and forces using an external script. The line EXT=T tells CASTEP to call an external command. EXT_CLEANUP=F tells it not cleanup the extra files at the end. You may wish to set this to true for larger runs to avoid clogging up the file system.

The final line of interest is EXTCALL: ${ML_TOOLKIT_HOME}/Scripts/predict.sh MatterSim :ENDEXTCALL This defines the command run by CASTEP each time it needs the energy, stress and forces. In this case it uses predict.sh in the ML_Toolkit/Scripts directory. You will also notice this defines which model to use. This can be any model you like, for this example we have chosen MatterSim.

Before running CASTEP we next need to ensure we have built an appropriate container for our ML model. As previously discussed we have chosen one from MatRIS for this example.

ml-toolkit build MatRIS_10M_OAM

Once this is complete we can run CASTEP as we did previously i.e.

sbatch castep_test.sh

If on Bede or

# for serial
castep.serial Si
# or mpi parallel
mpirun -n 4 castep.mpi Si

At this point you can compare the charge density to the original using Vesta as per the original tutorial. For users on Bede you will need to copy the files to your local machine and do the analysis there as Vesta is not available.

Bandstructure of Graphene (server method)¶

Next we’ll try something a bit more interesting the following .cell and .param files setup a calculation of the Bandstructure for a 1D Hexagonal lattice of carbon atoms, a.k.a Graphene.

#Graphene.param
task : spectral  !bandstructure
spectral_task : bandstructure

# Exchange correlation
xc_functional : PBE

# Plane wave cutoff
cut_off_energy : 500 eV

# SCF settings
max_scf_cycles : 100
elec_energy_tol : 1e-6

# Smearing (graphene is semi-metal)
smearing_width : 0.1 eV
smearing_scheme : gaussian

# Diagonalization
mixing_scheme : Pulay
mix_charge_amp : 0.5

# Output options
write_bands : true
write_orbitals : false

# Geometry
spin_polarized : false

# graphene.cell
%BLOCK LATTICE_CART
ang
2.460000  0.000000  0.000000
-1.230000  2.130422  0.000000
0.000000  0.000000 15.000000
%ENDBLOCK LATTICE_CART


%BLOCK POSITIONS_FRAC
C 0.000000 0.000000 0.000000
C 0.333333 0.666667 0.000000
%ENDBLOCK POSITIONS_FRAC


KPOINT_MP_GRID 12 12 1
KPOINT_MP_OFFSET 0 0 0


# High symmetry path for graphene band structure
%BLOCK SPECTRAL_KPOINT_PATH
0.0  0.0  0.0   ! Gamma
0.333333 0.333333 0.0   ! K
0.5  0.0  0.0   ! M
0.0  0.0  0.0   ! Gamma
%ENDBLOCK SPECTRAL_KPOINT_PATH


SPECTRAL_KPOINT_PATH_SPACING 0.02

After running this with CASTEP and plotting the resulting graphene.bands output file you should get output similar to the following:

example bandstructure for Graphene produced by CASTEP. Note: we used the dispersion.pl plotting script with the -sym=hexagonal parameter to get the correct labels for the path though the brillouin zone.¶

We will now modify the .parm file to tell CASTEP to get potentials, forces and stresses from an external server by adding a devel_code block to the .param file.

Note: the server method will only work for CASTEP version 26 users of older versions of CASTEP will need to either use the previously mentioned file method or upgrade to version 26.

# required for socket interface
socket_port 5000
socket_host 127.0.0.1

%BLOCK devel_code
PP=T
PP:
    SOC=T
:ENDPP
%ENDBLOCK devel_code

Once again, we are using the Pair-potential (PP) devel_code. However, first we use the socket_port and socket_host keywords to tell CASTEP where to find the external server on the network. In this case socket_port 5000 tells CASTEP to listen on network port 5000 [3]. socket_host 127.0.0.1 tell CASTEP the ip address [4] of the server on the local machine.

The remaining lines inside the devel_code block simply enables the PP devel_code and tells CASTEP to get potentials, forces and stresses: from an external process on the network.

Once the param file has been modified we next need to build a container with the ml-toolkit. In this example we will use one of the MetaAi Omat24 models eSEN-30M-OAM [5]

ml-toolkit build eSEN-30M-OAM

Once this is complete if you are on Bede you can use the following bash script as a template:

#!/bin/bash
# Example SLURM script for CASTEP
# with the ml-toolkit
##########################################
#SBATCH --account CHANGE_ME              # charge job to specified account
#SBATCH --cpus-per-task 1                # number of cpus required per task
#SBATCH --job-name castep_test           # name of job
#SBATCH --ntasks 1                       # number of processes required
#SBATCH -o castep_test.out               # File to redirect program output
#SBATCH -e castep_test.err               # File to redirect any errors
#SBATCH --time 10                        # time limit (in mins)
#SBATCH --partition ghtest               # Choose either gh or ghtest
#SBATCH --gres gpu:1                     # Number of GPUs required for the job
##########################################
# source the python virtual environment
source ~/.venv/ml-toolkit/bin/activate
module load castep
# change directory to where Graphene.param and Graphene.cell are located
cd /location/of/graphene.param
ml-toolkit start eSEN-30M-OAM
castep.serial Graphene
ml-toolkit stop eSEN-30M-OAM

or if running you are running locally you can just run the following to start ml-toolit as a server in the background. Then run CASTEP and finally shut down the server. Note: you may need to use the sleep command to wait for a few seconds between starting the server and CASTEP as the server can take a short while to actually start-up.

# change directory to where Graphene.param and Graphene.cell are located
cd /location/of/graphene.param
ml-toolkit start eSEN-30M-OAM
castep.serial Graphene
ml-toolkit stop eSEN-30M-OAM

Once complete you should get a number of output files. These include Graphene.bands which should produce a bandstructure plot similar to the following:

example bandstructure for Graphene produced by CASTEP using the eSEN-30M-OAM model for potential, forces and stresses.¶

Note: we used the dispersion.pl plotting script with the -sym=hexagonal parameter to get the correct labels for the path though the brillouin zone.

One final thing to note are the files eSEN-30M-OAM.out and eSEN-30M-OAM.err. These contain the output and any errors from the python server. They should (hopefully) be blank as there is no output by default. However in the event of any errors these, along with files in the TOOLKIT_HOME/logs directory are a good first place to look.

Additional Advanced Options¶

The server method has a few additional options that can be used to allow for more flexibility and help improve performance when running over many cpu cores with mpi. These options can be passed in as command line parameters to the ml-toolkit start command.

-p, --port PORT_NUM sets the TCP port used for network communication. Useful if port 5000 is already in use on your network. must be greater than 1024 default is port 5000.
-t, --timeout TIME Server timeout in seconds. The server will be restated if no communication occurs for set amount of time. This is useful for redundancy if the ML model should crash. Must be greater than 0, default is 60 seconds.
-N, --Nservers NUM_SERVERS Number of python servers to spawn, for best performance set this to the number of mpi processes, must be greater than 0. Default is 1
-r NUM_RETRY, --num_retry NUM_RETRY number of times to retry when waiting for python server. Default: 5
-T TASK, --task TASK Task to perform, required for all Meta UMA and selected SevenNet models, ignored by all others.

an example of how to use this is as follows:

ml-toolkit start eSEN-30M-OAM -N 6 -p 4124 -t 70

In this case we have asked for 6 severs communicating via TCP port 4124 each with a timeout of 70 seconds.

Si GA (server method)¶

Our final example is taken from the GA tutorial in the CASTEP docs. It consists of using a genetic algorithm to predict the most stable structure for a Si lattice (the correct answer for which is a diamond structure).

In this case the cell file already uses the PP devel code so it is easy enough to add in the extra bits needed to use an ML potential.

For reference the original .cell file is:

%block LATTICE_ABC
ang
3.8 3.8 3.8
90  90  90
%endblock LATTICE_ABC

%block POSITIONS_FRAC
Si  0.203  0.617  0.209
Si  0.844  0.442  0.350
Si  0.964  0.379  0.096
Si  0.762  0.524  0.941
Si  0.544  0.605  0.781
Si  0.238  0.597  0.531
Si  0.728  0.914  0.742
Si  0.209  0.929  0.435
%endblock POSITIONS_FRAC

%BLOCK SPECIES_POT
QC5
%ENDBLOCK SPECIES_POT

symmetry_generate
symmetry_tol : 0.05 ang

and the .param file is:

task             = genetic algor # Run the GA
ga_pop_size      = 12            # Parent population size
ga_max_gens      = 12            # Max number of generations to run for
ga_mutate_amp    = 1.00          # Mutation amplitude (in Angstrom)
ga_mutate_rate   = 0.15          # Probability of mutation to occur
ga_fixed_N       = true          # Fix number of ions in each member based on input cell

rand_seed        = 101213 # Random seed for replicability
opt_strategy     = SPEED  # Run quick

geom_max_iter    = 211 # Can have a large max iter as using pair potentials

# Don't write most output files for each population member
write_checkpoint = NONE
write_bib        = FALSE
write_cst_esp    = FALSE
write_bands      = FALSE
write_cell_structure = TRUE

######################################
# Any extra devel code options       #
# & required GA specific devel flags #
######################################

%block devel_code

# Command used to call CASTEP for each population member
# If not given this defaults to castep.serial
CMD: castep.serial :ENDCMD

GA:

    PP=T   # Using a pair potential
    IPM=M  # Randomly mutated initial population

    CW=24  # Num gens for convergence

    NI=F   # No niching
    FW=0.5 # Fitness weighting

    # Asynchronous running options
    # Required for asynchronous running, without this all geom opts will be run
    # one after another
    AS=T   # Run geometry optimisations asynchronously
    MS=3   # Run 3 geometry optimisations at once

    # Random symmetry children
    NUM_CHILDREN=12
    RSC=F

    CORE_RADII_LAMBDA=0.8 # Core radii 0.8 pseudo-potential radii

    SCALE_IGNORE_CONV=T   # Ignore convergence in fitness calculation

:ENDGA

# Use pair potentials in geometry optimisations and perform a final snap to symmetry
GEOM: PP=T SNAP=T :ENDGEOM

# Use the Stillinger-Weber pair potential
PP=T
PP:
    SW=T
:ENDPP

%endblock devel_code

all we need to do is modify lines 61-63 to add in SOC=T to the PP devel_code block: the keywords to define the socket interface.

PP:
    SW=T
    SOC=T
:ENDPP

then outside the devel_code block. (i.e. before line 24 %block devel_code or after line 65 %endblock devel_code) add in the extra keywords to define the network settings.

socket_port 5000
socket_host 127.0.0.1

From there to should be able to run CASTEP with an ML model in the same way as the previous example. i.e. If you are on Bede you can modify the batch script then submit with sbatch or with the following if you are running locally.

# change directory to where Graphene.param and Graphene.cell are located
cd /location/of/Si_GA.param
# in this case we are using a model from NequIP
ml-toolkit start NequIP-OAM-L
castep.serial Si_GA
ml-toolkit stop NequIP-OAM-L

One final note about Meta UMA models¶

The models from Meta UMA are not officially a part of Matbench discovery but have been included as they are Meta’s latest offering and thus may be of interest. Somewhat Annoyingly they are hybrid models i.e. each model has sub-models for performing specific tasks. Thus they have a slightly different workflow to the OMAT24 models.

As such when you start any of the following models:

uma-s-1
uma-s-1p1
uma-m-1p1
uma

You need to tell the model what task to perform. Thus when using them with CASTEP you will need to add an extra cmd argument -T / --task as follows:

# change directory to where .param and .cell are located
cd /location/of/<molecule>.param
ml-toolkit start uma-s-1 --task <taskname>
castep.serial <molecule>
ml-toolkit stop uma-s-1

where <taskname> can be any of:

oc20: use this for catalysis
oc22: use this for oxide catalysis (1p2 only)
oc25: use this for (electro)catalysis (1p2 only)
omat: use this for inorganic materials
omol: use this for molecules+polymers
odac: use this for MOFs
omc: use this for molecular crystals