Welcome to CatLearn’s documentation!
CatLearn provides utilities for building and testing atomistic machine learning models for surface science and catalysis.
Note
This is part of the SUNCAT Center’s code base for understanding materials for catalytic applications. Other code is hosted on the center’s GitHub repository.
CatLearn provides an environment for applying machine learning in materials science and catalysis. Workflows are typically expected to use the Atomic Simulation Environment (ASE) or NetworkX graphs. Through close coupling with these codes, CatLearn can generate numerous embeddings for atomic systems. In addition to generating feature spaces suited to a range of problems, CatLearn provides functions for model optimization. Gaussian process (GP) regression routines are also implemented, with additional functionality over standard implementations such as the one in scikit-learn. A more detailed explanation of how to use the code can be found in the Tutorials folder.
To featurize ASE atoms objects, the following lines of code can be used:
import ase
from ase.cluster.cubic import FaceCenteredCubic
from catlearn.featurize.setup import FeatureGenerator
# First generate an atoms object.
surfaces = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]
layers = [6, 9, 5]
lc = 3.61000
atoms = FaceCenteredCubic('Cu', surfaces, layers, latticeconstant=lc)
# Then generate some features.
generator = FeatureGenerator(nprocs=1)
features = generator.return_vec(
    [atoms], [generator.eigenspectrum_vec, generator.composition_vec])
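Generated feature matrices often need a cleaning pass before they are passed to a model. The following is a minimal sketch in plain NumPy, not CatLearn's own preprocessing routines. It assumes the features are arranged with one row per structure; the function name `clean_and_standardize` and the stand-in matrix are hypothetical and only serve as illustration.

```python
import numpy as np


def clean_and_standardize(matrix):
    """Drop non-finite and constant columns, then scale the rest.

    Returns the scaled matrix (zero mean, unit variance per column)
    and the indices of the columns that were kept.
    """
    matrix = np.asarray(matrix, dtype=float)
    # Columns containing nan or inf cannot be used directly.
    finite = np.all(np.isfinite(matrix), axis=0)
    # Only compute the spread for finite columns; others stay at zero.
    std = np.zeros(matrix.shape[1])
    std[finite] = matrix[:, finite].std(axis=0)
    # Constant columns carry no information and would divide by zero.
    keep = finite & (std > 0.0)
    kept = matrix[:, keep]
    scaled = (kept - kept.mean(axis=0)) / kept.std(axis=0)
    return scaled, np.flatnonzero(keep)


# Stand-in for a generated feature matrix: 4 structures, 4 features.
example = np.array([
    [1.0, 5.0, np.nan, 0.2],
    [2.0, 5.0, 1.0, 0.4],
    [3.0, 5.0, 2.0, 0.1],
    [4.0, 5.0, 3.0, 0.3],
])
scaled, kept_columns = clean_and_standardize(example)
```

In a real workflow the column mask, means, and standard deviations would normally be computed on the training set only and then reused for the test set.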
In the most basic form, it is possible to set up a GP model and make some predictions using the following lines of code:
import numpy as np
from catlearn.regression import GaussianProcess
# Define some input data.
train_features = np.arange(200).reshape(50, 4)
target = np.random.random_sample((50,))
test_features = np.arange(100).reshape(25, 4)
# Set up the kernel.
kernel = [{'type': 'gaussian', 'width': 0.5}]
# Train the GP model.
gp = GaussianProcess(kernel_list=kernel, regularization=1e-3,
                     train_fp=train_features, train_target=target,
                     optimize_hyperparameters=True)
# Get the predictions.
prediction = gp.predict(test_fp=test_features)
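To give a sense of what a GP prediction with a Gaussian kernel involves, the following is a minimal NumPy sketch of the textbook posterior mean for a zero-mean GP. It is not CatLearn's implementation: the helper names are hypothetical, the hyperparameters are fixed rather than optimized, and the exact conventions CatLearn uses for `width` and `regularization` may differ from the ones assumed here.

```python
import numpy as np


def gaussian_kernel(a, b, width):
    """Squared exponential kernel between the rows of a and b."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / width ** 2)


def gp_posterior_mean(train_x, train_y, test_x, width, regularization):
    """Posterior mean of a zero-mean GP.

    Assumption: the regularization is added directly to the diagonal
    of the training covariance matrix.
    """
    k_train = gaussian_kernel(train_x, train_x, width)
    k_train += regularization * np.eye(len(train_x))
    k_test = gaussian_kernel(test_x, train_x, width)
    # Solve the linear system instead of forming an explicit inverse.
    weights = np.linalg.solve(k_train, train_y)
    return k_test @ weights


rng = np.random.default_rng(0)
train_x = rng.uniform(0.0, 1.0, size=(50, 4))
train_y = np.sin(train_x.sum(axis=1))
test_x = rng.uniform(0.0, 1.0, size=(25, 4))

mean = gp_posterior_mean(train_x, train_y, test_x,
                         width=0.5, regularization=1e-3)
```

The sketch draws its inputs from the unit interval so that a width of 0.5 is comparable to the distances between points. With unscaled features, the kernel width needs to be chosen or optimized accordingly.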
CatLearn includes a range of functionality to assist in handling atomic data and building optimal models, including:
- API to other codes:
  - Atomic Simulation Environment (ASE) API
  - Magpie API
  - NetworkX API
- Fingerprint generators:
  - Bulk systems
  - Support/slab systems
  - Discrete systems
- Preprocessing routines:
  - Data cleaning
  - Feature elimination
  - Feature engineering
  - Feature extraction
  - Feature scaling
- Regression methods:
  - Regularized ridge regression
  - Gaussian process regression
- Cross-validation:
  - K-fold CV
  - Ensemble k-fold CV
- Optimize:
  - Machine Learning Accelerated Nudged Elastic Band (ML-NEB)
- General utilities:
  - K-means clustering
  - Neighborlist generators
  - Penalty functions
  - SQLite db storage
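As an illustration of how two of the listed pieces fit together, the following is a minimal NumPy sketch of k-fold cross-validation around a closed-form ridge regression. It does not use CatLearn's own routines; the function names `ridge_fit` and `k_fold_rmse` and the synthetic data are hypothetical and stand in for a real feature matrix and target vector.

```python
import numpy as np


def ridge_fit(features, targets, alpha):
    """Closed-form ridge weights: (X^T X + alpha I)^-1 X^T y."""
    n_features = features.shape[1]
    gram = features.T @ features + alpha * np.eye(n_features)
    return np.linalg.solve(gram, features.T @ targets)


def k_fold_rmse(features, targets, alpha, k=5, seed=0):
    """Average test RMSE over k shuffled folds."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(targets))
    folds = np.array_split(indices, k)
    errors = []
    for i, test_idx in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train_idx = np.concatenate(
            [fold for j, fold in enumerate(folds) if j != i])
        weights = ridge_fit(features[train_idx], targets[train_idx], alpha)
        residual = features[test_idx] @ weights - targets[test_idx]
        errors.append(np.sqrt(np.mean(residual ** 2)))
    return float(np.mean(errors))


# Synthetic linear data with a small amount of noise.
rng = np.random.default_rng(1)
features = rng.normal(size=(60, 3))
true_weights = np.array([1.5, -2.0, 0.5])
targets = features @ true_weights + 0.01 * rng.normal(size=60)

score = k_fold_rmse(features, targets, alpha=1e-3, k=5)
```

Repeating the call over a grid of `alpha` values and keeping the one with the lowest score is one simple way to select the regularization strength.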
- Installation
- Changelog
  - Version 0.6.1 (April 2019)
  - Version 0.6.0 (January 2019)
  - Version 0.5.0 (October 2018)
  - Version 0.4.4 (August 2018)
  - Version 0.4.3 (May 2018)
  - Version 0.4.2 (May 2018)
  - Version 0.4.1 (April 2018)
  - Version 0.4.0 (April 2018)
  - Version 0.3.1 (February 2018)
  - Version 0.3.0 (February 2018)
  - Version 0.2.1 (February 2018)
  - Version 0.2.0 (January 2018)
  - Version 0.1.0 (December 2017)
- Contributing
- catlearn.api
- catlearn.cross_validation
- catlearn.featurize package
- catlearn.fingerprint package
  - Submodules
  - catlearn.fingerprint.adsorbate module
  - catlearn.fingerprint.bulk module
  - catlearn.fingerprint.chalcogenide module
  - catlearn.fingerprint.convoluted module
  - catlearn.fingerprint.graph module
  - catlearn.fingerprint.molecule module
  - catlearn.fingerprint.particle module
  - catlearn.fingerprint.prototype module
  - catlearn.fingerprint.standard module
  - catlearn.fingerprint.voro module
  - Module contents
- catlearn.ga
- catlearn.learning_curve
- catlearn.preprocess
- catlearn.regression
  - catlearn.regression.gpfunctions
    - catlearn.regression.gpfunctions.covariance
    - catlearn.regression.gpfunctions.default_scale
    - catlearn.regression.gpfunctions.hyperparameter_scaling
    - catlearn.regression.gpfunctions.io
    - catlearn.regression.gpfunctions.kernel_scaling
    - catlearn.regression.gpfunctions.kernel_setup
    - catlearn.regression.gpfunctions.kernels
    - catlearn.regression.gpfunctions.log_marginal_likelihood
    - catlearn.regression.gpfunctions.sensitivity
    - catlearn.regression.gpfunctions.uncertainty
  - catlearn.regression.cost_function
  - catlearn.regression.gaussian_process
  - catlearn.regression.ridge_regression
  - catlearn.regression.scikit_wrapper
- catlearn.active_learning package
- catlearn.estimator package
- catlearn.optimize package
  - Submodules
  - catlearn.optimize.constraints module
  - catlearn.optimize.convergence module
  - catlearn.optimize.functions_calc module
  - catlearn.optimize.get_real_values module
  - catlearn.optimize.io module
  - catlearn.optimize.mlneb module
  - catlearn.optimize.tools module
  - catlearn.optimize.warnings module
  - Module contents
- catlearn.utilities