Self-driving computational laboratory for high-throughput screening of biosynthetic materials, powered by LUMI

VTT, the Technical Research Centre of Finland, is developing a self-driving computational lab. It combines high-throughput molecular dynamics (MD) simulations on the LUMI supercomputer, structure generators, and a generative deep learning model for PHA polymers, trained on LUMI, into a complete optimization loop.

The self-driving computational lab is being developed at VTT’s ProperTune Soft Condensed Materials group. It will become part of the VTT Synbio-MAP platform, which combines biotechnology with physics-based models and AI tools to enable large-scale screening, discovery, and design.

Image: VTT

Developing bio-based plastics

Polyhydroxyalkanoates (PHAs) are candidates for replacing oil-based plastics with bio-based, biodegradable materials. These polymers can be produced with synthetic biology. The family is highly versatile, as its members can be composed of over 150 different monomers. Because the number of possible compositions grows combinatorially, experiments or atomistic simulations can cover only a small fraction of the polymers and their blends. Currently, only four PHAs are in commercial use. Acceleration and automation are needed.
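To give a feel for the size of this search space, the short sketch below counts monomer combinations and chain sequences. It is only back-of-the-envelope arithmetic: the figure of 150 monomers is taken from the text as a lower bound, and the function names are illustrative.

```python
from math import comb

N_MONOMERS = 150  # "over 150 different monomers" in the text, used here as a lower bound


def monomer_combinations(k: int, n: int = N_MONOMERS) -> int:
    """Ways to pick k distinct monomer types, ignoring order and mixing ratios."""
    return comb(n, k)


def chain_sequences(length: int, n: int = N_MONOMERS) -> int:
    """Ordered monomer sequences for a chain of the given length."""
    return n ** length


if __name__ == "__main__":
    for k in (1, 2, 3, 4):
        print(f"{k} monomer type(s): {monomer_combinations(k):,} combinations")
    print(f"10-unit sequences: {chain_sequences(10):.3e}")
```

Even before mixing ratios, chain lengths, or blends are considered, pairs of monomers already give more than ten thousand combinations, and triples more than half a million.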

The system under development is a human-supervised, self-learning autonomous lab. It scans the wide PHA sequence-property space autonomously and optimizes PHAs for given properties. To do so, it connects several simulation and AI components into an automated workflow that makes optimal use of large computational resources.
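The idea of a closed optimization loop can be sketched in a few lines. This is a toy illustration under stated assumptions, not VTT's implementation: `toy_property` stands in for a property computed with MD, `propose` stands in for the structure generator and generative model, and the single "comonomer fraction" variable is hypothetical.

```python
from __future__ import annotations

import random


def toy_property(fraction: float) -> float:
    """Hypothetical stand-in for an MD-computed property; best value at fraction = 0.3."""
    return -(fraction - 0.3) ** 2


def propose(best: float, rng: random.Random, n: int = 8, width: float = 0.1) -> list[float]:
    """Hypothetical stand-in for the generator: sample candidates near the current best."""
    return [min(1.0, max(0.0, best + rng.uniform(-width, width))) for _ in range(n)]


def optimize(rounds: int = 20, seed: int = 0) -> tuple[float, float]:
    """Run the generate -> simulate -> select loop and return the best candidate found."""
    rng = random.Random(seed)
    best_x = rng.random()                 # random starting composition
    best_y = toy_property(best_x)
    for _ in range(rounds):
        for x in propose(best_x, rng):    # generate candidates
            y = toy_property(x)           # "simulate" each candidate
            if y > best_y:                # keep the best result seen so far
                best_x, best_y = x, y
    return best_x, best_y


if __name__ == "__main__":
    x, y = optimize()
    print(f"best fraction: {x:.3f}, property: {y:.5f}")
```

In a real system each evaluation is an expensive simulation, so the proposal step matters much more than in this toy: a learned model is there to reduce the number of simulations needed.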

The properties of the polymer blends are computed in atomic detail using molecular dynamics simulations. GROMACS and LAMMPS are the leading HPC tools for this, and both are available to researchers on LUMI. Both have been integrated into the VTT Synbio-MAP platform, which makes them easier for researchers to use. Eventually, they will be used to compute the thermal, mechanical and rheological properties of PHAs. A version focusing on material proteins and their material applications is being implemented. Higher-level automation and procedure encapsulation will make HPC capacity accessible to a larger audience and help researchers use the resources optimally.

Image: PHA polymer properties are affected by the monomer backbone and side chain structure, the entanglement of the polymer chains in the amorphous phase, and the structure of the crystallites. The figure shows the amorphous structure (unwrapped coordinates) of 3-hydroxyhexanoate (3HHx). Copyright: VTT

The system is orchestrated with the AiiDA workflow manager, which forms its base and handles the information flow between components. AiiDA has been used on LUMI from the beginning, and it performs well in very high-throughput production use. Not surprisingly, it has been used in many large-scale Materials Acceleration Platforms (MAPs).

The VTT Synbio-MAP platform tool is based on a modular design, which makes it extendable. The core data calculated using physics-based models, as well as derived properties, will be available for further analysis in new modules. New modules, such as simulation and analysis tools, can be added to enhance capabilities, and the workflow control parts can be reused for other applications.
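One common way to get this kind of extendability is a module registry, sketched below. This is a generic illustration and not the platform's actual API: `REGISTRY`, `register`, `run_all`, and the example properties are all hypothetical names.

```python
from typing import Callable, Dict

# Hypothetical module type: takes the data computed so far, returns new derived properties.
Module = Callable[[Dict[str, float]], Dict[str, float]]
REGISTRY: Dict[str, Module] = {}


def register(name: str) -> Callable[[Module], Module]:
    """Decorator that adds an analysis module to the registry under the given name."""
    def decorator(func: Module) -> Module:
        REGISTRY[name] = func
        return func
    return decorator


@register("density")
def density(data: Dict[str, float]) -> Dict[str, float]:
    """Example derived property computed from core data: mass divided by volume."""
    return {"density": data["mass"] / data["volume"]}


def run_all(core: Dict[str, float]) -> Dict[str, float]:
    """Run every registered module in registration order on core plus earlier results."""
    results = dict(core)  # copy so the caller's core data is not modified
    for module in REGISTRY.values():
        results.update(module(results))
    return results


if __name__ == "__main__":
    print(run_all({"mass": 10.0, "volume": 4.0}))
```

With this pattern, adding a capability means registering one more function; the control code in `run_all` stays unchanged and could be reused for a different application.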

Image: The self-running computational laboratory can drastically speed up the optimization of PHA materials for desired property combinations. Copyright: VTT

Further information about the software used:

GROMACS is the most used molecular dynamics software in the world. Several versions of it are available as preinstalled modules on LUMI, with detailed running examples. In addition to life science use cases, such as lipid membranes, proteins and drug-like molecules, GROMACS can be used to study materials such as polymers. Thanks to its scalable implementation and algorithms, GROMACS efficiently converts computing capacity into simulations of large model systems, and it also offers state-of-the-art performance on high-throughput loads. Its permissive license makes it suitable for high-throughput industrial applications. Links: GROMACS home page: https://www.gromacs.org/ GROMACS on LUMI: https://docs.csc.fi/apps/gromacs/#lumi

LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator) is a free and open-source molecular dynamics program developed by Sandia National Laboratories. It is suitable for high-throughput and massively parallel simulations on LUMI utilizing both CPU and GPU resources. It can use a variety of different force fields to accurately describe the interatomic forces. https://www.lammps.org/

AiiDA is an open-source Python infrastructure that helps researchers automate, manage, persist, share and reproduce the complex workflows associated with modern computational science, along with all associated data. AiiDA was one of the software packages used in the pilot phase of LUMI. It is developed by a large consortium of European sites and funded by EU projects. https://www.aiida.net/

BioExcel-CoE develops biomolecular simulation software, e.g., GROMACS, and supports a wide user community, including collaboration with industrial customers. It offers a large portfolio of training materials on how to apply the software, e.g., using GROMACS efficiently on LUMI GPUs (https://zenodo.org/records/10683366), and even provides a direct link to the core developers. BioExcel also actively surveys the user community to identify the needs for new functionalities. For example, PLUMED functionality was recently integrated into GROMACS, which will, e.g., open new opportunities to speed up sampling of the polymer conformational space. https://bioexcel.eu/

Authors: Atte Sillanpää, Development Manager at Science Support, CSC, and BioExcel-CoE Training WP lead

Olli Pakarinen, Senior Scientist at VTT’s ProperTune Soft Condensed Materials group.

Anssi Laukkanen, Research Professor at VTT on Computational Materials and Data Sciences