Fuelling the Digital Revolution in Biocatalysis with Language Models


Abstract

One of the most important outcomes of organic chemistry is the creation of newly designed molecules. The application of domain knowledge gained through decades of laboratory experience has been critical in the synthesis of many new molecular structures. In recent years, natural language processing (NLP) models have emerged as one of the most effective, scalable approaches for capturing human knowledge and modelling chemical processes in organic chemistry. Their use in machine learning tasks has demonstrated high quality and ease of use in problems such as predicting chemical reactions [1-2], planning retrosynthetic routes [3], digitizing chemical literature [4], predicting detailed experimental procedures [5], designing new fingerprints [6] and predicting yields [7]. In this talk, I will discuss the impact of language models on biocatalysis, highlighting the critical role of NLP models in facilitating the adoption of enzymatic strategies in organic synthesis [8].

References
[1]
Philippe Schwaller, Théophile Gaudin, Dávid Lányi, Costas Bekas, Teodoro Laino, “Found in Translation”: predicting outcomes of complex organic chemistry reactions using neural sequence-to-sequence models. Chem. Sci. 9, 6091-6098 (2018). https://doi.org/10.1039/C8SC02339E.
[2]
Philippe Schwaller, Teodoro Laino, Théophile Gaudin, Peter Bolgar, Christopher A. Hunter, Costas Bekas, Alpha A. Lee, Molecular Transformer: A Model for Uncertainty-Calibrated Chemical Reaction Prediction. ACS Cent. Sci. 5(9), 1572-1583 (2019). https://doi.org/10.1021/acscentsci.9b00576.
[3]
Philippe Schwaller, Riccardo Petraglia, Valerio Zullo, Vishnu H. Nair, Rico Andreas Haeuselmann, Riccardo Pisoni, Costas Bekas, Anna Iuliano, Teodoro Laino, Predicting retrosynthetic pathways using transformer-based models and a hyper-graph exploration strategy. Chem. Sci. 11, 3316-3325 (2020). https://doi.org/10.1039/C9SC05704H.
[4]
Alain C. Vaucher, Federico Zipoli, Joppe Geluykens, Vishnu H. Nair, Philippe Schwaller, Teodoro Laino, Automated extraction of chemical synthesis actions from experimental procedures. Nat. Commun. 11, 3601 (2020). https://doi.org/10.1038/s41467-020-17266-6.
[5]
Alain C. Vaucher, Philippe Schwaller, Joppe Geluykens, Vishnu H. Nair, Anna Iuliano, Teodoro Laino, Inferring experimental procedures from text-based representations of chemical reactions. Nat. Commun. 12, 2573 (2021). https://doi.org/10.1038/s41467-021-22951-1.
[6]
Philippe Schwaller, Daniel Probst, Alain C. Vaucher, Vishnu H. Nair, David Kreutter, Teodoro Laino, Jean-Louis Reymond, Mapping the Space of Chemical Reactions Using Attention-Based Neural Networks. Nat. Mach. Intell. 3, 144-152 (2021). https://doi.org/10.1038/s42256-020-00284-w.
[7]
Philippe Schwaller, Alain C. Vaucher, Teodoro Laino, Jean-Louis Reymond, Prediction of chemical reaction yields using deep learning. Mach. Learn.: Sci. Technol. 2(1), 015016 (2021). https://doi.org/10.1088/2632-2153/abc81d.
[8]
Daniel Probst, Matteo Manica, Yves Gaetan Nana Teukam, Alessandro Castrogiovanni, Federico Paratore, Teodoro Laino, Biocatalysed synthesis planning using data-driven learning. Nat. Commun. 13, 964 (2022). https://doi.org/10.1038/s41467-022-28536-w.


About the Speaker(s)

Teo Laino received his Master's degree in theoretical chemistry in 2001 (University of Pisa and Scuola Normale Superiore di Pisa, Italy) and his doctorate in computational chemistry in 2006 (Scuola Normale Superiore di Pisa, Italy), defending a thesis on 'Multi-Grid QM/MM Approaches in ab initio Molecular Dynamics' supervised by Prof. Dr. Michele Parrinello. From 2006 to 2008, Teo worked as a post-doctoral researcher in the research group of Prof. Dr. Jürg Hutter at the University of Zurich, contributing to the development of the CP2K simulation package. In 2008, Teo joined the IBM Research-Zurich Laboratory (ZRL) as a Research Scientist. He is currently a Distinguished Research Scientist and manager. His research interests focus on developing machine learning and artificial intelligence technologies to digitalize chemistry and materials science, with IBM RXN for chemistry being an example of a recent community success. In 2022, the team received the Sandmeyer Award of the Swiss Chemical Society for its important contributions to the field of digital chemistry.


ESAB E-CONGRESS 2023

