ESAB - European Society of Applied Biocatalysis - Expert curated knowledge resources for biocatalysis

Expert curated knowledge resources for biocatalysis

Abstract

Expert curated knowledge resources are helping to power an AI revolution in biocatalysis, with the emergence of new methods to predict enzyme structure and function from sequence and to design entirely new enzymes and pathways never seen in nature. In the first part of this talk we will explore some of the knowledge resources for biocatalysis developed in our group. These include the UniProt Knowledgebase (UniProtKB, at www.uniprot.org), a reference resource of protein sequences and functional annotation covering over 240 million protein sequences from all branches of the tree of life, and Rhea (www.rhea-db.org), an expert curated knowledgebase of biochemical reactions based on the chemical ontology ChEBI (www.ebi.ac.uk/chebi/). AI methods have enormous potential for biocatalysis research, and indeed for almost all fields of scientific endeavour, but they ultimately rely on expert curated knowledgebases like UniProtKB, Rhea, and others to provide a reliable ground truth for training and benchmarking. Biocuration resources are scarce, however, and we cannot cover the whole scientific literature. Large Language Models (LLMs) such as GPT-4 may provide one route to better scaling expert curation by automatically extracting structured knowledge from the scientific literature, but they are themselves prone to errors or “hallucinations”. In the second part of this talk we will look at the development of EnzChemRED, a new curated domain-specific literature dataset for LLMs and other NLP methods, which can boost the ability of LLMs to extract knowledge of enzyme functions from publications (https://arxiv.org/abs/2404.14209). These approaches may help us realize our ultimate goal, which is to capture the entire literature on enzyme functions in FAIR open knowledgebases like UniProtKB and Rhea.

About the Speaker(s)

Dr. Alan Bridge is director of the Swiss-Prot group at the SIB Swiss Institute of Bioinformatics. A biologist by training, he joined SIB in 2004 as a biocurator following post-doctoral studies at the Swiss Institute for Experimental Cancer Research (ISREC). He is a co-principal investigator of the UniProt resource of protein sequences and functional annotation, and principal investigator for the Rhea knowledgebase of biochemical reactions, the ENZYME resource for enzyme nomenclature, the SwissLipids knowledgebase for lipids and lipidomics, and the PROSITE and HAMAP resources for protein classification and annotation.
