Explainable Machine Learning for Identifying the Full Heterogeneity of Peptidoforms and Proteoforms
explAInProt aims to enhance proteomics by developing explainable, end-to-end machine learning models to identify undetected protein variants and improve clinical applications through advanced sequencing methods.
Project details
Introduction
Mass spectrometry-driven proteomics provides deep insight into the workings of cells. Still, the vast majority of proteoforms, which represent the full heterogeneity of molecular forms of protein products in a sample, currently remain undetected in proteomics experiments.
Limitations
This lack of information strongly restricts our knowledge of disease progression, possible biomarkers, and therapeutic targets across a large number of diseases. Several machine learning approaches have been developed for proteomics data, but because they are not trained end-to-end, they cannot capture the full wealth of proteomic mass spectra and commonly remain unexplained black boxes.
Project Goals
Within explAInProt, my team and I will develop representations of spectra that allow explainable, end-to-end machine learning models to be deployed on the wealth of available proteomic data, covering both bottom-up and top-down spectra, to identify novel protein variants.
Importance of Explanations
Explanations will allow identifying the origin of predictions and help reduce bias, building up the trustworthiness of AI systems required for clinical applications.
Verification Strategies
To verify results, we will pioneer orthogonal real-time strategies based on selective sequencing approaches and calling of amino acids that we will introduce for nanopore sequencing devices as a complementary acquisition method.
Expected Outcomes
Combined, this will allow us to drastically increase our knowledge of the current dark matter of mass spectrometry-driven proteomics: proteins and peptides that are non-canonically modified, are non-tryptic, potentially carry multiple amino acid substitutions, have no close match in databases, or result from structural variants such as fusion proteins, and that remain undetected in current analyses.
Applicability
We will highlight applicability in two areas of particular concern in current approaches:
- The detection of structural variants in proteomic mass spectra
- The characterization of novel microbial organisms without sufficient database information
Financial details & Timeline
Financial details
Grant amount | € 1.992.500 |
Total project budget | € 1.992.500 |
Timeline
Start date | 1 December 2024 |
End date | 30 November 2029 |
Grant year | 2024 |
Partners & Locations
Project partners
- HASSO-PLATTNER-INSTITUT FUR DIGITAL ENGINEERING GGMBH (coordinator)
Similar projects within European Research Council
Project | Description | Scheme | Amount | Year |
---|---|---|---|---|
MANUNKIND: Determinants and Dynamics of Collaborative Exploitation | This project aims to develop a game theoretic framework to analyze the psychological and strategic dynamics of collaborative exploitation, informing policies to combat modern slavery. | ERC STG | € 1.497.749 | 2022 |
Elucidating the phenotypic convergence of proliferation reduction under growth-induced pressure | The UnderPressure project aims to investigate how mechanical constraints from 3D crowding affect cell proliferation and signaling in various organisms, with potential applications in reducing cancer chemoresistance. | ERC STG | € 1.498.280 | 2022 |
Uncovering the mechanisms of action of an antiviral bacterium | This project aims to uncover the mechanisms behind Wolbachia's antiviral protection in insects and develop tools for studying symbiont gene function. | ERC STG | € 1.500.000 | 2023 |
The Ethics of Loneliness and Sociability | This project aims to develop a normative theory of loneliness by analyzing ethical responsibilities of individuals and societies to prevent and alleviate loneliness, establishing a new philosophical sub-field. | ERC STG | € 1.025.860 | 2023 |
Similar projects from other schemes
Project | Description | Scheme | Amount | Year |
---|---|---|---|---|
Learning Isoform Fingerprints to Discover the Molecular Diversity of Life | This project aims to revolutionize proteomics by developing a novel data analysis strategy using deep learning to discover and quantify protein isoforms through their unique multi-dimensional fingerprints (ORIGINs). | ERC STG | € 1.498.939 | 2023 |
A Native Mass Spectrometry Systemic View of Cellular Structural Biology | This project aims to enhance native mass spectrometry for studying protein interactions and diversity in their natural cellular environments, advancing structural biology and related fields. | ERC ADG | € 2.954.167 | 2023 |
Deep Spatial Proteomics: connecting cellular neighbourhoods to functional states | Developing Deep Spatial Proteomics (DSP) to link cellular neighborhoods to proteome states, aiming to uncover disease mechanisms and improve patient stratification in cancer immunotherapy. | ERC STG | € 1.470.851 | 2024 |
Precise, Rapid and Scalable Proteomics Solutions for Archaeology, Ecology, Wildlife Forensics and Food-chain Authentication | The PReciSe project aims to develop a fast, cost-effective proteomics method for taxonomic identification to enhance archaeological, ecological, and food supply chain verification. | ERC POC | € 150.000 | 2025 |