Thinking outside the cell: Leveraging HuBMAP data to build the human ECM atlas - Project summary

The extracellular matrix (ECM) is a complex meshwork of hundreds of proteins that constitute the scaffold that holds our cells together. However, the functions of the ECM extend far beyond its structural roles. ECM proteins provide biochemical signals, either directly, by binding to cell surface receptors, or indirectly, by modulating growth factor signaling, that regulate many essential pathways controlling cellular functions, from proliferation and survival to migration and differentiation, all key to tissue and organ functions. Alteration of the ECM is linked to many diseases, including congenital diseases (e.g., Marfan syndrome, Alport syndrome, Ehlers–Danlos syndrome), musculoskeletal diseases (e.g., osteoarthritis, myopathies), cardiovascular diseases, fibrosis, and cancer. Yet, despite its importance, the ECM remains largely underexplored. For example, we have yet to decipher the ECM protein composition (or “matrisome”) of organs, of tissues, and, within tissues, of specialized niches. We also do not fully understand which cell types produce which ECM proteins, nor do we know how the composition of the ECM changes over time and during disease. These gaps in knowledge are mainly due to the lack of adequate methods to study the ECM. The secretion of ECM proteins and the post-translational modifications that accumulate in the ECM over time are critical for proper ECM function and cannot be fully studied through RNA-level observations alone. Thus, protein-level evidence is key to understanding the function and dynamics of the ECM. However, ECM proteins, being typically very large, heavily post-translationally modified, and, overall, highly insoluble, are under-represented in global proteomic datasets.
We propose to fill these gaps in knowledge by contributing our expertise in ECM biology, ECM proteomics, and computational biology to the technology-development and mapping efforts of the Human BioMolecular Atlas Program (HuBMAP), and ultimately to build spatially resolved maps of the matrisome of all organs. To achieve this goal, we will pursue the following aims: 1) re-analyze the vast amount of single-cell RNA-seq data generated by HuBMAP to identify the cell populations expressing ECM and ECM receptor gene transcripts in all organs, 2) integrate existing imaging and mass spectrometry data generated by HuBMAP to build a model that predicts protein co-expression and to create spatially resolved tissue maps of the ECM, and 3) contribute our 10+ years of expertise in ECM proteomics to ensure that future data collection by HuBMAP members effectively captures ECM-relevant information. For our efforts to benefit the entire scientific community, we will deploy all datasets and technologies via the HuBMAP portal and via MatrisomeDB, the ECM protein knowledge database we have previously developed. This mapping effort will constitute a first step toward understanding the roles of the ECM in health and disease and toward the development of future ECM-focused diagnostic and therapeutic strategies.