DOGSLED: Data, Ontologies, and Graphs Supporting Learning and Enhanced Discovery - Abstract

The NCATS Biomedical Data Translator (“Translator”) aims to augment human reasoning and accelerate scientific discovery through a federated system that integrates a broad range of biomedical data and knowledge, and reasons over them to answer translational science questions. During the Development phase (Phase II), the Translator program successfully implemented a system capable of answering certain types of clinical and translational questions. We propose advancements to make Translator an even more effective and compelling resource that will attract a broad and deep community of biomedical researchers.

To achieve this transformation, we propose DOGSLED (Data, Ontologies, and Graphs to Support Learning and Enhance Discovery). DOGSLED will build on the best elements of the Phase II system, many of which were developed by members of our team, while improving breadth, integration, efficiency, explainability, usability, and sustainability. During Phase II of Translator, as members of the Ranking Agent, Exposures Provider, and Standards and Reference Implementation (SRI) teams, we worked with the Translator Consortium to build and integrate the ARAGORN Reasoning Agent, the ICEES Knowledge Provider, the Node Normalizer, and the Biolink Model. Building on that work, the DOGSLED team will collaborate with other proposed teams such as DOGSURF and ARAX-MGKG2, should they be awarded funding, to advance Translator to the next level, catalyzing user uptake and satisfaction.

Our planned improvements center on performance, functionality, and transparency. Aim 1 (Create a Performant, Scalable, Reproducible Translator) involves improving reliability and performance by centralizing and unifying data ingest, data processing, and deployment in an integrated infrastructure component called BioPack.
In addition to improving the efficiency of the system itself, this work will streamline and standardize the development process, reducing demands on future developers and making Translator more sustainable and extensible.

To realize Aim 2 (Expand the Functionality of Translator), we will support new query types, leverage underutilized KPs, ingest or make better use of new and existing biomedical and clinical knowledge sources, and improve reasoning approaches. We will leverage large language models to enable users to add their own data in the form of publications and other text-based information as well as to query Translator using natural language.

To achieve Aim 3 (Make Translator Fully Transparent to Users), we will track provenance at every stage, from initial data ingest all the way to ranked, evidence-supported answers to user queries. This will feed into improvements in answer scoring and will enable the system to provide better explanations to users.

These advances will significantly expand the range of queries that users will be able to ask of the system, build confidence in the answers, improve system performance, and position Translator to keep pace with future developments in biomedical science. In concert with a multi-pronged user engagement and outreach strategy inspired by other successful consortia, the DOGSLED team will greatly expand Translator’s user base and help the program move toward its vision of Translator as a transformative scientific discovery tool used by a growing number of researchers.