Assessing and improving the resilience of bioinformatics pipelines to data errors - Project summary

Despite significant investments in data curation, errors and other analytical artifacts abound in public biological databases. These errors can propagate through bioinformatics software and produce incorrect analytical results. In some applications, such as diagnostics and other clinical settings, incorrect results can have serious consequences. Data generation has far outpaced the rate at which biologists can experimentally characterize molecules and organisms or curate computer-generated annotations. Simply put, errors cannot be eliminated in any practically meaningful way. Furthermore, the problems caused by low data quality are not always apparent and can therefore be overlooked by practitioners, even those who are experts in their own field. This project aims to develop a better understanding of how data interact with the algorithms used in bioinformatics research to create the potential for analytical errors. This understanding will also lead to the development of algorithms and strategies that can protect analytical results from errors in the reference data or in the input provided to the analysis.