High-Performance Computing for HIV Research Excellence (HPC4HIV) - PROJECT SUMMARY/ABSTRACT
As the HIV epidemic has continued to evolve, so too have HIV research priorities. In Uganda, those priorities have gravitated towards a better understanding of co-infections and comorbidities. This focus has led to an avalanche of genomic data for the pathogens involved in opportunistic infections, social and behavioral data, and the corresponding clinical and epidemiological electronic health records. This growth in the volume and complexity of data means that appropriate, well-capacitated computational platforms are increasingly an essential prerequisite for effective HIV research. Under this G11 grant, the African Center of Excellence in Bioinformatics and Data Intensive Sciences (ACE) seeks to strengthen its capacity to support Ugandan biomedical research in general and HIV research in particular. We intend to build on initial NIAID support, which installed a High Performance Computing (HPC) cluster, by training four local HPC support staff in a range of HPC competencies. Each will receive in-person, hands-on training for three consecutive months at the Texas Advanced Computing Center (TACC), a partner HPC center that provides technical, hardware and strategic advice to ACE. Training competencies will include: data protection and systems security; HPC resource management; performance analysis and tuning of applications for HPC systems; HPC software management; Linux/Unix systems administration; HPC system architectures; managing large datasets; and parallel computing models. Following the completion of their training at TACC, trainees will be guided in applying their new expertise to implement improvements and innovations that bolster the performance and capacity of the HPC systems in Uganda.
Finally, they will be embedded into HIV research teams at the Infectious Diseases Institute (IDI) to integrate HPC capabilities into local research programs aligned with HIV research priorities. Specifically, they will: 1) learn about the nature of HIV data and the analyses it requires; 2) identify the specific computational, algorithmic, storage and software needs of HIV research; 3) configure the infrastructure to meet those needs; and 4) provide specialized training to HIV researchers on best practices for HPC-based HIV data analysis. The HPC cluster will thus be optimized to facilitate the processing and analysis of large-scale HIV longitudinal and program datasets, including genomic, epidemiological, and clinical data. Enhancing capacity in high performance computing will enable: i) comprehensive analysis of human and HIV genomes to screen for variants important to various HIV phenotypes; ii) use of molecular surveillance data to monitor and predict trends in HIV drug resistance; and iii) application of data-intensive machine learning and AI algorithms to clinical, epidemiological and social data to predict trends or behaviors in key populations in HIV research. The skills, concepts and frameworks developed in the program will in future provide a template for bringing HPC platforms to bear on other local research, including cancer, TB, AMR and cardiovascular diseases.