Enhanced Machine Learning Tools for Complex Data Evaluation and Integration in Advancing Health Outcomes - Abstract

Heart, lung, blood, and sleep (HLBS) disorders affect millions of Americans and result in significant costs to the US healthcare system. These disorders also increase the risk of developing other detrimental health consequences, such as diabetes, depression, and metabolic disorders. Although a large amount of data has been generated, limited progress has been made in combating HLBS diseases. Proper analysis of these invaluable data is crucial for an in-depth understanding of the causal clinical and physiological bases of complex diseases, leading to more effective interventions and sustainable disease management and prevention strategies. However, reliable analysis of practical biomedical data can be highly challenging because of observational study designs, which often lead to complex confounding structures. Also, HLBS disease etiologies are intricate, with various clinical, metabolic, neurochemical, and immune-inflammatory factors interacting to shape disease phenotypes. Additionally, heterogeneity in patient populations, diseases, and drug responses constitutes another critical challenge. These complexities could result in biased, inconsistent, or contradictory conclusions when conventional analytical tools are used to analyze practical biomedical data. To address these critical challenges, this proposal aims to adapt a set of robust data analytic and modeling tools based on novel machine learning methods.
Specifically, this study will 1) propose novel machine learning methods for adjusting complex confounding structures to reveal deep causal relationships between clinical features and health/disease outcomes; 2) build original deep learning frameworks for hidden subgroup analysis that handle complex heterogeneity and support optimal health management regimens tailored to individual subjects’ needs; and 3) establish a flexible and scalable feature importance testing framework to identify important biomarkers for improving the prediction of disease/health outcomes. These methods will be applied to the analysis and modeling of data from the Hispanic Community Health Study/Study of Latinos, the most comprehensive study of Hispanic/Latino health and disease in the United States. Ultimately, this project will elucidate an array of clinical bases of HLBS disorders at the population and individual levels, leading to improved human health.