Characterizing subphenotypes of long COVID in patients with sickle cell disease - Long COVID has emerged as a serious long-term complication of COVID-19, affecting ~10% of adults in the United States. Recent estimates suggest that approximately 7–8% of U.S. adults have experienced long COVID at some point, with about 3–4% currently affected. While these percentages may seem modest, they translate to millions of individuals experiencing persistent symptoms such as fatigue, pain, and cognitive difficulties, often leading to substantial impairment in daily functioning and quality of life. The burden is even greater in subsets of the population with preexisting health conditions such as sickle cell disease (SCD). SCD is a life-threatening disease affecting 1 in 365 African Americans (AA). It is characterized by chronic hemolytic anemia, vaso-occlusive crises, acute chest syndrome, and multiorgan damage. Evaluating long COVID in patients with SCD is particularly challenging because many symptoms of long COVID, including fatigue, pain, and organ dysfunction, overlap with chronic complications of SCD, making it difficult to distinguish between the two conditions. Research is limited, and there are no validated tools specifically designed to assess the long-term effects of COVID-19 in the SCD population, leading to gaps in understanding and under-recognition of long COVID in these patients. These knowledge gaps hinder our ability to implement targeted early interventions (e.g., surveillance, vaccines, antivirals) to mitigate the long-term impacts of COVID-19 in high-risk SCD patients. Consequently, the prevalence of and morbidity caused by long COVID, also termed post-acute sequelae of SARS-CoV-2 infection (PASC), are unknown in the SCD population. The proposed study will address these knowledge gaps using a data-driven strategy. Our main objective is to characterize the long-term outcomes of COVID-19 in SCD patients, with a specific focus on long COVID.
Our Aim 2 is to characterize the subphenotypes of long COVID in AA with SCD and determine whether they differ from those in the general population. We will leverage the NIH-initiated National COVID Cohort Collaborative (N3C), a harmonized electronic health record (EHR) repository. This registry is representative of the US population, with ~40 million patients [SCD patients, 15,169; COVID-positive SCD patients, 4,143]. For diagnosis and phenotyping of long COVID, we will conduct cluster analysis of individuals in the SCD with COVID-19 cohort who have a potential diagnosis of long COVID [ICD-10 code U09.9], using unsupervised machine learning (ML) methods and Human Phenotype Ontology-encoded EHR data. We will evaluate the association of cluster membership with a range of pre-existing comorbidities and with measures of acute COVID-19 severity. These analyses will identify the clinical characteristics of the subphenotype clusters of long COVID in AA with SCD and compare them with those in the general population. Knowledge gained from these analyses will advance precision diagnostics by applying ML, a subtype of artificial intelligence, to characterize subphenotypes of PASC in the SCD population. Upon completion, our study will provide real-world data to guide both clinical practice and public policymaking for preventing and managing long COVID in people with SCD. Future studies will focus on identifying the predictors of long COVID and understanding how it affects SCD comorbidities. Collectively, these discoveries will improve the management of long COVID in the SCD population.
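
The clustering and association steps described in Aim 2 can be illustrated with a minimal sketch. It is not the study's pipeline: the abstract does not name a specific algorithm, so average-linkage agglomerative clustering on Jaccard distance is assumed here purely for illustration. The patient IDs, phenotype labels (stand-ins for Human Phenotype Ontology terms), and the comorbidity flag are all hypothetical toy data, and the function names are invented for this example.

```python
from itertools import combinations

# Hypothetical toy data: patient ID -> set of phenotype labels.
# Real input would be HPO term IDs derived from EHR records.
profiles = {
    "p1": {"fatigue", "dyspnea", "palpitations"},
    "p2": {"fatigue", "dyspnea"},
    "p3": {"fatigue", "dyspnea", "palpitations", "chest_pain"},
    "p4": {"cognitive_impairment", "headache"},
    "p5": {"cognitive_impairment", "headache", "insomnia"},
    "p6": {"cognitive_impairment", "headache", "insomnia", "fatigue"},
}

# Hypothetical pre-existing comorbidity flag per patient.
has_comorbidity = {"p1": True, "p2": True, "p3": False,
                   "p4": False, "p5": False, "p6": True}


def jaccard_distance(a, b):
    """Return 1 - |A ∩ B| / |A ∪ B|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_linkage_clusters(profiles, k):
    """Merge the two closest clusters (mean pairwise distance) until k remain."""
    clusters = [[pid] for pid in sorted(profiles)]

    def linkage(c1, c2):
        dists = [jaccard_distance(profiles[p], profiles[q])
                 for p in c1 for q in c2]
        return sum(dists) / len(dists)

    while len(clusters) > k:
        # combinations() yields index pairs with i < j, so deleting j is safe.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def contingency_table(clusters, flag):
    """One row per cluster: [count with flag, count without flag]."""
    return [[sum(flag[p] for p in c), sum(not flag[p] for p in c)]
            for c in clusters]


def chi_square_statistic(table):
    """Pearson chi-square statistic; assumes no row or column total is zero."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


clusters = average_linkage_clusters(profiles, k=2)
table = contingency_table(clusters, has_comorbidity)
stat = chi_square_statistic(table)
print(clusters, table, stat)
```

With cohort-scale data, the number of clusters would need to be chosen and validated rather than fixed in advance, and an exact test or regression model would likely be more appropriate than a chi-square statistic whenever expected cell counts are small.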