Building a data science workforce to improve the reproducibility of rehabilitation research

PROJECT SUMMARY

Data science methods provide an exciting opportunity to significantly improve the reproducibility and replicability of rehabilitation research. However, a key barrier to implementing data science methods in rehabilitation research is a lack of fundamental programming knowledge. In particular, many clinical and basic scientists have not received formal training in the programming skills required for data science. For this target audience, the most efficient way to get started in data science is to receive personalized guidance from an experienced mentor who can provide hands-on assistance and identify which skills are most beneficial for a specific research goal. However, many rehabilitation researchers do not have access to such mentors. The proposed program (Reproducible Rehabilitation, or ReproRehab) addresses this need by providing personalized, rehabilitation-specific, hands-on training in data science skills with direct, weekly support from experienced mentors. The ReproRehab program will blend hands-on assistance, personalized mentorship, and a uniquely rehabilitation-focused curation of online resources for self-guided learning, with three specific aims. Aim 1 is to build a national workforce of rehabilitation researchers equipped to apply data science skills to their own rehabilitation research. To accomplish this Aim, over the 5-year grant period, a total of 100 learners will undergo a personalized, 6-month program consisting of a 2-month TA-guided, hands-on bootcamp in which learners are assigned to small groups with similar research needs, followed by a 4-month self-guided learning segment to integrate the skills into their own research.
By the end of the program, learners will demonstrate that they have implemented this knowledge in their own research, including but not limited to sharing open-source rehabilitation datasets, sharing open-source analysis code or methods, and producing more rigorous research products. Aim 2 is to develop data science rehabilitation researchers who have the capacity to teach and train others. To accomplish this Aim, over the 5-year grant period, a total of 40 TAs will refine their teaching skills by administering bootcamps and providing hands-on training to learners. In addition, while the learners pursue self-guided learning, TAs will develop and host their own bootcamps in their rehabilitation communities, thereby fulfilling a train-the-trainer model for exponential growth and dissemination of data science skills. Aim 3 is to broadly disseminate knowledge by creating an online repository of curated, rehabilitation-specific data science resources. To accomplish this Aim, the leadership team will develop a public web database of existing online data science resources, including training materials, public data archives, and all course materials from this program, organized by specific rehabilitation research areas and needs. Successful accomplishment of our aims will increase capacity in data science across the rehabilitation research community and will improve the rigor, reproducibility, and replicability of research needed to move the field forward.