AIM-AHEAD AIHEC Data Science Training Institute (DSTI) Summer Program
Purpose
Proficiency in data science, artificial intelligence, and machine learning (AI/ML) is essential for advancing research and health interventions in American Indian/Alaska Native (AI/AN) communities. Using data science and AI/ML principles, researchers can investigate the impact of non-clinical health factors such as cardiovascular disease, diabetes, substance abuse, limited physical activity, and cancer on indigenous communities. To address this need, faculty at Tribal Colleges and Universities (TCUs) will identify how to start or expand data science capabilities and applications at their institutions by participating in the Summer Data Science Training Institute (DSTI). The institute provides comprehensive training in programming fundamentals and critical data analytics skills to support data science, AI/ML, and healthcare curricula at TCUs.
Participants
The Summer DSTI is designed to support the planning and execution of research involving the Behavioral Risk Factor Surveillance System (BRFSS) datasets, with a focus on health issues relevant to indigenous communities. During the 8-week program, top-tier AI/AN professionals will help TCU faculty develop a working knowledge of data science principles through practical hands-on labs in Python programming and predictive analytics for research. The labs focus on current AI/AN research probing non-clinical health factors, using BRFSS datasets combined with case studies in artificial intelligence and machine learning (AI/ML). Guided by AI/AN data scientists, faculty will explore advanced algorithmic techniques, advancements in technology, and computational tools to build data science programs at TCUs.
Summer DSTI Learning Objectives
- Master core concepts of Python programming: syntax, variables, data types, control structures, and writing simple scripts.
- Introduce data wrangling concepts such as cleaning, transforming, and reshaping data using Python libraries such as Pandas.
- Navigate datasets with a focus on creating meaningful visualizations to uncover non-clinical health factors.
- Introduce statistical inference and hypothesis testing, emphasizing their importance in health data analysis.
- Differentiate AI and ML and explain their significance in healthcare, including healthcare data types and the opportunities and challenges of deploying AI in healthcare. Apply basic AI/ML techniques to real-world healthcare datasets and scenarios.
- Calculate and interpret z-scores and t-statistics, with a focus on understanding and describing the relationship of effect size and statistical power to sample size. Analyze A/B tests, ensuring reliable conclusions from participants' analyses to support data-driven decision making. Conduct an A/B test on a real-world use case, covering all necessary processes from designing the experiment to interpreting the results.
- Apply supervised learning techniques, including linear regression, classification, model evaluation metrics, and foundational algorithms.
- Apply unsupervised learning techniques such as clustering and dimensionality reduction, and understand concepts like overfitting and model selection. Gain hands-on experience in building and evaluating models using tools like Scikit-Learn.
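To give a flavor of the z-score and A/B testing objectives above, the following is a minimal sketch in Python using only the standard library. It is not taken from the DSTI curriculum: the function names `z_score` and `two_proportion_z_test` are hypothetical, the pooled two-proportion z-test is just one common way to analyze an A/B test, and the counts are invented for illustration and are not BRFSS data.

```python
import math


def z_score(value, mean, std_dev):
    """Return how many standard deviations `value` lies from `mean`."""
    return (value - mean) / std_dev


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Pooled two-sided z-test for the difference between two proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis that both groups share one rate.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, invented for illustration only (not BRFSS data).
z, p_value = two_proportion_z_test(
    successes_a=120, n_a=1000, successes_b=150, n_b=1000
)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

With larger samples the standard error shrinks, so the same difference in proportions yields a larger z statistic. This is the link between sample size and statistical power named in the objective.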
Impact
Based on the training priorities identified in collaboration with TCU faculty, the virtual Summer DSTI will be offered as an 8-week training program in foundation-building data science principles, concepts, and skills. Trainees will meet once per week with the instructors and subject matter experts in data science and non-clinical health factors research. The Summer DSTI focuses on beginners; no prior programming experience is required. Designed for working TCU faculty, the Summer DSTI combines live synchronous sessions, self-paced videos, office hours to support learners with limited programming skills, and a concierge help desk. The program offers abundant hands-on exercises in the Codio environment, where participants without coding skills apply AI concepts. Beyond the 8-week Summer DSTI, TCU faculty will be encouraged and supported to participate in other AI/ML, data science, and health informatics trainings and fellowships, access resources available on AIM-AHEAD Connect, and engage in mentorship opportunities.
Program Inquiries
For questions related to the program, please contact the AIHEC Helpdesk.