Foundations of Biomedical Data Science and Machine Learning (Graduate Level)
The curriculum begins with Module 1: Hypothesis Testing, which lays the groundwork for statistical analysis in biomedical data. It starts with an introduction to Python, essential for the practical components of the course, followed by reviews of probability and statistics to refresh and solidify foundational knowledge. Students learn various hypothesis testing methods, including parametric and non-parametric statistics, the considerations for multiple comparisons, and resampling-based statistical techniques. Advanced methods in hypothesis testing are also covered, ensuring students are equipped with both traditional and modern statistical tools.

Module 2: Regression and Model Fitting delves deeper into the quantitative methods used to fit models to biomedical data. It covers likelihood functions, maximum likelihood solutions, basics of optimization, and linear and logistic regression. This module emphasizes the practical implementation of these methods using Python, exploring regularization techniques and Bayesian perspectives on regularization. Students also learn about model validation techniques such as AIC, generalizability, and cross-validation to evaluate the efficacy and robustness of their models.

Module 3: Classification and Clustering introduces students to key concepts and algorithms for categorizing and grouping data, essential for many biomedical applications such as gene expression profiling or patient stratification. This includes studying types of errors, ROC curves, linear discriminant analysis, Bayesian classifiers, and clustering methods like K-means and hierarchical clustering. Advanced methods in classification and clustering are also introduced to handle complex datasets typical in biomedical research.

Module 4: Dimensionality Reduction teaches techniques to reduce the number of random variables under consideration. It begins with a review of linear algebra concepts critical for understanding dimensionality reduction techniques like principal components analysis (PCA) and factor analysis. Students revisit regression and expectation maximization in the context of dimensionally reduced data, exploring how these techniques can enhance the interpretation and performance of machine learning models.

Finally, Module 5: Neural Networks covers the fundamentals and programming of neural networks, a crucial element in modern biomedical data analysis, especially in imaging and genetic data interpretation. The module covers gradient descent, optimization, backpropagation, and automatic differentiation. Students learn to implement neural networks using Python, focusing on understanding neural network loss functions and concepts from information theory.
Written for
- Institution: Boston University
- Course: ENG BE 559
Document information
- Uploaded on: June 6, 2024
- Number of pages: 72
- Written in: 2023/2024
- Type: Class notes
- Professor(s): Brian DePasquale
- Contains: All classes
Subjects
- machine learning
- regression
- lasso
- ridge
- neural networks
- perceptron
- regularization
- classification
- clustering
- model fitting
- gradient descent
- supervised learning
- unsupervised learning
- gaussi
- likelihood
- o