PhD+ Statistical Analysis in R (Pilot Offering in Fall 2022)

This series is developed in collaboration with UVA Health’s Health Sciences Library. Sessions meet weekly and combine lectures, hands-on exercises, and group discussions to help participants become familiar and comfortable with conducting statistical analyses in R. Upon successful completion of this series, PhD students are eligible for a non-credit credential on their academic transcript (attendance at a minimum of 4 of the 5 synchronous sessions is required).

In this series, participants will learn to conduct basic statistical analyses and interpret their output using the statistical programming language R. Using real research data from the life sciences, we will introduce the concept behind each technique, discuss its assumptions, learn what to do when those assumptions are not met, evaluate the model fit, interpret the output, and visualize the results. Participants from any discipline are welcome to register and attend. Although the examples are drawn from life sciences research, participants will be able to apply what they learn to data on any topic.

Participants should commit to attending at least the first three sessions, which lay the foundation for the later sessions. Sessions will be recorded to accommodate occasional absences. However, participants must be available to attend live sessions to qualify for the PhD+ non-credit credential.

Prerequisite

Participants MUST have a working knowledge of R and RStudio, preferably with previous experience using tidyverse packages (dplyr and ggplot2). The prerequisite for this series may be met through any of the following:

  • Attendance at the Health Sciences Library’s 4 workshops in R (offered monthly)
  • Participation in at least 3 sessions of the PhDPlus Data Literacy in R series (offered annually in the Spring semester)
  • Participation in the Brain Immunology and Glia Center Learn R series (offered annually in the Fall)
  • Completion of a curricular course using R (within the past 3 years)
  • Independent learning of R and instructor approval

Session Dates

Virtual format: Oct 19, Oct 26, Nov 2, Nov 9, and Nov 16 (Wednesdays), 10 am – 12 pm (ET)

* Registration closes on Oct 14, and instructors will evaluate prerequisites and make announcements by Oct 17.

Please hold these dates on your calendar after registering. If your availability changes, please email Dr. Yi Hao ([email protected]) immediately.

Topics 

  • Linear regression

In this session, learners will review fundamental tidyverse code so that everyone starts on the same page, able to conditionally summarize and visualize a dataset. Learners will then explore the foundational concepts of linear regression and become proficient in conducting, interpreting, and visualizing the results of single-variable linear regression models.

  • ANOVA

This session will extend learners’ knowledge of linear regression to cases with a single categorical independent variable. Learners will develop fluency in interpreting the output of one-way ANOVA models by visualizing the results, before learning about two-way ANOVAs with and without an interaction term.

  • Assumptions of linear regression

Now that learners have a handle on the basics of regression, this session will explore what to do when the assumptions of linear regression are not met. Options covered include adding polynomial terms, spline regression, and bootstrapped estimation of regression coefficients.

  • Logistic regression

This session will explore logistic regression models, which are useful when the outcome variable is binary rather than continuous. Learners will connect these models to the linear regressions they are already familiar with and will spend much of their time making sure they understand how to interpret the output.

  • Linear mixed effects models

In this final session, learners will improve their understanding of linear mixed effects models, which are useful when data are clustered or collected longitudinally. Learners will develop intuition about when these models may be useful and how they differ from linear regression models, and will gain experience visualizing the results and interpreting the output of these models.

Instructor

Marieke Jones, Research Data Specialist, UVA Health Sciences Library

Registration Link

* Registration closes on Oct 14, and instructors will evaluate prerequisites and make announcements by Oct 17.

 
