Workshops


Welcome! We run educational workshops focused primarily on statistical data analysis, with freely available materials (note: this page is still under construction):


Intermediate/Advanced Statistical Analysis Workshops

Ongoing workshops in advanced analysis topics, using R. For more information, contact Mike Rieger (mrieger @ wustl . edu). Topics include:

Hypothesis Testing: Linear models with interactions and post-hoc comparisons in R

– understanding contrast coding, the design matrix, and the output of lm()
– using the multcomp package to print summaries of post-hoc comparisons & adjusted significance
– interactions between continuous and categorical variables
– models with fixed & random effect terms
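To make the link between contrast coding, the design matrix, and fitted coefficients concrete, here is a minimal sketch. The workshop itself uses R and lm(); this Python version is only an illustration, and the data, group labels, and helper names (design_matrix, solve, ols) are made up for the example. With treatment (dummy) coding, the intercept is the reference-group mean and each remaining coefficient is a difference from that reference.

```python
def design_matrix(groups, levels):
    """Treatment (dummy) coding: an intercept column plus one
    indicator column for every level except the first (the reference)."""
    others = levels[1:]
    return [[1.0] + [1.0 if g == lvl else 0.0 for lvl in others] for g in groups]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(x, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(x, y)) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical data: three groups with means 2, 5, and 10.
groups = ["a", "a", "b", "b", "c", "c"]
y = [1.0, 3.0, 4.0, 6.0, 9.0, 11.0]

X = design_matrix(groups, ["a", "b", "c"])
beta = ols(X, y)
# beta[0]: mean of reference group "a"
# beta[1]: mean("b") - mean("a")
# beta[2]: mean("c") - mean("a")
print(beta)
```

Changing which level is listed first changes the reference group, and therefore the meaning of every coefficient, without changing the fitted values.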
In this 5-week session you will become comfortable with:
Working directly with the design matrix
Working directly with the variance-covariance matrix
Performing linear algebraic operations in R
Different approaches to multiple comparisons correction
Computing, interpreting, and reporting confidence intervals for regression coefficients & mean differences
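As a small illustration of how approaches to multiple comparisons correction differ, the sketch below implements two generic p-value adjustments, Bonferroni and Holm. These are not necessarily the corrections emphasized in the workshop, the p-values are made up, and the code is Python rather than the workshop's R (where the built-in p.adjust() function does the same job).

```python
def bonferroni(pvalues):
    """Multiply every p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def holm(pvalues):
    """Holm step-down adjustment: the k-th smallest p-value (k = 0, 1, ...)
    is multiplied by (m - k), and adjusted values are forced to be
    non-decreasing in the sorted order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * pvalues[i]))
        adjusted[i] = running_max
    return adjusted


# Hypothetical raw p-values from three post-hoc comparisons.
raw = [0.01, 0.04, 0.03]
p_bonf = bonferroni(raw)
p_holm = holm(raw)
print(p_bonf)
print(p_holm)
```

Holm never produces a larger adjusted p-value than Bonferroni for the same test, which is why it is often preferred when a simple, assumption-light correction is wanted.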

Data/Materials

Session Tutorial #1

Two way interactions 1.pdf

Simulation function for ESS score data: rgbeta.R

Simulated ESS Score data: ESSData.tab

multcomp package

broom package

car package

Additional examples for the multcomp package (super useful): https://cran.r-project.org/web/packages/multcomp/vignettes/multcomp-examples.pdf

Session Tutorial #2: Two Way Interactions with Categorical & Continuous Variables

Simulated Lever Press Data from Ozburn et al. 2012 Psychopharmacology: ozburn2012.tab

Discovery Approaches: Linear models with unknown numbers of predictors

– understanding predictors and multicollinearity of predictors
– model selection criteria: Bayesian probability and marginal likelihood approaches
– algorithms for model selection: enumeration & random walks
– dimensionality reduction in the predictor space: principal component analysis (PCA)
– dimensionality reduction in both the predictor & the response space: partial least squares regression (PLSR)
– summary statistics, what to report, what is the best model?
– prediction using Bayesian model averaging
In this 5-week session you will build on the previous session and become comfortable with:
Rotating and transforming matrices in R
Scaling approaches for variables
Simple introduction to Bayesian probability
Working with model likelihood
Criteria for model selection: R^2, Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), Bayes Factor
Scree plots, loading plots, scores plots to evaluate PCA and PLSR techniques
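To show how likelihood-based selection criteria trade fit against complexity, here is a minimal sketch that computes AIC and BIC for Gaussian linear models from the residual sum of squares (RSS). It assumes the usual maximum-likelihood variance estimate RSS/n and counts the error variance as one extra parameter. The two candidate models and their RSS values are invented for illustration, and the code is Python rather than the workshop's R.

```python
import math


def gaussian_log_likelihood(rss, n):
    """Maximized log-likelihood of a linear model with Gaussian errors,
    using the ML variance estimate sigma^2 = rss / n."""
    return -0.5 * n * (math.log(2.0 * math.pi) + math.log(rss / n) + 1.0)


def aic(rss, n, n_coef):
    """Akaike Information Criterion; k counts the coefficients plus the error variance."""
    k = n_coef + 1
    return 2.0 * k - 2.0 * gaussian_log_likelihood(rss, n)


def bic(rss, n, n_coef):
    """Bayesian Information Criterion; same k, but the penalty grows with log(n)."""
    k = n_coef + 1
    return k * math.log(n) - 2.0 * gaussian_log_likelihood(rss, n)


# Hypothetical comparison on n = 50 observations:
# model A has 2 coefficients, model B has 5 and fits slightly better.
n = 50
aic_a, bic_a = aic(120.0, n, 2), bic(120.0, n, 2)
aic_b, bic_b = aic(112.0, n, 5), bic(112.0, n, 5)

# Lower is better for both criteria.
print("AIC prefers", "A" if aic_a < aic_b else "B")
print("BIC prefers", "A" if bic_a < bic_b else "B")
```

In this made-up case the small drop in RSS does not pay for three extra coefficients, so both criteria favor the simpler model; BIC penalizes the extra terms more heavily than AIC whenever log(n) exceeds 2.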

Numerical Approaches: Empirical sampling methods

– basics of Monte Carlo sampling and computing confidence intervals for sample statistics
– Markov Chain Monte Carlo methods (MCMC)
– Monte Carlo methods for linear models
– MCMC methods for non-linear models
– Enzyme kinetics
– Exponential growth models
– Count data
– Survival curves
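As one example of an empirical sampling method for confidence intervals, the sketch below computes a percentile bootstrap interval for a sample statistic. This is a generic technique chosen for illustration, not necessarily the exact procedure taught in the workshop; the data values, the bootstrap_ci helper, and its defaults are hypothetical, and the code is Python rather than the workshop's R.

```python
import random
import statistics


def bootstrap_ci(data, stat, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for stat(data).

    Resamples the data with replacement, recomputes the statistic each
    time, and returns the empirical quantiles of those estimates."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(data)
    estimates = sorted(
        stat([rng.choice(data) for _ in range(n)]) for _ in range(n_resamples)
    )
    alpha = (1.0 - level) / 2.0
    lower = estimates[int(alpha * n_resamples)]
    upper = estimates[int((1.0 - alpha) * n_resamples) - 1]
    return lower, upper


# Hypothetical sample of ten measurements.
data = [2.1, 3.4, 1.9, 5.6, 4.4, 3.8, 2.7, 4.9, 3.1, 4.0]

mean_ci = bootstrap_ci(data, statistics.mean)
median_ci = bootstrap_ci(data, statistics.median)
print("95% CI for the mean:  ", mean_ci)
print("95% CI for the median:", median_ci)
```

The same function works for any statistic that can be computed from a list, which is the main appeal of resampling when no closed-form standard error is available.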

 


uSTAR R Workshops

Introductory data analysis workshops using R for uSTAR scholars. uSTAR is a scholarship program at Washington University in St. Louis for undergraduate students in STEM fields who come from historically underrepresented backgrounds (click link to read more). Currently this is a two-part series (11/30/2016 and 12/7/2016), but it may continue in the future depending on program interest. We have had numerous uSTAR scholars in our lab over the years, so if you are interested in our work, feel free to contact us.

Datasets for 11/30/2016

NOAA.gov GSOD

Global Summary of Day data were downloaded and processed using these scripts: download.all.yearsprocess.and.format.gsod.R, and the full FTP directory listing for GSOD data at NOAA.gov. We will use base R functions to learn the basics of variable assignment and operations and linear algebra, and to do basic data exploration and plotting.

Global Summary of Day Weather Station Data (NCEI Climate Data Server)

200460-99999.output.tab Polargmo Krenkelja (Arctic)
213580-99999.output.tab Zohova Island (Arctic)
361260-99999.output.tab Bajanaul Station (Bayanaul, Kazakhstan)
431850-99999.output.tab Machilipatnam (India)
610240-99999.output.tab Agadez (Niger)
619020-99999.output.tab Ascension Station (Southern Atlantic Ocean)
893240-99999.output.tab Byrd Station (Antarctica)
949960-99999.output.tab Norfolk Island (South Pacific)

To download, right-click and choose Save Link As… Or download all at once with this zip file.

 

Datasets for 12/7/2016

Downloading Genomic Sequence Data using the BSgenome R package

 
