This course was originally written to teach researchers to analyse data using Stata, which was the analysis program of choice in the Centre for Epidemiology for many years. We are now encouraging the use of R within the Centre in parallel with Stata, and David Selby has kindly rewritten all of the Stata practicals so that they can be run in R. This is still a work in progress: not all Stata concepts map cleanly onto R, so both the Stata and R versions will need some editing before the practicals are both meaningful and identical in the two languages.

If you are following the course remotely, you may want to use the University's Remote Access PCs to run Stata. There are instructions for doing this here, although they will only work if you have a valid IT account with the University of Manchester.

Session Number | Title | Content | Lecture Slides | Handout | Solution | Solution Do-file | R Version of Practical | Solution to R Practical | Datafiles | Further Reading
1 Introduction to Stata
  • Stata's windows
  • Command Syntax
Lecture Slides | Handout & Practical | Solution | Do file to produce solution | Solution Using R
2 Summarising Data
  • Types of data
  • Graphical Summaries
  • Numerical Summaries
Lecture Slides | Handout | Practical | Solution | Do file to produce solution | Practical Using R | Solution Using R
Datafiles: Height and Weight Dataset
3 Sampling and Confidence Intervals
  • Types of sampling
  • Estimation from random samples
  • Sampling Error
  • Reference Ranges
  • Confidence Intervals
  • Sample Size
Lecture Slides | Handout | Practical | Solution | Practical Using R | Solution Using R
4 Hypothesis Testing and Power
  • Hypothesis Tests
  • Power calculations
Lecture Slides | Handout | Practical | Solution | Do file to produce solution | Practical Using R | Solution Using R
Datafiles: Height and Weight Dataset
5 Linear Models 1
  • Assumptions
  • Interpretation
  • Inference
  • Goodness of Fit
  • Diagnostics
Lecture Slides | Handout & Practical | Solution | Do file to produce solution | Practical Using R | Solution Using R
Datafiles: Anscombe's Data, constvar.dta, Housing Data, Lifeline Data, Wood's '73 Data
6 Linear Models 2
  • Categorical variables
  • Interactions
  • Confounding
  • Variable Selection
Lecture Slides | Handout & Practical | Solution | Do file to produce solution | Practical Using R | Solution Using R
Datafiles: Cadmium Data, Growth Data, Hald Data, Soap Data
7 Modelling Binary Outcomes
  • Limits of Linear Regression
  • Generalised linear models
  • Logistic Regression
  • Logistic Regression Diagnostics
  • Sensitivity and specificity
  • Alternative Models
Lecture Slides | Handout & Practical | Solution | Do file to produce solution | Practical Using R | Solution Using R
Datafiles: CHD dataset, Pain dataset
8 Modelling Categorical Outcomes
  • Nominal Outcomes
    • Multinomial Regression
      • Lincom
  • Ordinal Outcomes
Lecture Slides | Handout | Practical | Solution | Do file to produce solution | Practical Using R | Solution Using R
Datafiles: Alligators Data, Housing Data, Politics Data
9 Modelling Counts
  • Poisson Regression
  • Constraints
  • Overdispersion
  • Negative Binomial Regression
Lecture Slides | Handout | Practical | Solution | Do file to produce solution | Practical Using R | Solution Using R
Datafiles: Alligators Data, Negative Binomial Regression Data, Ships Data
10 Survival Analysis
  • Censoring
  • Survival Curves and Life Tables
  • Comparing Survival Curves
  • Parametric Regression
  • Cox Regression
Lecture Slides | Handout | Practical | Solution | Do file to produce solution | Practical Using R | Solution Using R
Datafiles: Leukaemia Data
11 Refinements of the Stata Language
  • Graphics
  • Summarising Data
  • More Syntax
  • Looping
  • Reshaping data
Lecture Slides | Handout & Practical | Solution | Do file to produce solution | Practical Using R | Solution Using R
Exam 2008
Exam 2009
Exam 2010
Exam 2011