August 19, 2020

STAT 371: Assignment 3

Total: 20 marks. Due: Thursday 20 August at 5:00PM.

Instructions: the assignment is to be completed in LaTeX (ideally) or another word processor, with thought given to presentation as well as correctness, per the course information. The use of knitr is recommended but not essential.

Question 1 (6 marks)

Write your own Metropolis-Hastings algorithm to obtain posterior draws for the model fitted in Question 2 of Assignment 2. You should ensure that you standardize the width predictor (we didn’t do this in Assignment 2). You should assume that β0 and β1 have normal priors with mean 0 and large standard deviation (small precision). Use a random walk normal proposal with tuning parameters κ0 = κ1 = 0.1 for β0 and β1, respectively.

Hint: Write a function for the joint data model (likelihood).

Question 2 (10 marks)

The Mega Corporation Group, Ltd (or Megacorp) manages M = 12 hospitals. At each hospital, they perform a surgery that has a (fairly) high rate of complications (i.e. the proportion of surgeries where a patient exhibits an unwanted and potentially avoidable side effect such as infection). However, the number of surgeries performed at each hospital varies considerably, from n = 7 to n = 178. Megacorp wishes to estimate an overall complication rate across all of their hospitals, while still allowing for hospital-to-hospital variation in complication rate.

Let µ be the overall complication rate across all of their hospitals, σ be the standard deviation between hospitals, and let p_i be the complication rate at the ith hospital. Also, let p_new be the predicted rate at a new hospital they are currently building. They wish to model their data as follows:

Y_i ∼ Binomial(n_i, p_i),
p_i ∼ Beta(mean = µ, variance = σ²),
µ ∼ Beta(α = 0.5, β = 0.5),
σ² ∼ Uniform(0, µ(1 − µ)).

For p_i, we use the alternative parameterisation for the Beta distribution based on its mean and variance. Relative to the standard Beta(α, β) parameterisation, α = µν and β = (1 − µ)ν, where ν = µ(1 − µ)/σ² − 1.

Data are in megacorp.txt, with the number of surgeries performed (n) and the number of complications (y, stored as comp) at each hospital.

Note: a more traditional (and more flexible) analysis of these data would be to use a generalised linear mixed model, e.g. based on logit(p_i) ∼ Normal(µ, σ²). The reason for modelling the data as above is to develop skills working with alternative parameterisations and conditional priors.

(a) (3 marks) Implement the model in JAGS.
(b) (2 marks) Provide relevant output and convergence diagnostics.
(c) (1 mark) Compare the estimated complication rate for each hospital to its maximum likelihood estimate (MLE), p̂_MLE = y/n. What do you notice?
(d) (1 mark) Does it seem that any of the hospitals perform much better or worse than others?
(e) (3 marks) Describe the overall complication rate, level of hospital-to-hospital variation, and predicted complication rate for the new hospital. Use language aimed at an executive of Megacorp who has little knowledge of statistics.

Question 3 (4 marks)

The final question is about presentation. For this assignment, 4 marks will be awarded for editing and presenting results in a clear manner. Specific details to consider:

• Are you presenting too much or too little R code and output? E.g., the use of defaults in knitr often leads to overly verbose output. Use code chunk options like include=FALSE, eval=FALSE, echo=FALSE, warning=FALSE, message=FALSE to reduce unwanted output. Alternatively, consider use of the verbatim environment or other word-processing software.
• Have you used text to clearly summarise your findings? Or does the reader need to hunt through pages of code to find the results of interest?
• Did you include your name or ID within the file and not just as part of the filename? Did you set your filename to the requested Lastname A3.pdf?
• Did you read through your assignment, consider how to present it better, and then make edits accordingly?
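
A note on the mean and variance parameterisation of the Beta distribution in Question 2: the short derivation below is a sketch, not part of the required answer. It checks that the stated α, β and ν recover mean µ and variance σ². It uses only the standard Beta(α, β) moment formulas and the definitions given in the question, and introduces no new symbols.

```latex
\begin{align*}
\alpha + \beta &= \mu\nu + (1-\mu)\nu = \nu, \\
\mathrm{E}[p_i] &= \frac{\alpha}{\alpha+\beta} = \frac{\mu\nu}{\nu} = \mu, \\
\mathrm{Var}[p_i] &= \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}
  = \frac{\mu(1-\mu)\nu^2}{\nu^2(\nu+1)}
  = \frac{\mu(1-\mu)}{\nu+1} = \sigma^2,
\end{align*}
% The last equality uses \nu + 1 = \mu(1-\mu)/\sigma^2, from the definition of \nu.
```

Because α and β must both be positive, ν must be positive, which holds exactly when σ² < µ(1 − µ). This is consistent with the upper limit of the Uniform prior on σ².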