- Count data
- Poisson model
- Over-dispersion
- Scaling
- Mixture models
- Multi-level models

- I use R mostly, so most examples will be in R!

.pull-left[ ]

.pull-right[
**Want to predict Y, using X:**
\[ y = \alpha + \beta x + \varepsilon\]
+ \(\alpha\) = Intercept
+ \(\beta\) = Coefficient (how much x affects y)
+ \(\varepsilon\) = Error (Residual)

- This is a linear regression, which has an exact (closed-form) solution. The models discussed in the following slides are more complicated: they have no exact solution and must be estimated iteratively. ]
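A minimal sketch of the "exact solution" point, using simulated data (the true values 2 and 0.5 are assumptions chosen for illustration). It compares `lm()` with the normal equations \((X^\top X)^{-1} X^\top y\):

```r
# Simulated data: intercept 2 and slope 0.5 are illustrative assumptions
set.seed(1)
x <- runif(100, 0, 10)
y <- 2 + 0.5 * x + rnorm(100, sd = 1)

# Fit with lm()
fit <- lm(y ~ x)

# Closed-form least-squares solution via the normal equations
X <- cbind(1, x)
beta_hat <- solve(t(X) %*% X, t(X) %*% y)

coef(fit)       # intercept and slope from lm()
drop(beta_hat)  # the same values from the normal equations
```

Both routes should return the same intercept and slope, up to numerical precision.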

- Discrete
- E.g. 10 or 11 patients, but not 10.5

- Range from zero to infinity
- Usually can't have a negative count

- Occur in a fixed time period, with a known average rate.
- Not normally distributed

- Poisson dist. assumes Mean = variance
- If variance > mean, the Poisson model will underestimate the variance:
- SE & CIs too small, 'Significance' overstated
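One common way to check the mean = variance assumption is the Pearson dispersion statistic, which should be close to 1 for a well-behaved Poisson model. The sketch below uses simulated negative binomial counts (all parameter values are assumptions for illustration) so that the variance exceeds the mean:

```r
set.seed(42)
n  <- 500
x  <- rnorm(n)
mu <- exp(1 + 0.3 * x)

# Overdispersed counts: variance = mu + mu^2 / size, which is larger than mu
y <- rnbinom(n, mu = mu, size = 1.5)

fit <- glm(y ~ x, family = poisson)

# Pearson chi-squared divided by residual degrees of freedom
dispersion <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
dispersion  # values well above 1 suggest over-dispersion
```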

- Many mechanisms, including:
- Mis-specification (lack of predictors, poorly parameterised)
- Presence of outliers
- Variation between response probabilities

- Fit simple model and ignore OD
- Fit model, then use techniques to scale/estimate OD and correct
- Robust SE or Bootstrap

- Use a model that accounts for this:
- Scaled Poisson or related
- Complex variance structure
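As a sketch of the "scaled Poisson" option, base R's `quasipoisson` family estimates a dispersion parameter and inflates the standard errors by its square root, leaving the coefficients unchanged. The data are simulated and the parameter values are assumptions:

```r
set.seed(42)
n <- 500
x <- rnorm(n)
y <- rnbinom(n, mu = exp(1 + 0.3 * x), size = 1.5)

fit_pois  <- glm(y ~ x, family = poisson)
fit_quasi <- glm(y ~ x, family = quasipoisson)

# Estimated dispersion (scale) parameter
phi <- summary(fit_quasi)$dispersion

se_pois  <- coef(summary(fit_pois))[, "Std. Error"]
se_quasi <- coef(summary(fit_quasi))[, "Std. Error"]

# Same coefficients; standard errors scaled by sqrt(phi)
cbind(se_pois, se_quasi, ratio = se_quasi / se_pois, sqrt_phi = sqrt(phi))
```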

- Resampling with replacement (Efron, 1979)
- Create sampling distribution of mean
- Handy, because this is approximately normally distributed (if parametric bootstrap)
- R: 'boot' package, or 'car::Boot', which is a convenience wrapper for glm.
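A minimal sketch of bootstrapping a Poisson GLM with the 'boot' package. Note the assumptions: this is a non-parametric (case-resampling) bootstrap, the data are simulated, and `coef_fun` is a hypothetical helper name:

```r
library(boot)

set.seed(123)
n <- 200
d <- data.frame(x = rnorm(n))
d$y <- rnbinom(n, mu = exp(1 + 0.3 * d$x), size = 1.5)

# Statistic: refit the model on each resample and return its coefficients
coef_fun <- function(data, idx) {
  coef(glm(y ~ x, family = poisson, data = data[idx, ]))
}

b <- boot(d, statistic = coef_fun, R = 500)

# Bootstrap standard errors next to the model-based ones
boot_se  <- apply(b$t, 2, sd)
model_se <- coef(summary(glm(y ~ x, family = poisson, data = d)))[, "Std. Error"]
rbind(boot_se, model_se)

# Percentile confidence interval for the slope (second coefficient)
boot.ci(b, type = "perc", index = 2)
```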

- Funnel control limits at 2 & 3 \(\sigma\) in left panel, inflated by additive scale factor \(\tau^2\) in right panel

- Two distributions used
- 'Between' & 'Within' variance
- Commonly Negative Binomial (NB1)
- NB1 group-specific mean, multiplicative
- NB2 gamma/Poisson, quadratic

- They weight the mean differently: NB2 gives higher weight to smaller counts
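A sketch of fitting a negative binomial model in R, assuming `MASS::glm.nb`, which fits the quadratic (NB2) form with variance \(\mu + \mu^2/\theta\). The simulated data and parameter values are illustrative assumptions:

```r
library(MASS)

set.seed(42)
n <- 500
x <- rnorm(n)
y <- rnbinom(n, mu = exp(1 + 0.3 * x), size = 1.5)

fit_pois <- glm(y ~ x, family = poisson)
fit_nb   <- glm.nb(y ~ x)  # NB2: variance = mu + mu^2 / theta

fit_nb$theta           # estimated theta (data were simulated with size = 1.5)
AIC(fit_pois, fit_nb)  # lower AIC indicates the better-fitting model
```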

- Sometimes a model structure has implicit levels
- Variance can be partitioned between levels:
- E.g. patients within GP practices submitting to a trial
- Patients followed up at several points over time

- Breaks the normal regression assumptions of 'independence' and 'homoscedasticity'
- Can lead to OD
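To illustrate how ignored clustering can show up as over-dispersion, the sketch below simulates patients within practices, with a practice-level random intercept (the standard deviation of 0.5 and all other values are assumptions), and then fits a single-level Poisson model:

```r
set.seed(7)
n_practice <- 40
n_per      <- 25
practice   <- factor(rep(seq_len(n_practice), each = n_per))

# Practice-level random intercepts (sd = 0.5 is an assumed value)
u <- rnorm(n_practice, sd = 0.5)
x <- rnorm(n_practice * n_per)
y <- rpois(length(x), lambda = exp(1 + 0.3 * x + u[as.integer(practice)]))

# Single-level Poisson model that ignores the practice level
fit <- glm(y ~ x, family = poisson)

dispersion <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
dispersion  # unmodelled between-practice variance inflates this above 1
```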

- Principal Components / Factor Analysis
- Generalised Additive Models
- Tree-based models / Random Forests

- Smooth functions of variables:
- Lots of options for smoothers:
- Cubic Spline
- Thin-plate Splines
- Tensor products

- Need to estimate degree of smoothness and penalty term

- In R, most popular package mgcv (Wood, 2017)
- mgcv::gam(y ~ s(x, bs="cr"))
- s() is a smoother construct
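A minimal sketch of the call above on simulated count data. The sine-shaped relationship, the Poisson family, and `method = "REML"` are assumptions chosen for illustration:

```r
library(mgcv)

set.seed(1)
n <- 400
x <- runif(n, 0, 1)
y <- rpois(n, lambda = exp(1 + sin(2 * pi * x)))

# Cubic regression spline smoother; smoothness is estimated by REML
fit <- gam(y ~ s(x, bs = "cr"), family = poisson, method = "REML")

summary(fit)  # reports the effective degrees of freedom (edf) of s(x)
plot(fit)     # estimated smooth on the link (log) scale
```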

- Estimates smooth parameters as part of model
- Parametric terms can still be used
- Random Effects can be included if simple
- Fitted with gamm4 or gamm (using lme4 or nlme, respectively)
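A sketch combining a smooth term, a parametric term, and a simple random intercept, assuming `mgcv::gamm` (which uses nlme). The data are simulated, and the variable names `practice`, `x`, and `z` are hypothetical:

```r
library(mgcv)

set.seed(2)
n_practice <- 30
n_per      <- 20
d <- data.frame(
  practice = factor(rep(seq_len(n_practice), each = n_per)),
  x        = runif(n_practice * n_per),
  z        = rbinom(n_practice * n_per, 1, 0.5)
)
u   <- rnorm(n_practice, sd = 0.4)  # assumed random-intercept sd
d$y <- rpois(nrow(d),
             exp(1 + sin(2 * pi * d$x) + 0.3 * d$z + u[as.integer(d$practice)]))

# Smooth of x, parametric effect of z, random intercept per practice
fit <- gamm(y ~ s(x, bs = "cr") + z,
            random = list(practice = ~1),
            family = poisson, data = d, verbosePQL = FALSE)

summary(fit$gam)  # smooth and parametric terms
fit$lme           # random-effect variance components
```

For non-Gaussian families `gamm` estimates by penalised quasi-likelihood, so likelihood-based comparisons between such fits should be treated with caution.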

- Smooth functions often represent data better than raw values
- Requires choosing a smoother; R can estimate parameters including knots and penalty
- Can reduce overdispersion due to noise
- Can be heavy on degrees of freedom
- Need to be careful of over-fitting
- More complex regression mechanisms

- Combine tree-based methods with Bootstrapping
- Random sample of both data & parameters
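The idea can be sketched by hand with 'rpart' trees. This is a simplified illustration, not a full random forest: it draws one random subset of predictors per tree, whereas random forest implementations typically sample candidate predictors at each split. The data, the number of trees, and the subset size are all assumptions:

```r
library(rpart)

set.seed(3)
n <- 300
d <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n), x4 = runif(n))
d$y <- rpois(n, lambda = exp(0.5 + 1.5 * d$x1 + sin(2 * pi * d$x2)))

predictors <- c("x1", "x2", "x3", "x4")
n_trees    <- 100

# Each tree sees a bootstrap sample of rows and a random subset of predictors
trees <- lapply(seq_len(n_trees), function(i) {
  rows <- sample(n, replace = TRUE)
  vars <- sample(predictors, 2)
  rpart(reformulate(vars, response = "y"), data = d[rows, ], method = "anova")
})

# Ensemble prediction: average the predictions of all trees
pred <- rowMeans(sapply(trees, predict, newdata = d))
cor(pred, d$y)  # in-sample agreement between predictions and observed counts
```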

- Pros and cons:
- Predict very well
- Less likely to over-fit
- Linearity or distribution not really an issue

- Hard to visualise/understand
- Not able to use random effects

- Definite clustering:
- Random-intercepts

- Collinearity / Noisy data
- Generalised Additive Models

- Marginal model: use additive-OD model

- GOLDSTEIN, H. 2010. *Multilevel Statistical Models,* John Wiley & Sons Inc.
- GREVEN, S. & KNEIB, T. 2010. On the behaviour of marginal and conditional AIC in linear mixed models. *Biometrika,* **97**, 773-789.
- HASTIE, T. & TIBSHIRANI, R. 1986. Generalized Additive Models. *Statist. Sci.,* **1**(3), 297-310. doi:10.1214/ss/1177013604.
- HUBER, P. J. 1967. The behavior of maximum likelihood estimates under nonstandard conditions. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Statistics, Berkeley, Calif.: University of California Press, 221-233.
- MCCULLAGH, P. & NELDER, J. A. 1983. *Generalized linear models,* London, Chapman & Hall.
- NELDER, J. A. & WEDDERBURN, R. W. M. 1972. Generalized Linear Models. *Journal of the Royal Statistical Society. Series A (General),* **135**, 370-384.
- RABE-HESKETH, S. & SKRONDAL, A. 2012. *Multilevel and Longitudinal Modeling Using Stata, Volumes I and II,* 3rd ed., Taylor & Francis.
- SPIEGELHALTER, D. J. 2005a. Funnel plots for comparing institutional performance. *Stat Med,* **24**(8), 1185-1202.
- SPIEGELHALTER, D. J. 2005b. Handling over-dispersion of performance indicators. *Qual Saf Health Care,* **14**(5), 347-351.
- VER HOEF, J. M. & BOVENG, P. L. 2007. Quasi-Poisson vs. negative binomial regression: how should we model overdispersed count data? *Ecology,* **88**, 2766-72.
- WOOD, S. N. 2017. *Generalized Additive Models: An Introduction with R, Second Edition,* Florida, USA, CRC Press.