27 August, 2015

WHIP Stabilization by the Gamma-Poisson Model

I've previously covered shrinkage estimation for offensive statistics - or at least, those that can be written as binomial events. In a previous post, I showed that for models that follow the natural exponential family with quadratic variance function, the split-half correlation is equal to one minus the shrinkage coefficient $B$.

The techniques I used can also be used when the outcome at the most basic level (at-bat, inning pitched, etc.) is not just a binary outcome. In particular, the Poisson distribution also fits within the framework I derived, as it is a member of the natural exponential family with quadratic variance function, and so events that can be modeled as Poisson at the base level will follow the same basic principles I used for the binomial outcomes. I chose the statistic WHIP (walks + hits per inning pitched) to illustrate this method, as it is a counting statistic that is a non-binary event (i.e., you can have 0, 1, 2, ... walks + hits in a given inning), so it fits the support of the Poisson.

Model and Estimation


I will assume that in each inning, pitcher $i$ gives up a number of walks + hits that follows a Poisson model with mean $\theta_i$, which is unique to each pitcher. The sum number of walks + hits given up in $n_i$ innings is $x_i$, and I have $N$ pitchers total. I considered only starting pitchers from 2009-2014, and split years between the same pitcher. My code and data are on github for anybody who wants to check my calculations.

Since the sum of Poissons is Poisson, the sum number of walks + hits $x_i$ given up in $n_i$ innings follows a Poisson distribution with mean $n_i \theta_i$ and mass function

$p(x_i | \theta_i, n_i) =  \dfrac{e^{-n_i \theta_i} (n_i \theta_i)^{x_i}}{x_i !}$

I will also assume that the distribution of means $\theta_i$ follows a gamma distribution with mean $\mu$ and variance parameter $K$ (in this parametrization, $\mu = \alpha/\beta$ and $K = \beta$ as opposed to the traditional $\alpha, \beta$ parametrization). This distribution has density

$f(\theta_i | \mu, K) = \dfrac{K^{\mu K}}{\Gamma(\mu K)} \theta_i^{\mu K - 1} e^{-K \theta_i}$

As shown in a previous post, the split-half correlation is then one minus the shrinkage coefficient $B$, or

$\rho = 1 - B = \left(\dfrac{n_i}{n_i + K}\right)$

So once I have an estimate $\hat{K}$ and a desired stabilization level $p$, solving for $n$ gives

$\hat{n} = \left(\dfrac{p}{1-p}\right) \hat{K}$

Once again, the population variance parameter $K$ is equivalent to the 0.5 stabilization point - the point where the split half correlation should be exactly equal to 0.5, and also the point where the individual pitcher estimates are shrunk 50% of the way towards the mean.

For estimation of $mu$ and $K$, I used marginal maximum likelihood - a one dimensional introduction to maximum likelihood is given here. The marginal density of $\mu$ and $K$ is

$p(x_i | n_i, \mu, K) = \displaystyle \int_0^{\infty} \dfrac{K^{\mu K}n_i^{x_i}}{\Gamma(\mu K) x_i !} e^{-\theta_i (n_i + K)}  \theta_i^{x_i + \mu K - 1} d\theta_i = \dfrac{K^{\mu K}n_i^{x_i}}{\Gamma(\mu K) x_i !} \dfrac{\Gamma(x_i + \mu K)}{(n_i + K)^{x_i + \mu K}}$

And the log-likelihood (dropping terms that do not involve either $\mu$ or $K$) is given by

$\ell(\mu, K) =  N \mu K \log(K) - N \log(\Gamma(\mu K)) + \displaystyle \sum_{i = 1}^N \left[\log(\Gamma(x_i + \mu K)) - (x_i + \mu K) \log(n_i + K)\right]$

Once again, I wrote code to maximize this function in $R$ using a Newton-Raphson algorithm. I converted $K$ to $\phi = 1/(1 + K)$ in the equation above for estimation and then converted it back by $K = (1-\phi)/\phi$ after estimation was complete - the reason being that it makes the estimation procedure much more stable.

In performing this estimation, I had to make a choice of the minimum number of innings pitched (IP) in order to be included in the dataset. When performing a similar analysis for on-base percentage, I found that at around 300 PA, the population variance (and hence, the stabilization point) became roughly constant. Unfortunately, this is not true for starting pitchers.



The population variance in talent levels decreases consistently as a function of the minimum number of IP that are considered, and so the stabilization point $K$ increases. This means that, unlike OBP, for example, the stabilization point is always determined by what percentage of pitchers you look at (by IP) - if you look at only the top 50%, the stabilization point will be larger than the stabilization point for the top 70%.

This is reflected in the plot below - as with OBP and PA, the mean WHIP is associated with the number of IP,  but unlike with OBP, the variance around the mean is constantly changing with the mean.



For my calculation, I chose to use 80 innings pitched as my cutoff point - corresponding to approximately 15 games started and capturing slightly more than 50% of pitchers (by IP). This point was completely arbitrary, though, and other cutoffs will be equally valid depending on the question at hand.

Performing the estimation, the estimated league mean WHIP was $\hat{\mu} = 1.304$ with variance parameter $\hat{K} = 77.203$.



Once again,  95% confidence intervals for a specific stabilization level p are given as

$\left(\dfrac{p}{1-p}\right) \hat{K} \pm 1.96 \left(\dfrac{p}{1-p}\right) \sqrt{Var(\hat{k})}$

From (delta-method transformed) maximum likelihood output, $Var(\hat{K}) = 29.791$ (for a standard error of $5.459$ IP). The stabilization curve, with confidence bounds, is then


Aside from the model criticisms I've already mentioned, standard ones apply - innings pitched are not identical and independent (and treating them as so is clearly much worse than treating plate appearances as identical and independent), pitchers are not machines, etc. I don't think the model is great, but it is useful. It gives confidence bounds for the stabilization point something other methods don't do. As always, comments are appreciated.

No comments:

Post a Comment