Probabilaball: The Delta Method for a Confidence Interval for Odds

In my previous post, I discussed using Wald theory and maximum likelihood to get a confidence interval for a batting average, $\theta$. What if I want a function of that parameter instead?

Odds

Let's say that instead of a batting average $\theta$, I want the odds of getting a hit. To get the odds of a hit, apply the function

$g(\theta) = \dfrac{\theta}{1-\theta}$

So, for example, a batter with a true $\theta = 0.250$ will have odds $g(0.250) = 0.250/0.750 = 1/3$, or one-to-three odds of getting a hit.
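As a small illustration, the odds function can be written in a few lines of Python. The function name `odds` and the range check are my own choices for this sketch, not something from the original analysis.

```python
def odds(theta):
    """Convert a probability of a hit into the odds of a hit."""
    # The odds are undefined at theta = 1 (division by zero), so reject it.
    if not 0 <= theta < 1:
        raise ValueError("theta must satisfy 0 <= theta < 1")
    return theta / (1 - theta)


print(odds(0.250))  # roughly 0.333, i.e. one-to-three odds
```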

Delta Method

Suppose we have some estimator $\hat{\theta}$ that converges to a normal distribution with variance $\sigma^2$ - that is,

$\hat{\theta} \rightarrow N(\theta, \sigma^2)$

For example, assuming independent and identical at-bats, the sample batting average is approximately normal in large samples:

$\hat{\theta} \rightarrow N\left(\theta, \dfrac{\theta(1-\theta)}{n}\right)$

Then statistical theory says that any function $g(\hat{\theta})$, assuming the first derivative exists and is nonzero at $\theta$, has approximate distribution

$g(\hat{\theta}) \rightarrow N(g(\theta), [g'(\theta)]^2 \sigma^2)$

This gives us a handy way to calculate confidence intervals for functions of parameters, if we can calculate a confidence interval for the parameter itself.
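To make the recipe concrete, here is a minimal Python sketch of a generic delta-method standard error. It assumes you already have an estimate and its standard error, and it approximates $g'$ numerically with a central difference rather than deriving it by hand. The helper name `delta_method_se`, the step size `h`, and the example inputs (a 0.250 average in 400 at-bats) are all illustrative assumptions.

```python
import math


def delta_method_se(g, theta_hat, se_theta, h=1e-6):
    """Approximate standard error of g(theta_hat) via the delta method."""
    # Central-difference approximation to g'(theta_hat); h is an assumed step size.
    g_prime = (g(theta_hat + h) - g(theta_hat - h)) / (2 * h)
    # Delta method: SE[g(theta_hat)] is approximately |g'(theta_hat)| * SE[theta_hat].
    return abs(g_prime) * se_theta


def odds(theta):
    return theta / (1 - theta)


# Illustrative inputs: a 0.250 sample batting average in 400 at-bats.
theta_hat, n = 0.250, 400
se_theta = math.sqrt(theta_hat * (1 - theta_hat) / n)
se_odds = delta_method_se(odds, theta_hat, se_theta)
print(round(se_odds, 4))  # about 0.0385
```

A numerical derivative is convenient when $g$ is awkward to differentiate by hand; when a closed form is available, plugging it in directly avoids the extra approximation.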

Back to Odds of Getting a Hit

If we define the odds function as above, then the first derivative is given by

$g'(\theta) = \dfrac{1}{(1-\theta)^2}$

and so the distribution of the sample batting odds $g(\hat{\theta})$ converges to a normal distribution with mean $g(\theta)$ and variance

$[g'(\theta)]^2 \sigma^2 = \left[\dfrac{1}{(1-\theta)^2}\right]^2\left[\dfrac{\theta(1-\theta)}{n}\right] = \dfrac{\theta}{n(1-\theta)^3}$
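For anyone who wants the intermediate steps: the derivative of the odds function follows from the quotient rule,

$g'(\theta) = \dfrac{(1)(1-\theta) - \theta(-1)}{(1-\theta)^2} = \dfrac{1 - \theta + \theta}{(1-\theta)^2} = \dfrac{1}{(1-\theta)^2}$

and the variance simplifies by cancelling one factor of $(1-\theta)$,

$\dfrac{1}{(1-\theta)^4} \cdot \dfrac{\theta(1-\theta)}{n} = \dfrac{\theta}{n(1-\theta)^3}$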

And so a confidence interval for the odds of a hit, given the sample batting average $\hat{\theta}$, is given by

$\left(\dfrac{\hat{\theta}}{1-\hat{\theta}}\right) \pm z^* \sqrt{\dfrac{\hat{\theta}}{n(1-\hat{\theta})^3}}$

where $z^*$ is an appropriate quantile from the normal distribution.

Let's take our batter above, and suppose a $\hat{\theta} = 0.250$ batting average in $n = 40$ at-bats. Then a 95% confidence interval for the odds of getting a hit is given by

$\left(\dfrac{0.250}{1-0.250}\right) \pm 1.96 \sqrt{\dfrac{0.250}{40(1 - 0.250)^3}} = (0.095, 0.572)$

A fairly wide interval - but then again, $n = 40$ at-bats isn't much information to work with. If it were instead $n = 400$ at-bats, the interval would be

$\left(\dfrac{0.250}{1-0.250}\right) \pm 1.96 \sqrt{\dfrac{0.250}{400(1 - 0.250)^3}} = (0.258, 0.409)$

which is much narrower, and much more usable.
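As a quick check on the arithmetic, here is a minimal Python sketch of the interval formula using only the standard library. The function name `odds_confidence_interval` and its `confidence` argument are illustrative choices for this example, and it uses the exact normal quantile from `statistics.NormalDist` rather than the rounded 1.96.

```python
import math
from statistics import NormalDist


def odds_confidence_interval(theta_hat, n, confidence=0.95):
    """Delta-method (Wald-type) confidence interval for the odds of a hit."""
    # Two-sided normal quantile, e.g. about 1.96 for 95% confidence.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    point = theta_hat / (1 - theta_hat)
    # Standard error from the delta-method variance theta / (n * (1 - theta)^3).
    se = math.sqrt(theta_hat / (n * (1 - theta_hat) ** 3))
    return point - z_star * se, point + z_star * se


for n in (40, 400):
    lower, upper = odds_confidence_interval(0.250, n)
    print(n, round(lower, 3), round(upper, 3))
# 40 0.095 0.572
# 400 0.258 0.409
```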

The code I used to generate these results may be found on my GitHub.

20 June, 2015