Details, details: it’s all about the details!
Ordinary Least Squares (OLS) is usually the first method every student learns as they embark on a journey of statistical euphoria. In its simplest form, it’s a method that finds the line of best fit through a two-dimensional dataset. The assumptions behind the model and the derivation of the estimator are widely covered online, but what isn’t covered nearly as often is the sampling distribution of the estimator itself.
The sampling distribution is important because it tells the researcher how precise the estimator is for a given sample size and, moreover, it allows us to determine how the estimator behaves as the number of data points increases.
To determine the behaviour of the sampling distribution, let’s first derive the expectation of the estimator itself.
Expectation of OLS Estimator
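Remember that the OLS coefficient is traditionally calculated as follows:

$$\hat{\beta} = (X'X)^{-1}X'Y$$

where Y = Xβ + e. Substitute this expression for Y into the formula above, and continue the derivation below:

$$
\begin{aligned}
\hat{\beta} &= (X'X)^{-1}X'(X\beta + e) \\
&= (X'X)^{-1}X'X\beta + (X'X)^{-1}X'e \\
&= \beta + (X'X)^{-1}X'e \\
E[\hat{\beta}] &= \beta + (X'X)^{-1}X'E[e] = \beta
\end{aligned}
$$

Again, the estimate of beta has a closed-form solution, and replacing Y with Xβ + e gives the first line. Expanding the brackets, treating X as fixed (or conditioning on it), and remembering that E[e]=0, we find that the expectation of beta hat equals the true beta. In other words, our OLS estimator is unbiased.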
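The unbiasedness result can also be checked numerically. Below is a minimal sketch, assuming NumPy is available; the function names (`ols_closed_form`, `simulate_estimates`) and all parameter values (true beta of 1 and 2, sample size 50, 5,000 simulations) are illustrative choices of mine, not part of the derivation. It holds X fixed, redraws the errors many times, and averages the resulting estimates.

```python
import numpy as np


def ols_closed_form(X, y):
    """Return the OLS estimate (X'X)^{-1} X'y by solving the normal equations."""
    return np.linalg.solve(X.T @ X, X.T @ y)


def simulate_estimates(n=50, n_sims=5000, beta=(1.0, 2.0), sigma=1.0, seed=0):
    """Simulate y = X beta + e repeatedly and return every OLS estimate."""
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta)
    # Hypothetical design: an intercept column plus one uniform regressor,
    # held fixed across simulations so only the errors change.
    X = np.column_stack([np.ones(n), rng.uniform(0.0, 10.0, size=n)])
    estimates = np.empty((n_sims, beta.size))
    for i in range(n_sims):
        e = rng.normal(0.0, sigma, size=n)  # errors with E[e] = 0
        estimates[i] = ols_closed_form(X, X @ beta + e)
    return estimates


estimates = simulate_estimates()
print("true beta:        ", [1.0, 2.0])
print("mean of estimates:", estimates.mean(axis=0))
```

Any single estimate will miss the true beta, but the average across simulations should sit very close to it, which is what unbiasedness means.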
Variance of your OLS Estimator
Now that we have an understanding of the expectation of our estimator, let’s look at the variance of our estimator.
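$$
\begin{aligned}
\mathrm{Var}(\hat{\beta}) &= E\left[(X'X)^{-1}X'ee'X(X'X)^{-1}\right] \\
&= (X'X)^{-1}X'E[ee']X(X'X)^{-1} \\
&= \sigma^2 (X'X)^{-1}X'X(X'X)^{-1} \\
&= \sigma^2 (X'X)^{-1}
\end{aligned}
$$

To get to the first line, remember that the variance is E[(β̂ − β)(β̂ − β)’] and that your sample estimator (beta hat) can be expanded and simplified as follows:

$$\hat{\beta} = (X'X)^{-1}X'(X\beta + e) = \beta + (X'X)^{-1}X'e$$

where e~N(0, σ²I), with X treated as fixed. From this we can also determine that E[ee’]=σ²I. Since σ² is a constant, it can move out of the expression, leaving the X terms multiplied together: one X’X cancels against one (X’X)⁻¹, which leaves just the inverse of X’X.

Ultimately, this leaves σ²(X’X)⁻¹, which is σ²/(X’X) when there is a single regressor. This goes to 0 as n increases: σ² remains the same, but X’X is a sum over all n observations, so it typically grows roughly in proportion to n (not exponentially), and the variance of your OLS estimator shrinks at a rate of about 1/n.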
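To see the shrinking variance in practice, here is a small simulation sketch, again assuming NumPy. The helper name `slope_variance`, the sample sizes (25, 100, 400), and the data-generating values are hypothetical and chosen only for illustration. For each sample size it compares the empirical variance of the slope estimate across simulations with the theoretical value taken from the diagonal of σ²(X’X)⁻¹.

```python
import numpy as np


def slope_variance(n, n_sims=2000, sigma=1.0, seed=0):
    """Return (empirical, theoretical) variance of the OLS slope for sample size n."""
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, 2.0])  # illustrative true intercept and slope
    X = np.column_stack([np.ones(n), rng.uniform(0.0, 10.0, size=n)])
    XtX = X.T @ X
    # Theoretical covariance is sigma^2 (X'X)^{-1}; entry [1, 1] is the slope variance.
    theoretical = sigma**2 * np.linalg.inv(XtX)[1, 1]
    slopes = np.empty(n_sims)
    for i in range(n_sims):
        y = X @ beta + rng.normal(0.0, sigma, size=n)
        slopes[i] = np.linalg.solve(XtX, X.T @ y)[1]
    return slopes.var(ddof=1), theoretical


results = {n: slope_variance(n) for n in (25, 100, 400)}
for n, (empirical, theoretical) in results.items():
    print(f"n={n:4d}  empirical={empirical:.6f}  theoretical={theoretical:.6f}")
```

Under these assumptions, quadrupling n should cut the slope variance to roughly a quarter, in line with the 1/n rate rather than an exponential one.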
Now that we’ve characterised the mean and the variance of our sample estimator, we’re two-thirds of the way to determining the distribution of our OLS coefficient.
Remember that, as part of the classical OLS assumptions used here, the errors in our regression equation should have a mean of zero, have a constant variance, and be normally distributed: e~N(0, σ²). Remember also that the OLS coefficient is simply the true beta plus a linear combination of these ‘disturbances’, and a linear combination of normal variables is itself normal. Our OLS coefficient is therefore driven by these normal disturbances. Therefore:

$$\hat{\beta} \sim N\left(\beta,\ \sigma^2 (X'X)^{-1}\right)$$
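The normality of the estimator can be checked with a quick Monte Carlo sketch, assuming NumPy. The function name `standardized_slopes` and the parameter values are hypothetical. It standardizes each simulated slope estimate using the true slope and the theoretical standard deviation from σ²(X’X)⁻¹; if the estimator is normal with that mean and variance, the standardized values should have mean near 0, variance near 1, and about 95% of them should fall within ±1.96.

```python
import numpy as np


def standardized_slopes(n=30, n_sims=10000, sigma=2.0, seed=1):
    """Return (slope_hat - slope) / sd(slope_hat) for many simulated samples."""
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, 2.0])  # illustrative true intercept and slope
    X = np.column_stack([np.ones(n), rng.uniform(0.0, 10.0, size=n)])
    XtX_inv = np.linalg.inv(X.T @ X)
    slope_sd = sigma * np.sqrt(XtX_inv[1, 1])  # theoretical sd of the slope
    # Each column of E is one simulated error vector, so all samples are solved at once.
    E = rng.normal(0.0, sigma, size=(n, n_sims))
    Y = (X @ beta)[:, None] + E
    B = XtX_inv @ X.T @ Y  # shape (2, n_sims): one estimate per column
    return (B[1] - beta[1]) / slope_sd


z = standardized_slopes()
coverage = np.mean(np.abs(z) < 1.96)
print(f"mean={z.mean():.3f}  variance={z.var():.3f}  share within 1.96 sd={coverage:.3f}")
```

This only checks the first two moments and one tail probability, so it is a sanity check rather than a formal test of normality.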
And there we have it! I’ve (a) derived the expectation of the OLS estimator and shown that it is unbiased, (b) derived the variance of the sample estimator and shown that it shrinks towards 0 as the sample size grows, and (c) used the intuition behind the distribution of the error term to infer the sampling distribution of our estimator. (Note that even when the errors are not normal, the sampling distribution would be approximately normal in large samples because of the Central Limit Theorem; a sample size of around 30 is a common rule of thumb rather than a guarantee.)
On the whole, I hope that the reader now has a much deeper awareness and understanding of their beta coefficient. The information above can be used in a powerful way to make robust estimates of relationships. In particular, it shows the importance of increasing the sample size to decrease the variance of your sample estimator.
Ultimately, the insights you gain from understanding fundamental details will shape the way you think when experimenting!
Thanks for reading and hope I helped! Please message me if you need any help!