Let \(i\) index cities and \(t\) index time periods. We want to estimate the relationship between the house price \(HP\) and the crime rate: \[HP_{i,t} = \beta_0+\beta_1 Crime_{i,t} + V_t + \alpha_i + u_{i,t}\] where \(V_t\) is a time-dependent variable that represents the time trend common to all cities, and \(\alpha_i\) is a city-specific variable that is fixed across time (e.g. geography, demographics, race, education). If \(\alpha_i\) is left out of the OLS regression, it ends up in the error term, and in general \(cov(Crime_{i,t}, \alpha_i) \neq 0\) (a city’s crime rate can be related to its city-specific variables). In this case the OLS estimate of \(\beta_1\) is biased and inconsistent. Taking first differences gives:
\[ \begin{aligned} \Delta HP_{i,t} &= \beta_0-\beta_0 + \beta_1 \Delta Crime_{i,t} + V_{t} - V_{t-1} + \alpha_i-\alpha_i + \Delta u_{i,t}\\ &= \beta_1 \Delta Crime_{i,t} + V_{t} - V_{t-1} + \Delta u_{i,t} \end{aligned} \]
Now the city-specific effect has been removed, and \(\beta_1\) can be estimated consistently from the differenced equation.
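The following is a minimal numerical sketch of this argument, not an empirical result. It uses made-up numbers for four hypothetical cities over two periods; every value (including \(\beta_1 = -2\)) is an assumption chosen for illustration, and \(u_{i,t}\) is set to zero so that the algebra is exact. Because crime is constructed to be positively related to \(\alpha_i\), pooled OLS is badly biased, while the first-difference regression (with an intercept that absorbs \(V_2 - V_1\)) recovers \(\beta_1\).

```python
# Hypothetical two-period panel: 4 cities, made-up numbers (not real data).
beta0, beta1 = 100.0, -2.0
V = [0.0, 5.0]                       # common time effect V_t for t = 1, 2
alpha = [10.0, 20.0, 30.0, 40.0]     # city effects alpha_i
# Crime is deliberately higher in cities with a larger alpha_i.
crime = [[1.0, 2.0], [3.0, 3.5], [5.0, 7.0], [8.0, 8.5]]

# House prices generated from the model with u_{i,t} = 0.
hp = [[beta0 + beta1 * crime[i][t] + V[t] + alpha[i] for t in range(2)]
      for i in range(4)]


def ols_slope(x, y):
    """Slope from a simple regression of y on x with an intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Pooled OLS ignores alpha_i, which therefore sits in the error term.
pooled = ols_slope([c for row in crime for c in row],
                   [h for row in hp for h in row])

# First differences remove alpha_i; the intercept absorbs V_2 - V_1.
d_crime = [row[1] - row[0] for row in crime]
d_hp = [row[1] - row[0] for row in hp]
fd = ols_slope(d_crime, d_hp)

print("pooled OLS slope:", pooled)
print("first-difference slope:", fd)
```

In this constructed example the pooled slope even has the wrong sign, while the first-difference slope equals the assumed \(\beta_1\).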
\[HP_{i,t} = \beta_0+\beta_1 Crime_{i,t} + \beta_2 Unemployment_{i,t} + \alpha_i + u_{i,t}\] The average of the house price for each city across time is: \[\bar{HP_i} = \frac{1}{T} \sum_{t=1}^{T}HP_{i,t}\] and, because \(\alpha_i\) does not vary over time, \[\bar{\alpha_{i}} = \frac{1}{T}T\alpha_{i} = \alpha_i\]
Thus,
\[\bar{HP_i} = \beta_0 + \beta_1 \bar{Crime_i}+\beta_2 \bar{Unemployment_i}+\alpha_i+\bar{u_i}\]
Then
\[HP_{i,t}-\bar{HP_i} = \beta_1 (Crime_{i,t}-\bar{Crime_i}) + \beta_2 (Unemployment_{i,t}-\bar{Unemployment_i}) + (u_{i,t}-\bar{u_i})\]
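The demeaning steps above can be sketched numerically. The panel below is hypothetical (three cities, three periods, made-up values, \(u_{i,t} = 0\)), and the helper names `within` and `dot` are introduced here only for illustration. The sketch demeans each variable by city and solves the two normal equations of the demeaned regression by hand.

```python
# Hypothetical panel: 3 cities, 3 periods; all numbers are made up.
beta0, beta1, beta2 = 50.0, -1.5, -0.8
alpha = [5.0, 15.0, 25.0]                      # city effects alpha_i
crime = [[2.0, 3.0, 4.0], [6.0, 5.0, 7.0], [9.0, 11.0, 10.0]]
unemp = [[4.0, 6.0, 5.0], [7.0, 7.5, 9.0], [3.0, 2.0, 4.0]]

# Generate house prices from the model with u_{i,t} = 0.
hp = [[beta0 + beta1 * crime[i][t] + beta2 * unemp[i][t] + alpha[i]
       for t in range(3)] for i in range(3)]


def within(rows):
    """Subtract each city's time average and stack the result into one list."""
    return [v - sum(r) / len(r) for r in rows for v in r]


def dot(a, b):
    """Sum of element-wise products of two equal-length lists."""
    return sum(p * q for p, q in zip(a, b))


y, x1, x2 = within(hp), within(crime), within(unemp)

# Solve the 2x2 normal equations of the demeaned regression (no intercept,
# because beta0 and alpha_i both drop out after demeaning).
s11, s22, s12 = dot(x1, x1), dot(x2, x2), dot(x1, x2)
s1y, s2y = dot(x1, y), dot(x2, y)
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det

print("within estimate of beta1:", b1)
print("within estimate of beta2:", b2)
```

Because the data are generated without noise, the within estimates match the assumed \(\beta_1\) and \(\beta_2\) up to floating-point error, even though \(\alpha_i\) is never estimated.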
Note that although both First Difference and Fixed Effect can successfully estimate the coefficients on the crime rate \(\beta_1\) and the unemployment rate \(\beta_2\), they tell us nothing about the city-specific effects \(\alpha_i\).
\[HP_{i,t} = \beta_0+\beta_1 Crime_{i,t} + \beta_2 Unemployment_{i,t} + \alpha_i + u_{i,t}\]
\[HP_{i,t} = \beta_0+\beta_1 Crime_{i,t} + \beta_2 Unemployment_{i,t} + \mu_1d_2 + \mu_2d_3 + u_{i,t}\] where \(d_2 = 1\) when \(i=2\) and \(d_2=0\) otherwise (and similarly for \(d_3\)). This way we can actually estimate the city effects \(\alpha_i\), each measured relative to the omitted base city. However, with a large number of cities this approach becomes impractical, because one dummy variable is needed for every additional city.
\[\tilde{HP_{i,t}} = \beta_1\tilde{Crime_{i,t}}+\beta_2\tilde{Unemployment_{i,t}}+\tilde{u_{i,t}}\]
\(R^2\) for the fixed effect model here measures the share of the variation of \(HP_{i,t}\) around its city average \(\bar{HP_i}\) that is explained by the model.
For LSDV, a high \(R^2\) is not surprising and not very informative: even without the crime and unemployment rates, the model still contains the city-specific dummy variables and year.
\[T=2: FD = FE\]
\[T \geq3 : FD \neq FE\]
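To see why the two estimators coincide when \(T=2\), consider a sketch with a single regressor \(x_{i,t}\) and outcome \(y_{i,t}\) (generic symbols introduced here for illustration) and no intercept in the differenced regression. Writing \(\Delta x_i = x_{i,2}-x_{i,1}\), each demeaned value is plus or minus half of the first difference: \[x_{i,2}-\bar{x_i} = \tfrac{1}{2}\Delta x_{i}, \qquad x_{i,1}-\bar{x_i} = -\tfrac{1}{2}\Delta x_{i}\] so that \[\hat\beta_{FE} = \frac{\sum_i \sum_t (x_{i,t}-\bar{x_i})(y_{i,t}-\bar{y_i})}{\sum_i \sum_t (x_{i,t}-\bar{x_i})^2} = \frac{\frac{1}{2}\sum_i \Delta x_i \Delta y_i}{\frac{1}{2}\sum_i \Delta x_i^2} = \hat\beta_{FD}\] With three or more periods the demeaned values are no longer simple multiples of the first differences, and the two estimators generally differ.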
Both \(\hat\beta_{FD}\) and \(\hat\beta_{FE}\) are unbiased under strict exogeneity, so we compare their efficiency.
\[HP_{i,t} = \beta_0+\beta_1 Crime_{i,t} + \beta_2 Unemployment_{i,t} + \alpha_i + u_{i,t}\]
If the \(u_{i,t}\) are serially uncorrelated, i.e. \(cov(u_{i,t}, u_{i,s}) = 0\) for \(t \neq s\), then \[ \begin{aligned} Cov(\Delta u_{i,t}, \Delta u_{i,t-1}) &= Cov(u_{i,t}-u_{i,t-1},u_{i,t-1}-u_{i,t-2}) \\ &= -Cov(u_{i,t-1},u_{i,t-1}) \\ &= -Var(u_{i,t-1}) \end{aligned} \] Therefore, if the idiosyncratic errors \(u_{i,t}\) are serially uncorrelated, then \(\Delta u_{i,t}\) is serially correlated, and fixed effect is preferred. However, if \(u_{i,t}\) is serially correlated, then the choice depends on \(\rho\) in \(u_{i,t} = \rho u_{i,t-1} + \epsilon_{i,t}\).
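As a worked check of the first case, assume additionally that \(u_{i,t}\) is homoskedastic with \(Var(u_{i,t}) = \sigma_u^2\) (an assumption added here for illustration). Then \[Var(\Delta u_{i,t}) = Var(u_{i,t}) + Var(u_{i,t-1}) = 2\sigma_u^2\] \[Corr(\Delta u_{i,t}, \Delta u_{i,t-1}) = \frac{-\sigma_u^2}{2\sigma_u^2} = -\frac{1}{2}\] At the other extreme, if \(\rho = 1\) then \(\Delta u_{i,t} = \epsilon_{i,t}\), which is serially uncorrelated whenever \(\epsilon_{i,t}\) is, so in that case first differencing removes the serial correlation.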
However, neither method actually estimates \(u_{i,t}\), so we do not know whether the errors are serially correlated. Instead, we can use the residuals from the first-difference estimation:
If \(Cov(\Delta \hat u_{i,t}, \Delta \hat u_{i,t-1})\) is significantly negative, then FE is favorable; if \(Cov(\Delta \hat u_{i,t}, \Delta \hat u_{i,t-1}) = 0\), then FD is favorable.
In the end, it is better to do both, and examine the difference of these two estimators.
If \(T>N\), where \(N\) is small, then FE is quite sensitive. If \(X_{i,t}\) has a unit root, then FE is subject to spurious regression.
If \(T\) is very large, then FE is less sensitive than FD to violations of strict exogeneity (\(Cov(u_{i,t}, X_{i,s}) = 0\) for all \(s\) and \(t\)).
\[HP_{i,t} = \beta_0+\beta_1 Crime_{i,t} + \beta_2 Unemployment_{i,t} + \alpha_i + u_{i,t}\]
If \(Cov(\alpha_i, X_{i,t}) = 0\), then \(\hat\beta_{OLS}\) is consistent, but the composite error \(\alpha_i + u_{i,t}\) is serially correlated. To address this we use FGLS, which for panel data is called the random effect estimator.
\[Cov(\alpha_i+u_{i,t}, \alpha_i+u_{i,s}) = Var(\alpha_i), \quad t \neq s\]
\[HP_{i,t} - \lambda \bar{HP_{i}} = \beta_0(1-\lambda)+\beta_1 (Crime_{i,t}-\lambda \bar{Crime_{i}}) + \beta_2 (Unemployment_{i,t} - \lambda \bar{Unemployment_{i}}) + \alpha_i (1-\lambda) + (u_{i,t} - \lambda \bar{u_i})\] \[\lambda = 1-\left(\frac{\sigma_{u}^2}{\sigma_{u}^2+T \sigma_{\alpha}^2}\right)^{1/2}\]
Note that when \(\lambda = 1\) this is the fixed effect transformation, and when \(\lambda = 0\) it reduces to pooled OLS.
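A small sketch of how \(\lambda\) behaves, reading the formula as \(\lambda = 1-\sqrt{\sigma_u^2/(\sigma_u^2+T\sigma_\alpha^2)}\). The function name `re_lambda` and the variance values are hypothetical, chosen only to show the limiting cases.

```python
import math


def re_lambda(sigma_u2, sigma_a2, T):
    """Quasi-demeaning weight: 1 - sqrt(sigma_u^2 / (sigma_u^2 + T * sigma_alpha^2))."""
    return 1.0 - math.sqrt(sigma_u2 / (sigma_u2 + T * sigma_a2))


# Made-up variance components, for illustration only.
lam_equal = re_lambda(sigma_u2=1.0, sigma_a2=1.0, T=3)       # equal variances, T = 3
lam_no_alpha = re_lambda(sigma_u2=1.0, sigma_a2=0.0, T=3)    # no city effect
lam_long = re_lambda(sigma_u2=1.0, sigma_a2=1.0, T=10**6)    # very long panel

print(lam_equal, lam_no_alpha, lam_long)
```

With equal variances and \(T=3\) the weight is one half; with \(\sigma_\alpha^2 = 0\) it is zero (pooled OLS); and as \(T\) grows it approaches one (fixed effect).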
Steps of feasible GLS (Random Effect):
We can only use random effect if \(Cov(\alpha_i, X_{i,t}) = 0\)
Benefits of random effects \(-\) time constant variables
\[HP_{i,t} = \beta_0+\beta_1 Crime_{i,t} + \beta_2 Unemployment_{i,t} + \beta_3 Geography_{i} + \beta_4 Race_{i}+ \alpha_i + u_{i,t}\]
Time-constant variables cannot be estimated using FE or FD. Under random effect, \(\lambda\) lies between 0 and 1, so as long as \(\lambda < 1\) the time-constant variables do not disappear from the transformed equation.
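To make this concrete for one time-constant regressor: since \(Geography_i\) does not vary over time, its time average is the variable itself, so subtracting \(\lambda\) times the time average gives \[Geography_i - \lambda \bar{Geography_i} = (1-\lambda)\, Geography_i\] This term is nonzero whenever \(\lambda < 1\), so \(\beta_3\) remains in the transformed equation. With \(\lambda = 1\) (fixed effect) it vanishes and \(\beta_3\) cannot be estimated.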
Hausman Test for Random Effects vs Fixed Effects.
Null hypothesis \(H_0: Cov(\alpha_i, X_{i,t}) = 0\). \[w = \frac{(\hat{\beta}_{FE}^{*}-\hat{\beta}_{RE}^{*})^2} {Var(\hat{\beta}_{FE})-Var(\hat{\beta}_{RE})} \sim \chi_1^2\] under \(H_0\).
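A minimal sketch of the single-coefficient statistic above, using only the Python standard library. The function name `hausman_single` and all input values are hypothetical; they are not estimates from any data set. For one degree of freedom the p-value follows from the standard normal tail, because \(P(\chi_1^2 > w) = P(|Z| > \sqrt{w})\).

```python
import math


def hausman_single(b_fe, b_re, var_fe, var_re):
    """Hausman statistic and chi-square(1) p-value for a single coefficient."""
    diff_var = var_fe - var_re
    if diff_var <= 0:
        # This scalar form of the test needs Var(FE) > Var(RE).
        raise ValueError("Var(FE) - Var(RE) must be positive")
    w = (b_fe - b_re) ** 2 / diff_var
    # P(chi2_1 > w) = P(|Z| > sqrt(w)) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(w / 2.0))
    return w, p_value


# Hypothetical FE and RE estimates and variances, for illustration only.
w, p = hausman_single(b_fe=-1.5, b_re=-1.2, var_fe=0.04, var_re=0.01)
print("w =", w, "p-value =", p)
```

With these made-up inputs \(w\) is about 3, below the 5% critical value of roughly 3.84 for \(\chi_1^2\), so \(H_0\) would not be rejected at that level.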
Intuition, if Null Hypothesis is true, then