Difference in Differences Model
Introduction
The Difference in Differences technique applies an experimental design to existing panel data (observations of the same units over time) by comparing how the outcome changes in the targeted treatment group with how it changes in the control group. It involves two indicator variables. In this case, the treatment variable indicates whether the state legalized recreational marijuana, and the post variable indicates whether the observation falls after the legalization took effect.
y_it = α + β * treatment + γ * post + δ * (treatment * post)

where y_it is the response variable for state i at time t, treatment is an indicator that is 1 for the state where the policy takes place and 0 for the control state, and post is a time indicator that is 1 if the time is after the policy is effective and 0 if the time is before the policy is effective.
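Each cell of the table below gives the expected response implied by the equation for one state in one period. The coefficient δ on the product of the two indicators, called the interaction coefficient, is what remains after subtracting the control state's before-and-after change from the treated state's change, so it gives the pure differential effect. Therefore, δ is the key to telling whether the post-RML (Recreational Marijuana Legalization) period in California is associated with a higher percentage of DUI cases.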
| | Before RML (post = 0) | After RML (post = 1) | Difference |
|---|---|---|---|
| California (treatment = 1) | α+β | α+β+γ+δ | γ+δ |
| South Dakota (treatment = 0) | α | α+γ | γ |
| Difference | β | β+δ | δ |
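To make the table concrete, the four cells can be filled with group averages and differenced by hand. The sketch below does this with invented numbers (they are not the study's data). The dictionary layout and variable names are assumptions made for illustration only.

```python
from statistics import mean

# Hypothetical DUI percentages, invented purely for illustration.
# Keys are (treatment, post) pairs matching the cells of the table.
cells = {
    (1, 0): [4.0, 4.2, 3.8],  # California, before RML
    (1, 1): [5.1, 5.3, 4.9],  # California, after RML
    (0, 0): [3.0, 3.1, 2.9],  # South Dakota, before RML
    (0, 1): [3.4, 3.6, 3.5],  # South Dakota, after RML
}

# Average response in each cell
m = {key: mean(values) for key, values in cells.items()}

alpha = m[(0, 0)]             # control state, before the policy
beta = m[(1, 0)] - m[(0, 0)]  # pre-policy gap between the states
gamma = m[(0, 1)] - m[(0, 0)] # change over time in the control state
# Change in the treated state minus change in the control state
delta = (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])

print(f"alpha={alpha:.3f} beta={beta:.3f} gamma={gamma:.3f} delta={delta:.3f}")
```

Summing the four quantities reproduces the California after-RML cell, which is the α+β+γ+δ entry of the table.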
We apply linear regression to the equation above to estimate δ. If δ is positive and large relative to its standard error, then, according to the Difference in Differences technique, we would attribute the increase in the percentage of DUI cases after legalization to Recreational Marijuana Legalization.
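As a rough illustration of the regression step, the sketch below fits the equation by ordinary least squares and reports δ with a standard error. It assumes NumPy is available and uses invented observations rather than the study's data. The standard error is the classical one, which assumes independent errors with constant variance, an assumption that may not hold for time series.

```python
import numpy as np

# Hypothetical observations, invented for illustration only.
treatment = np.array([1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0], dtype=float)
post = np.array([0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1], dtype=float)
y = np.array([4.0, 4.2, 3.8, 5.1, 5.3, 4.9, 3.0, 3.1, 2.9, 3.4, 3.6, 3.5])

# Design matrix: intercept, treatment, post, and their interaction
X = np.column_stack([np.ones_like(y), treatment, post, treatment * post])

# Ordinary least squares estimates of alpha, beta, gamma, delta
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
alpha, beta, gamma, delta = coef

# Classical standard error of delta (assumes i.i.d. errors)
residuals = y - X @ coef
dof = len(y) - X.shape[1]
sigma2 = residuals @ residuals / dof
cov = sigma2 * np.linalg.inv(X.T @ X)
se_delta = np.sqrt(cov[3, 3])
t_delta = delta / se_delta

print(f"delta={delta:.3f} se={se_delta:.3f} t={t_delta:.2f}")
```

Because the model has one parameter for each state-and-period combination, the fitted δ equals the difference of the four group averages.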
Common Trend Assumption
Note also that the Difference in Differences technique relies on the assumption that the two states followed common trends before the policy was implemented. The model produces a robust estimate of the policy's effect if the pre-policy patterns of the two states are similar.
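One informal way to look at this assumption is to track the gap between the two states before the policy and see whether it drifts over time. The sketch below fits a straight line to that gap with invented pre-policy values. The series and variable names are hypothetical, and this is only a quick diagnostic, not a formal test.

```python
# Hypothetical pre-policy DUI percentages, invented for illustration only.
california = [4.0, 4.1, 4.3, 4.4, 4.6]
south_dakota = [3.0, 3.1, 3.3, 3.4, 3.6]

# Gap between the treated and control state in each pre-policy period
gap = [c - s for c, s in zip(california, south_dakota)]
t = list(range(len(gap)))

# Least-squares slope of the gap against time
t_bar = sum(t) / len(t)
gap_bar = sum(gap) / len(gap)
slope = sum((ti - t_bar) * (gi - gap_bar) for ti, gi in zip(t, gap)) / sum(
    (ti - t_bar) ** 2 for ti in t
)

print(f"pre-policy slope of the gap: {slope:.4f}")
```

A slope near zero means the gap stayed roughly constant, which is what common trends imply. A clearly nonzero slope suggests the states were already diverging or converging before the policy.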
In reality, our data do not align strongly with the common trend assumption. The two states differ in many characteristics, so there is no perfect control group in this situation, and this is one of the limitations of our findings. We transformed both states' data to reduce the influence of these differences as much as possible.
Detrend
We found a strong seasonal pattern in the South Dakota data that is not clearly present in the California data. Therefore, we detrended both series so that the two groups would be more comparable.
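The exact transformation is not spelled out here, so the following is only a sketch of one simple option: subtracting the average of each seasonal position (for example, each calendar month) and adding back the overall mean. The function name, the `period` parameter, and the sample series are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def remove_seasonal_means(values, period=12):
    """Subtract each seasonal position's average, then restore the overall level."""
    by_position = defaultdict(list)
    for i, v in enumerate(values):
        by_position[i % period].append(v)
    seasonal_mean = {pos: mean(vs) for pos, vs in by_position.items()}
    overall = mean(values)
    return [v - seasonal_mean[i % period] + overall for i, v in enumerate(values)]


# Hypothetical quarterly series with a seasonal swing, invented for illustration.
series = [3.0, 5.0, 3.0, 1.0, 3.2, 5.2, 3.2, 1.2]
adjusted = remove_seasonal_means(series, period=4)

print([round(a, 3) for a in adjusted])
```

Applying the same function to both states removes the recurring within-year swing while keeping level changes from one year to the next, which is the variation the Difference in Differences comparison relies on.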