
In a recent working paper, MacKinnon, Nielsen and Webb (2022) propose several new variants of the wild cluster bootstrap. The variants differ in (a) how the bootstrap scores are computed and (b) whether a CRV1 or CRV3 variance matrix is used to compute the bootstrapped t-statistics.
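To make the CRV1 vs. CRV3 distinction concrete, here is a minimal base-R sketch that computes both variance matrices by hand for a simple OLS regression on simulated clustered data. It is not the package's implementation: the data, the variable names, and the exact small-sample correction used for CRV1 are assumptions chosen for illustration. CRV1 is the familiar sandwich estimator built from cluster-level scores, while CRV3 is a cluster jackknife that re-estimates the model with each cluster left out.

```r
# Simulated clustered data (all names are illustrative, not from the package)
set.seed(1)
G <- 20                                  # number of clusters
n_g <- 25                                # observations per cluster
N <- G * n_g
cluster <- rep(seq_len(G), each = n_g)
x <- rnorm(N) + rnorm(G)[cluster]
u <- rnorm(N) + rnorm(G)[cluster]        # cluster-correlated errors
y <- 1 + 0.5 * x + u

# OLS by hand
X <- cbind(1, x)
k <- ncol(X)
XtX_inv <- solve(crossprod(X))
beta_hat <- XtX_inv %*% crossprod(X, y)
resid <- y - X %*% beta_hat

# CRV1: sandwich of cluster scores s_g = X_g' u_g, with the usual
# G / (G - 1) * (N - 1) / (N - k) small-sample correction (an assumption here)
scores <- rowsum(X * as.vector(resid), cluster)   # G x k matrix of score sums
meat <- crossprod(scores)                         # sum over g of s_g s_g'
crv1 <- (G / (G - 1)) * ((N - 1) / (N - k)) * XtX_inv %*% meat %*% XtX_inv

# CRV3: cluster jackknife, re-estimating beta with each cluster omitted
beta_loo <- sapply(seq_len(G), function(g) {
  keep <- cluster != g
  solve(crossprod(X[keep, ]), crossprod(X[keep, ], y[keep]))
})
dev <- beta_loo - as.vector(beta_hat)    # k x G; column g is beta^(g) - beta_hat
crv3 <- ((G - 1) / G) * tcrossprod(dev)

# Compare the implied standard errors
se <- rbind(CRV1 = sqrt(diag(crv1)), CRV3 = sqrt(diag(crv3)))
colnames(se) <- c("intercept", "x")
print(se)
```

The two sets of standard errors are usually close when clusters are numerous and balanced, and tend to diverge more when there are few or very unequal clusters.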

All new bootstrap variants are implemented in fwildclusterboot and can be selected via the bootstrap_type argument of boottest(). The implementation is still quite bare-bones: it only supports tests of the form \(\beta_k = 0\) vs. \(\beta_k \neq 0\), does not support regression weights or fixed effects, and does not compute confidence intervals.
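For reference, the hypothesis and the usual symmetric two-tailed bootstrap p-value can be written out as below. The notation is an assumption for illustration and is not taken from the package documentation: \(t_k\) is the t-statistic of the original sample, \(t_k^{*b}\) is the t-statistic of bootstrap draw \(b\), and \(B\) is the number of bootstrap iterations.

```latex
H_0: \beta_k = 0 \quad \text{vs.} \quad H_1: \beta_k \neq 0,
\qquad
t_k = \frac{\hat{\beta}_k}{\widehat{\mathrm{se}}(\hat{\beta}_k)},
\qquad
\hat{p} = \frac{1}{B} \sum_{b=1}^{B} \mathbb{1}\left( |t_k^{*b}| > |t_k| \right)
```

The standard error in \(t_k\) and \(t_k^{*b}\) is the square root of the corresponding diagonal element of either the CRV1 or the CRV3 variance matrix, depending on the bootstrap type.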

Note that in a recent update of their working paper, MNW have renamed the new bootstrap types: type ‘11’ is now called ‘C’ (for ‘classic’), type ‘31’ is now called ‘S’ (for ‘score’), type ‘13’ is now called ‘V’ (for ‘variance’) and type ‘33’ is now called ‘B’ (for ‘both’). I will update this in fwildclusterboot once I find the time =)
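Until the package adopts the new names, a small lookup table can translate between the two conventions. This is only a sketch: mnw_labels is a hypothetical helper object and not part of fwildclusterboot.

```r
# Hypothetical lookup table: fwildclusterboot type codes -> newer MNW labels
mnw_labels <- c("11" = "C", "31" = "S", "13" = "V", "33" = "B")

# Translate the type codes accepted by bootstrap_type into the MNW labels
old_types <- c("11", "13", "31", "33")
new_types <- unname(mnw_labels[old_types])

print(data.frame(bootstrap_type = old_types, mnw_label = new_types))
```

A table like this can be used to relabel results for a paper while still passing the numeric codes to boottest().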

To run all bootstrap types with the null hypothesis imposed on the bootstrap data generating process, set impose_null = TRUE and loop over the values of the bootstrap_type argument:

library(fwildclusterboot)
library(modelsummary)
options(modelsummary_factory_default = "gt")

N <- 1000
N_G1 <- 17
data <- fwildclusterboot:::create_data(
  N = N,
  N_G1 = N_G1,
  icc1 = 0.8,
  N_G2 = N_G1,
  icc2 = 0.8,
  numb_fe1 = 10,
  numb_fe2 = 5,
  seed = 41224,
  weights = 1:N / N
)

lm_fit <- lm(
  proposition_vote ~ treatment + log_income, 
  data = data
)

wcr_algos <- c("fnw11", "11", "13", "31", "33")

# Run boottest() once per bootstrap type, imposing the null on the bootstrap DGP
run_all <- lapply(wcr_algos, function(x) {
  boottest(
    lm_fit,
    param = ~treatment,
    clustid = ~group_id1,
    B = 9999,
    impose_null = TRUE,
    bootstrap_type = x
  )
})
#> Warning: Please note that the seeding behavior for random number generation for
#> `boottest()` has changed with `fwildclusterboot` version 0.13.
#> 
#> It will no longer be possible to exactly reproduce results produced by versions
#> lower than 0.13.
#> 
#> If your prior results were produced under sufficiently many bootstrap
#> iterations, none of your conclusions will change. For more details about this
#> change, please read the notes in
#> [news.md](https://cran.r-project.org/web/packages/fwildclusterboot/news/news.html).
#> This warning is displayed once per session.

names(run_all) <- paste("WCR", 
                        c("11 F&W", "11 F&R", "13 F&R", "31 F&R", "33 F&R"))

msummary(
  run_all, 
  estimate = "{estimate} ({p.value})", 
  statistic = "[{conf.low}, {conf.high}]"
)
                   WCR 11 F&W         WCR 11 F&R       WCR 13 F&R       WCR 31 F&R       WCR 33 F&R
1*treatment = 0    0.006 (0.572)      0.006 (0.573)    0.006 (0.584)    0.006 (0.575)    0.006 (0.588)
                   [-0.016, 0.028]
Num.Obs.           1000               1000             1000             1000             1000
R2                 0.022              0.022            0.022            0.022            0.022
R2 Adj.            0.020              0.020            0.020            0.020            0.020
AIC                -150.6             -150.6           -150.6           -150.6           -150.6
BIC                -131.0             -131.0           -131.0           -131.0           -131.0
Log.Lik.           79.309             79.309           79.309           79.309           79.309

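Given the same seed, both implementations of the 11 algorithm should produce the same p-values. In the table above, where the seed is not reset between calls, they differ only in the third decimal (0.572 vs. 0.573). The p-values produced by the other algorithms differ slightly more, but are overall very close to each other. Confidence intervals are currently only implemented for the fnw11 algorithm, which is why only the first column reports one.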

Now, which bootstrap type should you run? MNW argue in favor of the “31” type (now called “S”). In the future, I hope to provide a more thorough discussion down here =)