ebal

Entropy balancing (Hainmueller 2012) reweights a control sample so
that its covariate moments match the treated group’s, with weights
staying as close as possible to a base distribution (uniform by default)
in a maximum-entropy sense. The resulting weights drop directly into an
lm(..., weights = w) call to estimate the average treatment
effect on the treated.
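To make the idea concrete, here is a minimal base-R sketch of entropy balancing on first moments. It is not the package's solver: the function name entropy_balance_sketch and the simulated X0 / X1 matrices are assumptions made for illustration, and the dual is minimised with optim() rather than the package's own routine. The sketch normalises weights to sum to 1; rescaling them is a separate choice.

```r
# Illustrative sketch only: first-moment entropy balancing via the
# log-sum-exp dual, solved with base R's optim(). Hypothetical helper,
# not part of the ebal package.
entropy_balance_sketch <- function(X0, target,
                                   base = rep(1 / nrow(X0), nrow(X0))) {
  Z <- sweep(X0, 2, target)            # centre control covariates at the target means
  dual <- function(lambda) {
    eta <- drop(Z %*% lambda)
    m <- max(eta)
    m + log(sum(base * exp(eta - m)))  # numerically stable log-sum-exp
  }
  grad <- function(lambda) {
    eta <- drop(Z %*% lambda)
    w <- base * exp(eta - max(eta))
    w <- w / sum(w)
    drop(crossprod(Z, w))              # weighted mean of the centred covariates
  }
  opt <- optim(rep(0, ncol(X0)), dual, grad, method = "BFGS",
               control = list(reltol = 1e-12, maxit = 500))
  eta <- drop(Z %*% opt$par)
  w <- base * exp(eta - max(eta))
  w / sum(w)                           # weights normalised to sum to 1
}

set.seed(1)
X0 <- matrix(rnorm(200 * 3), ncol = 3)              # simulated controls
X1 <- matrix(rnorm(100 * 3, mean = 0.5), ncol = 3)  # simulated treated
w <- entropy_balance_sketch(X0, colMeans(X1))

# Remaining gap between reweighted control means and treated means
gap <- colSums(w * X0) - colMeans(X1)
round(gap, 4)
```

The gradient of the dual is the weighted mean of the centred control covariates, so a zero gradient is the same condition as exact balance on the means.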
This vignette shows the package’s user-facing API on a small toy dataset.
set.seed(20260504)
n0 <- 200; n1 <- 100
X <- rbind(
  replicate(3, rnorm(n0, mean = 0)),   # controls
  replicate(3, rnorm(n1, mean = 0.5))  # treated, shifted
)
colnames(X) <- c("x1", "x2", "x3")
treatment <- c(rep(0, n0), rep(1, n1))

Pre-weighting, the control means differ from the treated means markedly:
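One way to check the raw imbalance uses base R only. The block repeats the toy data so that it runs on its own; the object names pre_means and pre_gap are assumptions for illustration.

```r
# Rebuild the toy data so this block is self-contained
set.seed(20260504)
n0 <- 200; n1 <- 100
X <- rbind(
  replicate(3, rnorm(n0, mean = 0)),   # controls
  replicate(3, rnorm(n1, mean = 0.5))  # treated, shifted
)
colnames(X) <- c("x1", "x2", "x3")
treatment <- c(rep(0, n0), rep(1, n1))

# Unweighted covariate means by group, and their difference
pre_means <- rbind(
  control = colMeans(X[treatment == 0, ]),
  treated = colMeans(X[treatment == 1, ])
)
pre_gap <- pre_means["treated", ] - pre_means["control", ]
round(rbind(pre_means, gap = pre_gap), 3)
```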
fit <- ebalance(treat ~ x1 + x2 + x3,
                data = data.frame(treat = treatment, X))
#> Warning: ebalance() fit converged but is concentrated on a small number of units:
#> - control max/mean weight ratio = 10.3 > 10
#> Consider ebalance.trim(), tighter constraint.tolerance, or fewer moment constraints. See ?diagnostics. Suppress with options(ebal.warn_weak_fit = FALSE).
fit
#> Entropy balancing (estimand: ATT)
#> ---------------------------------
#> Treated: 100
#> Controls: 200 (reweighted; sum of weights = 100.514)
#> Moments: 3 covariate moment(s) balanced
#> Converged: TRUE (max moment deviation = 0.514)
#>
#> Use summary() for a balance table, weights() for the per-unit
#> weight vector, and plot() for a Love plot of standardized differences.

Either the formula interface above or the matrix interface
ebalance(Treatment = treatment, X = X) works. Both produce
an ebalance object with print / summary / plot / weights
methods.
# tidy() / glance() / augment() are registered against the generics in
# the `generics` package, which `broom` re-exports. Loading either
# makes the methods discoverable.
library(generics)
#>
#> Attaching package: 'generics'
#> The following objects are masked from 'package:base':
#>
#> as.difftime, as.factor, as.ordered, intersect, is.element, setdiff,
#> setequal, union
tidy(fit)
#> term mean_treated_pre mean_treated_post mean_control_pre mean_control_post
#> 1 x1 0.4912183 0.4912183 -0.00708219 0.4897936
#> 2 x2 0.4640827 0.4640827 -0.08488470 0.4609244
#> 3 x3 0.5479527 0.5479527 -0.07201758 0.5451661
#> diff_pre diff_post std_diff_pre std_diff_post pct_reduction
#> 1 0.4983005 0.001424760 0.5032732 0.001438978 99.71408
#> 2 0.5489674 0.003158291 0.5676139 0.003265567 99.42469
#> 3 0.6199703 0.002786572 0.5544364 0.002492018 99.55053
glance(fit)
#> estimand n_treated n_control n_moments sum_weights_control
#> 1 ATT 100 200 3 100.5143
#> sum_weights_treated ess_control ess_treated max_weight_control
#> 1 100 76.6416 100 5.189638
#> max_weight_treated max_weight_ratio_control max_weight_ratio_treated
#> 1 1 10.32617 1
#> max_abs_std_diff_pre max_abs_std_diff_post maxdiff converged
#> 1 0.5676139 0.003265567 0.5142846 TRUE

tidy() is a per-covariate balance table.

glance() is a one-row summary including the Kish effective
sample size and convergence flag.

weights(fit) returns a length-n vector
aligned to the original data: treated units get weight 1, controls get
the entropy-balancing weight.
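The two weight diagnostics reported above can be reproduced by hand. The sketch below uses the standard Kish formula and a max/mean ratio; the helper names kish_ess and max_mean_ratio and both weight vectors are hypothetical, chosen only to show how concentration lowers the effective sample size.

```r
# Kish effective sample size: (sum of weights)^2 / sum of squared weights
kish_ess <- function(w) sum(w)^2 / sum(w^2)

# Ratio of the largest weight to the average weight
max_mean_ratio <- function(w) max(w) / mean(w)

# Hypothetical control weights, both summing to 100
w_uniform <- rep(0.5, 200)                   # evenly spread
w_skewed  <- c(rep(0.1, 190), rep(8.1, 10))  # concentrated on 10 units

kish_ess(w_uniform)        # equals the number of units when weights are equal
kish_ess(w_skewed)         # much smaller: a few units carry most of the weight
max_mean_ratio(w_skewed)
```

With a fitted object, the same helpers could be applied to the control part of the weight vector, for example weights(fit)[treatment == 0].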
Both the base-graphics plot(fit) and the ggplot2
autoplot(fit) produce a Love plot of standardized
differences before and after weighting.
The natural drop-in for a weighted regression:
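A minimal sketch of that call follows. The toy data above has no outcome, so the column y here is a hypothetical outcome simulated with a treatment effect of 2. The block also assumes an installed version of ebal that provides the formula interface and the weights() method shown earlier.

```r
library(ebal)  # assumes a version with the formula interface and weights()

# Rebuild the toy data so this block is self-contained
set.seed(20260504)
n0 <- 200; n1 <- 100
X <- rbind(
  replicate(3, rnorm(n0, mean = 0)),   # controls
  replicate(3, rnorm(n1, mean = 0.5))  # treated, shifted
)
colnames(X) <- c("x1", "x2", "x3")
treatment <- c(rep(0, n0), rep(1, n1))
dat <- data.frame(treat = treatment, X)

# Hypothetical outcome (not part of the original example): true effect = 2
dat$y <- 2 * dat$treat + rowSums(X) + rnorm(n0 + n1)

fit <- ebalance(treat ~ x1 + x2 + x3,
                data = dat[, c("treat", "x1", "x2", "x3")])

# weights(fit) is aligned to the rows of the data: 1 for treated units,
# the entropy-balancing weight for controls
dat$w <- weights(fit)

att_fit <- lm(y ~ treat, data = dat, weights = w)
coef(att_fit)["treat"]  # weighted difference in means
```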
By default ebalance() uses Newton-Raphson on the dual
problem (fast, exact when the Hessian is well-conditioned). As of 0.3-0
you can also use a torch-based autodiff solver (BFGS on gradients
computed via automatic differentiation), contributed by Apoorva
Lal:
fit_ad <- ebalance(treat ~ x1 + x2 + x3,
                   data = data.frame(treat = treatment, X),
                   method = "autodiff")

The two methods produce equivalent weights (within solver tolerance).
Newton is faster on the small problems most users have; the autodiff
path is more stable when the optimization landscape is poorly
conditioned and scales better at large covariate counts.
torch is listed under Suggests:, so it is not installed automatically
with the package; the first call may require torch::install_torch()
to download libtorch.
?ebalance for the long-form argument documentation.

?ebalance.trim for trimmed weights when the base ebalance solution is
too dispersed.

tidy(), glance(), augment(), autoplot(): discoverable via
library(broom) / library(ggplot2).