---
title: "Quickstart: entropy balancing with `ebal`"
author: "Jens Hainmueller"
date: "`r format(Sys.Date(), '%B %Y')`"
output:
  rmarkdown::html_vignette:
    toc: true
    toc_depth: 2
vignette: >
  %\VignetteIndexEntry{Quickstart: entropy balancing with ebal}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 4.5,
  dpi = 96
)
set.seed(20260504)
```

## What is entropy balancing?

Entropy balancing (Hainmueller 2012) reweights a control sample so that
its covariate moments match those of the treated group, while keeping
the weights as close as possible, in an entropy (Kullback-Leibler)
sense, to a set of base weights (uniform by default). The resulting
weights drop directly into an `lm(..., weights = w)` call to estimate
the average treatment effect on the treated.
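
Written out in the notation of Hainmueller (2012), and meant as a
summary of the method rather than a description of the package
internals, the control weights $w_i$ solve

$$
\min_{w}\; \sum_{i:\,D_i=0} w_i \log\frac{w_i}{q_i}
\quad\text{subject to}\quad
\sum_{i:\,D_i=0} w_i\, c_{ri}(X_i) = m_r \;\; (r = 1,\dots,R),
\qquad \sum_{i:\,D_i=0} w_i = 1,
\qquad w_i \ge 0,
$$

where $q_i$ are the base weights, $c_{ri}(X_i)$ is the $r$-th moment
function evaluated at control unit $i$, and $m_r$ is the corresponding
moment in the treated group. The normalization to one follows the
paper; software may rescale the weights to a different total.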

This vignette shows the package's user-facing API on a small toy
dataset.

```{r setup}
library(ebal)
```

## A minimal example

```{r toy}
set.seed(20260504)
n0 <- 200; n1 <- 100
X <- rbind(
  replicate(3, rnorm(n0, mean = 0)),       # controls
  replicate(3, rnorm(n1, mean = 0.5))      # treated, shifted
)
colnames(X) <- c("x1", "x2", "x3")
treatment <- c(rep(0, n0), rep(1, n1))
```

Before weighting, the control means differ markedly from the treated
means:

```{r raw-balance}
treated_means <- colMeans(X[treatment == 1, ])
control_means <- colMeans(X[treatment == 0, ])
rbind(treated = treated_means, control = control_means)
```
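
Raw mean gaps are easier to compare across covariates once they are
put on a common scale. The sketch below computes standardized mean
differences in base R. The helper `std_diff()` and the `*_demo` objects
are hypothetical (they are not part of `ebal`), and scaling by the
treated-group standard deviation is one convention among several.

```{r smd-sketch}
# Hypothetical helper, not part of ebal: standardized mean differences,
# scaled by the treated-group standard deviation.
std_diff <- function(X, treat, w = rep(1, nrow(X))) {
  is_t <- treat == 1
  mean_t <- colMeans(X[is_t, , drop = FALSE])
  # Weighted control means; w defaults to equal weights.
  mean_c <- colSums(w[!is_t] * X[!is_t, , drop = FALSE]) / sum(w[!is_t])
  sd_t <- apply(X[is_t, , drop = FALSE], 2, sd)
  (mean_t - mean_c) / sd_t
}

# Tiny deterministic illustration, separate from the toy data above.
X_demo <- cbind(a = c(0, 1, 2, 3, 2, 4, 6),
                b = c(1, 1, 2, 2, 3, 5, 7))
treat_demo <- c(0, 0, 0, 0, 1, 1, 1)
smd_demo <- std_diff(X_demo, treat_demo)
smd_demo
```

Passing a weight vector through the `w` argument gives the same
diagnostic after reweighting.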

## Fit

```{r fit}
fit <- ebalance(treat ~ x1 + x2 + x3,
                data = data.frame(treat = treatment, X))
fit
```

Both the formula interface above and the matrix interface
`ebalance(Treatment = treatment, X = X)` work. Each returns an
`ebalance` object with `print()`, `summary()`, `plot()`, and
`weights()` methods.

## Tidy output

```{r tidy}
# tidy() / glance() / augment() are registered against the generics in
# the `generics` package, which `broom` re-exports. Loading either
# makes the methods discoverable.
library(generics)
tidy(fit)
glance(fit)
```

`tidy()` returns a per-covariate balance table. `glance()` returns a
one-row summary that includes the Kish effective sample size and a
convergence flag.
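
For reference, the Kish effective sample size of a weight vector $w$ is
$(\sum_i w_i)^2 / \sum_i w_i^2$. A minimal base-R sketch follows; the
helper name `kish_ess()` is hypothetical and not part of `ebal`, and
which units the package includes in its own calculation is not
specified here.

```{r kish-sketch}
# Hypothetical helper, not part of ebal.
kish_ess <- function(w) sum(w)^2 / sum(w^2)

ess_equal  <- kish_ess(rep(1, 50))         # equal weights: ESS equals n
ess_skewed <- kish_ess(c(10, rep(1, 49)))  # one dominant weight lowers it
c(equal = ess_equal, skewed = ess_skewed)
```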

`weights(fit)` returns a length-`n` vector aligned to the original
data: treated units get weight 1, controls get the entropy-balancing
weight.

```{r weights-shape}
length(weights(fit))
range(weights(fit)[treatment == 0])
```

## Plotting

The base-graphics `plot(fit)` and the `ggplot2` `autoplot(fit)` both
produce a Love plot of standardized differences before vs. after
weighting:

```{r autoplot, eval = requireNamespace("ggplot2", quietly = TRUE)}
library(ggplot2)
autoplot(fit)
```

## Using the weights downstream

The weights drop straight into a weighted regression:

```{r weighted-lm, eval = FALSE}
df <- data.frame(
  treat = treatment, X,
  y = X[, 1] + 2 * treatment + rnorm(n0 + n1)
)
df$w <- weights(fit)
lm(y ~ treat, data = df, weights = w)
```
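
With a single binary regressor, the weighted least-squares slope equals
the difference between the weighted outcome means of the two groups,
which is why the call above compares the treated units with the
reweighted controls. The base-R sketch below checks this identity on
made-up data and weights; none of the objects come from `ebal`.

```{r wls-check}
# Illustrative data and weights; none of these objects come from ebal.
d_demo <- data.frame(
  treat = c(0, 0, 0, 0, 1, 1, 1),
  y     = c(1, 2, 4, 3, 5, 7, 6),
  w     = c(0.5, 1.5, 2, 1, 1, 1, 1)
)

# Weighted regression of the outcome on the treatment indicator.
slope <- unname(coef(lm(y ~ treat, data = d_demo, weights = w))["treat"])

# Same quantity by hand: weighted treated mean minus weighted control mean.
by_hand <- with(d_demo,
  weighted.mean(y[treat == 1], w[treat == 1]) -
    weighted.mean(y[treat == 0], w[treat == 0]))

all.equal(slope, by_hand)
```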

## Two solver methods

By default `ebalance()` uses Newton-Raphson on the dual problem (fast,
exact when the Hessian is well-conditioned). As of version 0.3-0 you
can also use a torch-based autodiff solver, contributed by Apoorva Lal,
which runs BFGS on gradients computed via automatic differentiation:

```{r autodiff, eval = FALSE}
fit_ad <- ebalance(treat ~ x1 + x2 + x3,
                   data = data.frame(treat = treatment, X),
                   method = "autodiff")
```

The two methods produce equivalent weights (within solver tolerance).
Newton is faster on the small problems most users have; the autodiff
path is more stable when the optimization landscape is poorly
conditioned and scales better at large covariate counts. `torch` is in
`Suggests:`; the first call may require `torch::install_torch()` to
download libtorch.
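
To make the optimization concrete, here is a minimal base-R sketch of
the entropy-balancing dual for mean constraints with uniform base
weights, solved with `optim(method = "BFGS")` and a hand-written
gradient. It is an illustration only: it does not use `torch`, it is
not the package's Newton or autodiff code, and every object name in it
is made up for this example.

```{r dual-sketch}
set.seed(42)
X0_demo <- matrix(rnorm(200 * 3), ncol = 3)              # controls
X1_demo <- matrix(rnorm(100 * 3, mean = 0.5), ncol = 3)  # treated, shifted
target <- colMeans(X1_demo)

# Center control covariates at the treated means, so the balance
# constraint becomes "weighted mean of each column of C_demo is zero".
C_demo <- sweep(X0_demo, 2, target)

# Weights implied by a dual vector lambda: a softmax over C_demo %*% lambda.
dual_weights <- function(lambda) {
  eta <- drop(C_demo %*% lambda)
  w <- exp(eta - max(eta))  # subtract the max for numerical stability
  w / sum(w)
}

# Dual objective (a log-sum-exp) and its gradient (the weighted imbalance).
dual_obj <- function(lambda) {
  eta <- drop(C_demo %*% lambda)
  max(eta) + log(sum(exp(eta - max(eta))))
}
dual_grad <- function(lambda) drop(crossprod(C_demo, dual_weights(lambda)))

opt <- optim(rep(0, ncol(C_demo)), dual_obj, dual_grad, method = "BFGS",
             control = list(reltol = 1e-12, maxit = 500))
w_demo <- dual_weights(opt$par)

# Remaining imbalance after reweighting; should be close to zero.
max_gap <- max(abs(colSums(w_demo * X0_demo) - target))
max_gap
```

The gradient of this objective is the weighted covariate imbalance, so
a vanishing gradient means the moment constraints hold. A Newton-type
solver would minimize the same objective using its Hessian as well.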

## See also

- `?ebalance` for the long-form argument documentation.
- `?ebalance.trim` for trimmed weights when the weights from the base
  `ebalance()` solution are too dispersed.
- `tidy()`, `glance()`, and `augment()`, discoverable via
  `library(generics)` or `library(broom)`, and `autoplot()`, via
  `library(ggplot2)`.

## References

- Hainmueller, J. (2012). Entropy balancing for causal effects: A
  multivariate reweighting method to produce balanced samples in
  observational studies. *Political Analysis*, 20(1), 25–46.