# Theory


See the [paper](https://arxiv.org/abs/2605.03699) for a full treatment
of the theory.

## Setup

We consider the case of $\mathcal{T}$ periods. Let $D_{t} \in \{0, 1\}$
denote treatment status and $Z_{t} \in \{0, 1\}$ the instrument status.
Moreover, let $D = (D_{1}, D_{2}, \ldots, D_{\mathcal{T}})$ and
$Z = (Z_{1}, Z_{2}, \ldots, Z_{\mathcal{T}})$ denote the treatment and
instrument paths.

We assume staggered adoption of the instrument: $Z_{1} = 0$, and for
$t = 2, \ldots, \mathcal{T}$:

$$Z_{t-1} = 1 \implies  Z_{t} = 1,$$

i.e. no units are exposed to the instrument in the first period and all
units that are exposed in some period stay exposed.

This implies that the period in which the instrument switches on
completely characterizes the instrument path $Z$. We therefore define
the *cohort exposure variable* $E := \min \{t \mid Z_{t} = 1\}$, with
$E = \infty$ for units that are never exposed, and the cohort indicators
$E_{e} := \mathbf{1}\{E = e\}$.
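The mapping from an instrument path to its cohort can be sketched in a few lines of plain Python. This is an illustration only, not part of the package: the helper names `exposure_cohort` and `cohort_indicator` are hypothetical, and the sketch assumes that never-exposed units are coded as $E = \infty$.

```python
import math


def exposure_cohort(z_path):
    """Return E = min{t : Z_t = 1} for a 0/1 instrument path (periods start at 1).

    Assumption: a unit that is never exposed is assigned E = infinity.
    """
    if z_path[0] != 0:
        raise ValueError("staggered adoption requires Z_1 = 0")
    # Once switched on, the instrument must stay on: the path is non-decreasing.
    if any(prev > cur for prev, cur in zip(z_path, z_path[1:])):
        raise ValueError("instrument switched off after being on")
    for t, z in enumerate(z_path, start=1):
        if z == 1:
            return t
    return math.inf


def cohort_indicator(z_path, e):
    """Return E_e = 1{E = e}."""
    return int(exposure_cohort(z_path) == e)


print(exposure_cohort([0, 0, 1, 1]))      # 3
print(exposure_cohort([0, 0, 0, 0]))      # inf
print(cohort_indicator([0, 1, 1, 1], 2))  # 1
```

Because the path is non-decreasing, the single number $E$ is enough to reconstruct every $Z_t$: $Z_t = 1$ exactly when $t \geq E$.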

> [!NOTE]
>
> The staggered adoption of the instrument is analogous to the staggered
> adoption of the treatment variable $D_t$ in DiD, for instance in
> Callaway and Sant’Anna (2021) (CSA). Likewise, the cohort variables
> are analogous to $G := \min \{t \mid D_{t} = 1\}$ and
> $G_{g} := \mathbf{1}\{G = g\}$ of CSA.

## Causal estimand

The main causal estimand is the cohort-specific time-varying local
average treatment effect on the treated:

$$LATT(e, t)
:= E[Y_{t}(1) - Y_{t}(0) \mid E_{e} = 1, D_{t}(e) > D_{t}(\infty)].$$

Here $Y_{t}(d)$ denotes the potential outcome in period $t$ under
treatment status $d$, and $D_{t}(e)$ the potential treatment status in
period $t$ for a unit first exposed in period $e$, with $e = \infty$
meaning never exposed. The conditioning event therefore selects the
units in cohort $e$ whose treatment in period $t$ is switched on by
exposure.

The paper develops the theory for two types of controls, never-exposed
and not-yet-exposed, analogous to the never-treated and not-yet-treated
controls of CSA:

$$C^{nev} := \mathbf{1}\{E = \infty\},
\quad
C^{nye}_{e, s} := \mathbf{1}\{E_{e} = 0, Z_{s} = 0\}.$$
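
As a hedged illustration, both control indicators can be computed from the cohort variable alone. The function names below are hypothetical. The sketch relies on the fact that, under staggered adoption, $Z_{s} = 0$ is equivalent to $E > s$.

```python
import math


def c_never(E):
    """C^nev = 1{E = inf}: the unit is never exposed."""
    return int(E == math.inf)


def c_not_yet(E, e, s):
    """C^nye_{e,s} = 1{E_e = 0, Z_s = 0}.

    Under staggered adoption Z_s = 0 holds exactly when E > s, so the
    indicator is 1 for units outside cohort e that are still unexposed at s.
    """
    return int(E != e and E > s)


cohorts = [math.inf, 2, 3, 4, 5]
print([c_never(E) for E in cohorts])               # [1, 0, 0, 0, 0]
print([c_not_yet(E, e=2, s=3) for E in cohorts])   # [1, 0, 0, 1, 1]
```

In this example the not-yet-exposed group for cohort 2 at period 3 contains the never-exposed units together with cohorts 4 and 5, so it is never smaller than the never-exposed group.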

## Estimable estimands

### Panel Data

Under the identifying assumptions of the paper, a doubly robust
panel-data estimand for the $LATT(e,t)$ parameter is:

$$\tau^{dr, p}_{e, t}
= \frac{
E[
\{ w^{trt,p}_{e} - w^{c,p}_{e, t} \}
\{\Delta_{t-e+1}Y_{t} - m_{e, t}^{c, p}(X)\}
]
}{
E[
\{ w^{trt,p}_{e} - w^{c,p}_{e, t} \}
\{\Delta_{t-e+1}D_{t} - g_{e, t}^{c, p}(X)\}
]
}.$$

This is essentially a ratio of two $ATT_{dr}(g, t; 0)$ estimands of CSA
in which the treatment indicator is replaced by the exposure indicator
$E_{e}$. The numerator uses the outcome $Y_{t}$, while the denominator
uses the treatment variable $D_{t}$ in place of the outcome.
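
To make the ratio structure concrete, the sketch below computes a covariate-free sample analogue with never-exposed controls. It rests on two assumptions that should be checked against the paper: that $\Delta_{t-e+1}Y_{t}$ is the long difference $Y_{t} - Y_{e-1}$, and that without covariates the weights and nuisance functions reduce to simple group means. The function `wald_did` and the data layout are hypothetical and are not the package's implementation.

```python
import math
from statistics import mean


def wald_did(units, e, t):
    """Covariate-free sample analogue of the ratio for cohort e in period t.

    units: list of dicts with keys "E" (exposure cohort, math.inf if never
    exposed) and "Y", "D" (dicts mapping period -> value). This layout is
    hypothetical and chosen only for the illustration.
    """
    def long_diff(unit, var):
        # Change between the last pre-exposure period e - 1 and period t.
        return unit[var][t] - unit[var][e - 1]

    exposed = [u for u in units if u["E"] == e]
    control = [u for u in units if u["E"] == math.inf]  # never-exposed controls

    def did(var):
        # Difference-in-differences of `var` between exposed and control units.
        return (mean(long_diff(u, var) for u in exposed)
                - mean(long_diff(u, var) for u in control))

    # Numerator: DiD in the outcome. Denominator: DiD in the treatment.
    return did("Y") / did("D")


units = [
    {"E": 2, "Y": {1: 1.0, 2: 4.0}, "D": {1: 0, 2: 1}},  # takes up treatment
    {"E": 2, "Y": {1: 0.0, 2: 1.0}, "D": {1: 0, 2: 0}},  # does not take it up
    {"E": math.inf, "Y": {1: 2.0, 2: 3.0}, "D": {1: 0, 2: 0}},
    {"E": math.inf, "Y": {1: 5.0, 2: 6.0}, "D": {1: 0, 2: 0}},
]
print(wald_did(units, e=2, t=2))  # 2.0
```

In the toy data every unit has a common trend of 1 and the single treated unit gains an extra 2. The outcome difference-in-differences is 1, the treatment difference-in-differences is 0.5, and the ratio recovers the effect of 2 for the unit whose treatment responds to exposure.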

### Repeated cross-sections

Likewise, a doubly robust repeated cross-section estimand for
$LATT(e,t)$ is

$$\tau^{dr, rc}_{e, t}
=
\frac{
E [
\{w^{trt,rc}_{e} - w^{c,rc}_{e}\}
\{Y - m_{e, Y}^{c,rc}(X)\}
] + \kappa_{e, t}^{Y, rc}
}{
E [
\{w^{trt,rc}_{e} - w^{c,rc}_{e}\}
\{D - g_{e, Y}^{c,rc}(X)\}
] + \kappa_{e, t}^{D, rc}
}.$$

See the paper for the definitions of the weights, nuisance functions,
and $\kappa$ terms. Again, this is essentially a ratio of two
$ATT_{dr,rc}(g, t; 0)$ estimands of CSA in which the treatment indicator
is replaced by the exposure indicator $E_{e}$, and in which the
denominator uses the treatment variable $D$ in place of the outcome.

## Estimators

The estimators are plug-in versions of the doubly robust estimands
above. The main public entry point is {py:func}`idid.estimate`. Worked
examples are given in the [Quickstart](quickstart) page.

### Panel data

#### Doubly robust (`method="dr"`)

For panel data, the doubly robust estimator plugs in estimators of the
nuisance functions in $\tau^{dr,p}_{e,t}$. Operationally, the package
computes:

- a numerator estimator for the outcome change
- a denominator estimator for the treatment change
- the ratio of the two

and repeats for all pairs $(e, t)$, $e \in \mathcal{E}$, $t \geq e$.
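
The loop over cohort-time pairs can be sketched as follows. This is a schematic of the control flow only: `numerator` and `denominator` are hypothetical callables standing in for the two component estimators, and the cohort set is assumed to contain only the finite exposure cohorts.

```python
def cohort_time_pairs(exposed_cohorts, n_periods):
    """All pairs (e, t) with e an exposed cohort and e <= t <= n_periods."""
    return [(e, t)
            for e in sorted(exposed_cohorts)
            for t in range(e, n_periods + 1)]


def estimate_all(exposed_cohorts, n_periods, numerator, denominator):
    """Form the ratio for every pair.

    `numerator` and `denominator` are hypothetical callables (e, t) -> float
    supplied by the caller; they are placeholders, not package functions.
    """
    return {(e, t): numerator(e, t) / denominator(e, t)
            for e, t in cohort_time_pairs(exposed_cohorts, n_periods)}


pairs = cohort_time_pairs([2, 3, 4, 5], n_periods=6)
print(len(pairs))            # 14
print(pairs[0], pairs[-1])   # (2, 2) (5, 6)

# Dummy components, used only to show the shape of the result.
ratios = estimate_all([2, 3, 4, 5], 6,
                      numerator=lambda e, t: 0.5,
                      denominator=lambda e, t: 0.25)
print(ratios[(3, 4)])        # 2.0
```

With exposed cohorts 2 through 5 and six periods there are 14 pairs, one per row of the summary tables on this page.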

This corresponds to:

``` python
import idid

# Simulate a staggered-adoption panel and estimate LATT(e, t)
# for every cohort-period pair.
res = idid.estimate(
    idid.sim_stag_panel(n=10_000, T=6, E_cohorts=[0, 2, 3, 4, 5]),
    cohort="E",
    time="t",
    outcome="Y_t",
    treatment="D_t",
    unit="id",
    covariates=["X"],
    control="never",
    method="dr",
    balanced=True,
    verbose=False,
)
res.summary()
```

    Cohort-Time Local Average Treatment Effects on the Treated:
     E   t    AET(e, t)   LATT(e, t)   Std. Error   [95% Pointwise.   Conf. Band]
     2   2       0.2531       0.9366       0.2070            0.5309        1.3422  *
     2   3       0.2346       1.2491       0.2239            0.8103        1.6879  *
     2   4       0.2083       1.2095       0.2473            0.7248        1.6942  *
     2   5       0.2050       1.2481       0.2618            0.7350        1.7612  *
     2   6       0.2102       1.3714       0.2497            0.8819        1.8609  *
     3   3       0.2200       1.2780       0.2443            0.7991        1.7568  *
     3   4       0.2354       1.0497       0.2242            0.6102        1.4891  *
     3   5       0.1912       1.2113       0.2774            0.6675        1.7550  *
     3   6       0.2358       1.0995       0.2244            0.6597        1.5394  *
     4   4       0.2221       0.8343       0.2356            0.3726        1.2960  *
     4   5       0.2139       1.0805       0.2467            0.5969        1.5640  *
     4   6       0.2509       1.0037       0.2120            0.5881        1.4193  *
     5   5       0.1992       1.0818       0.2718            0.5491        1.6144  *
     5   6       0.2543       1.1341       0.2112            0.7201        1.5481  *
    ---
    Signif. codes: `*' confidence band does not cover 0
    Control group: Never treated
    Estimation Method: Doubly Robust

The panel DR estimator supports custom nuisance choices through
`num_kwargs` and `den_kwargs`; see the {py:func}`idid.estimate` API.

#### Double machine learning (`method="dml"`)

For panel data, the DML estimator targets the same $LATT(e,t)$
parameter, but estimates the nuisance functions by cross-fitting
user-supplied machine learning models.

This corresponds to:

``` python
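# `data` is a panel data set with the columns named below, for example
# the output of idid.sim_stag_panel(...). The `...` entries stand for
# user-supplied machine learning models.
res = idid.estimate(
    data,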
    cohort="E",
    time="t",
    outcome="Y_t",
    treatment="D_t",
    unit="id",
    covariates=["X"],
    control="never",
    method="dml",
    dml_kwargs={
        "nfolds": 5,
        "m_m": ...,
        "g_m": ...,
        "p_m": ...,
    },
    balanced=True,
)
```

See the panel DML examples in [Quickstart](quickstart#dml).
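
For readers unfamiliar with cross-fitting, the idea can be sketched with the standard library alone. This is a generic illustration, not the package's implementation: `cross_fit` and the toy learner `fit_group_mean` are hypothetical names, and `nfolds` simply mirrors the key shown in `dml_kwargs` above.

```python
import random
from statistics import mean


def cross_fit(xs, ys, fit, nfolds=5, seed=0):
    """Out-of-fold predictions: each observation is predicted by a model
    trained only on the observations in the other folds."""
    n = len(xs)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::nfolds] for k in range(nfolds)]  # disjoint folds
    preds = [None] * n
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        predict = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            preds[i] = predict(xs[i])
    return preds


def fit_group_mean(x_train, y_train):
    """Toy learner: predict the training mean of y within each value of x."""
    overall = mean(y_train)
    groups = {}
    for x, y in zip(x_train, y_train):
        groups.setdefault(x, []).append(y)
    means = {x: mean(v) for x, v in groups.items()}
    # Fall back to the overall mean for x values unseen in training.
    return lambda x: means.get(x, overall)


xs = [i % 2 for i in range(20)]
ys = [3.0 * x for x in xs]
preds = cross_fit(xs, ys, fit_group_mean, nfolds=5)
print(preds == ys)  # True
```

Predicting each observation from a model that never saw it keeps overfitting in the nuisance estimates from feeding directly into the final estimate, which is the usual motivation for cross-fitting in DML.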

### Repeated cross-sections

#### Doubly robust (`method="dr"`)

For repeated cross-sections, the doubly robust estimator plugs in
nuisance estimators in $\tau^{dr,rc}_{e,t}$ and again forms a ratio
between the outcome and treatment components.

This corresponds to:

``` python
# Simulate repeated cross-sections and estimate LATT(e, t)
# for every cohort-period pair.
res = idid.estimate(
    idid.sim_stag_rc(n=20_000, T=6, E_cohorts=[0, 2, 3, 4, 5]),
    cohort="E",
    time="t",
    outcome="Y_t",
    treatment="D_t",
    unit="id",
    covariates=["X"],
    control="never",
    method="dr",
    balanced=False,
    verbose=False,
)
res.summary()
```

    Cohort-Time Local Average Treatment Effects on the Treated:
     E   t    AET(e, t)   LATT(e, t)   Std. Error   [95% Pointwise.   Conf. Band]
     2   2       0.2622       1.4060       0.4561            0.5121        2.3000  *
     2   3       0.2804       1.0196       0.4091            0.2177        1.8215  *
     2   4       0.2289       1.0392       0.5145            0.0308        2.0477  *
     2   5       0.3032       0.8324       0.3873            0.0733        1.5914  *
     2   6       0.2528       0.5638       0.4717           -0.3608        1.4884
     3   3       0.2025       0.1605       0.6104           -1.0358        1.3568
     3   4       0.1496      -0.8528       0.9244           -2.6646        0.9590
     3   5       0.2432      -0.2704       0.5246           -1.2986        0.7579
     3   6       0.2378       0.0119       0.5337           -1.0342        1.0579
     4   4       0.1907       1.3235       0.6397            0.0696        2.5773  *
     4   5       0.2415       0.3751       0.4976           -0.6003        1.3505
     4   6       0.2423       0.8244       0.4979           -0.1514        1.8003
     5   5       0.2248       0.5506       0.5454           -0.5185        1.6197
     5   6       0.2449       1.0097       0.5099            0.0102        2.0091  *
    ---
    Signif. codes: `*' confidence band does not cover 0
    Control group: Never treated
    Estimation Method: Doubly Robust

The repeated-cross-section DR estimator uses the same
{py:func}`idid.estimate` interface, with `balanced=False`.

#### Double machine learning (`method="dml"`)

For repeated cross-sections, the DML estimator cross-fits nuisance
models for the outcome, treatment, and exposure propensity components in
the repeated cross-section score.

This corresponds to:

``` python
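# `data` is a repeated cross-section data set with the columns named
# below, for example the output of idid.sim_stag_rc(...). The `...`
# entries stand for user-supplied machine learning models.
res = idid.estimate(
    data,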
    cohort="E",
    time="t",
    outcome="Y_t",
    treatment="D_t",
    unit="id",
    covariates=["X"],
    control="never",
    method="dml",
    dml_kwargs={
        "nfolds": 5,
        "m_m": ...,
        "g_m": ...,
        "p_m": ...,
    },
    balanced=False,
)
```

The DML kwargs are documented in the {py:func}`idid.estimate` API.