Uplift Modeling with the Criteo Uplift Dataset

This notebook demonstrates uplift modeling using the Criteo Uplift Dataset. Uplift modeling (also known as CATE estimation, for Conditional Average Treatment Effect) aims to predict the incremental impact of an action (the “treatment”) on an individual’s outcome.
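Formally, the quantity all of the models below estimate is the conditional average treatment effect, where Y(1) and Y(0) denote the potential outcomes with and without treatment:

```latex
\tau(x) \;=\; \mathbb{E}\bigl[\, Y(1) - Y(0) \mid X = x \,\bigr]
```

Uplift models score each individual with an estimate of τ(x), so that the treatment can be targeted at those with the largest predicted effect.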

We will demonstrate how to use Perpetual’s causal inference tools:

  • UpliftBooster (R-Learner)

  • Meta-learners: SLearner, TLearner, XLearner, DRLearner (Doubly Robust)

[ ]:
import os

import matplotlib.pyplot as plt
import pandas as pd
import requests
from perpetual.causal_metrics import auuc, cumulative_gain_curve, qini_coefficient
from perpetual.meta_learners import DRLearner, SLearner, TLearner, XLearner
from perpetual.uplift import UpliftBooster
from sklearn.model_selection import train_test_split

# 1. Download Criteo Uplift Dataset
dataset_gz = "criteo-uplift-v2.1.csv.gz"

mirrors = [
    "http://go.criteo.net/criteo-research-uplift-v2.1.csv.gz",
    "https://criteostorage.blob.core.windows.net/criteo-research-datasets/criteo-uplift-v2.1.csv.gz",
]


def download_dataset(url, filename):
    print(f"Attempting to download from: {url}")
    try:
        response = requests.get(url, stream=True, timeout=10)
        response.raise_for_status()
        with open(filename, "wb") as f:
            for chunk in response.iter_content(chunk_size=8192):
                f.write(chunk)
        print("Download complete.")
        return True
    except Exception as e:
        print(f"Download failed from {url}: {e}")
        return False


if not os.path.exists(dataset_gz):
    success = False
    for url in mirrors:
        if download_dataset(url, dataset_gz):
            success = True
            break
    if not success:
        raise RuntimeError(
            "Download failed. Please download the dataset manually and place it in this folder."
        )
else:
    print(f"Dataset '{dataset_gz}' already exists.")

# 2. Load and Preprocess
print("Loading dataset...")
df = pd.read_csv(dataset_gz, compression="gzip")

# --- PERFORMANCE OPTIMIZATION ---
# The full dataset has ~25M rows. We subsample to 50k rows for this tutorial
# to ensure all models fit in under 10 seconds.
SAMPLE_SIZE = 50_000
print(f"Subsampling to {SAMPLE_SIZE} rows for speed...")
df = df.sample(n=SAMPLE_SIZE, random_state=42).reset_index(drop=True)
# --------------------------------

y = df["conversion"].astype(int)
w = df["treatment"].astype(int)
features = [
    col
    for col in df.columns
    if col not in ["treatment", "conversion", "exposure", "visit"]
]
X = df[features].copy()

X_train, X_test, w_train, w_test, y_train, y_test = train_test_split(
    X, w, y, test_size=0.3, random_state=42
)
print(f"Training set size: {X_train.shape[0]}")
df.head()

2. R-Learner (UpliftBooster)

The UpliftBooster uses the R-Learner meta-algorithm. We use small budgets to keep the tutorial fast.
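The R-Learner idea itself is independent of Perpetual's API: fit nuisance models for the outcome m(x) = E[y | x] and the propensity e(x) = E[w | x], then regress the outcome residuals on the treatment residuals (the Robinson decomposition). A minimal sketch on synthetic data, using plain scikit-learn models and skipping the cross-fitting a production implementation would use:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
w = rng.binomial(1, 0.5, size=n)      # randomized treatment assignment
tau = 1.0 + X[:, 0]                   # true heterogeneous effect
y = X[:, 1] + w * tau + rng.normal(size=n)

# Stage 1: nuisance models for outcome m(x) and propensity e(x)
m_hat = LinearRegression().fit(X, y).predict(X)
e_hat = LogisticRegression().fit(X, w).predict_proba(X)[:, 1]

# Stage 2: weighted regression of pseudo-outcomes on X
# (minimizes the R-Learner loss: sum of w_res^2 * (pseudo - tau(x))^2)
y_res = y - m_hat
w_res = w - e_hat
pseudo = y_res / w_res
tau_model = LinearRegression().fit(X, pseudo, sample_weight=w_res**2)
tau_hat = tau_model.predict(X)
print(float(np.corrcoef(tau_hat, tau)[0, 1]) > 0.7)
```

With a well-specified stage-2 model, the recovered effects correlate strongly with the true τ(x); `UpliftBooster` applies the same two-stage recipe with gradient-boosted nuisance and effect models.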

[ ]:
budget = 0.1
[ ]:
# Initialize and fit UpliftBooster
ub = UpliftBooster(
    outcome_budget=budget,
    propensity_budget=budget,
    effect_budget=budget,
)
ub.fit(X_train, w_train, y_train)

# Predicted Treatment Effect
uplift_r = ub.predict(X_test)
print(f"Average Predicted Uplift (R-Learner): {uplift_r.mean():.4f}")

2.1 Interaction Constraints

Perpetual allows you to enforce interaction constraints. This is useful when you know that certain features should only interact with each other, or should not interact at all.

[ ]:
# Allow features 0 and 1 to interact only with each other
interaction_constraints = [[0, 1]]
ub_constrained = UpliftBooster(
    outcome_budget=budget,
    propensity_budget=budget,
    effect_budget=budget,
    interaction_constraints=interaction_constraints,
)
ub_constrained.fit(X_train, w_train, y_train)

uplift_constrained = ub_constrained.predict(X_test)
print(f"Average Uplift (Constrained): {uplift_constrained.mean():.4f}")

3. Comparing with Meta-Learners

Meta-learners are algorithms that decompose the causal problem into one or more supervised learning problems.
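To make the decomposition concrete, here is a minimal T-Learner sketch on synthetic data using plain scikit-learn regressors (illustrative only; Perpetual's `TLearner` applies the same idea with its own boosters):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
n = 3000
X = rng.normal(size=(n, 3))
w = rng.binomial(1, 0.5, size=n)
y = X[:, 0] + w * (0.5 + X[:, 1]) + rng.normal(scale=0.5, size=n)

# T-Learner: fit one outcome model per arm,
# predicted uplift is the difference of the two predictions
mu1 = GradientBoostingRegressor(random_state=0).fit(X[w == 1], y[w == 1])
mu0 = GradientBoostingRegressor(random_state=0).fit(X[w == 0], y[w == 0])
uplift = mu1.predict(X) - mu0.predict(X)
print(round(float(uplift.mean()), 2))  # should land near the true ATE of 0.5
```

The S-Learner instead fits a single model with w appended as a feature, while the X- and DR-Learners add a second stage that models the effect directly from imputed or doubly robust pseudo-outcomes.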

[ ]:
# S-Learner: Single model with treatment as feature
sl = SLearner(budget=budget)
sl.fit(X_train, w_train, y_train)
uplift_s = sl.predict(X_test)

# T-Learner: Two models (one per treatment group)
tl = TLearner(budget=budget)
tl.fit(X_train, w_train, y_train)
uplift_t = tl.predict(X_test)

# X-Learner: Two-stage learner with imputation
xl = XLearner(budget=budget)
xl.fit(X_train, w_train, y_train)
uplift_x = xl.predict(X_test)

# DR-Learner: Doubly Robust / AIPW
dr = DRLearner(budget=budget, clip=0.01)
dr.fit(X_train, w_train, y_train)
uplift_dr = dr.predict(X_test)

print(f"Avg Uplift S:  {uplift_s.mean():.4f}")
print(f"Avg Uplift T:  {uplift_t.mean():.4f}")
print(f"Avg Uplift X:  {uplift_x.mean():.4f}")
print(f"Avg Uplift DR: {uplift_dr.mean():.4f}")

4. Evaluation: Uplift Curve

Since we don’t know the “ground truth” individual effect, we use the Cumulative Gain (Uplift) curve to evaluate performance.
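As a sketch of what such a curve measures: sort the test set by predicted uplift (descending), and at each population fraction compare the conversion rate of treated vs. control within the top segment, scaled by the segment size. The simplified reimplementation below is illustrative only; perpetual's `cumulative_gain_curve` may use different binning and scaling.

```python
import numpy as np

def uplift_gain_curve(y, w, scores, n_bins=10):
    """Cumulative uplift gain at each population fraction, sorted by score."""
    order = np.argsort(-np.asarray(scores))
    y, w = np.asarray(y)[order], np.asarray(w)[order]
    n = len(y)
    fracs, gains = [0.0], [0.0]
    for k in range(1, n_bins + 1):
        top = slice(0, int(n * k / n_bins))
        yt, wt = y[top], w[top]
        treated, control = wt == 1, wt == 0
        if treated.sum() == 0 or control.sum() == 0:
            continue  # segment too small to estimate a lift
        lift = yt[treated].mean() - yt[control].mean()
        fracs.append(k / n_bins)
        gains.append(lift * len(yt) / n)
    return np.array(fracs), np.array(gains)

# Sanity check on synthetic data: scoring with the true effect
# yields a positive final gain equal to the overall ATE (~0.1 here)
rng = np.random.default_rng(2)
n = 5000
w = rng.binomial(1, 0.5, size=n)
tau = rng.uniform(0, 0.2, size=n)
y = rng.binomial(1, 0.1 + w * tau)
fr, g = uplift_gain_curve(y, w, tau)
print(g[-1] > 0)
```

A model that ranks individuals well pushes the curve above the diagonal "Random" baseline; the endpoint of the curve is just the overall average treatment effect, which no ranking can change.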

[ ]:
# --- Uplift Gain Curves ---
plt.figure(figsize=(10, 6))
for label, scores in [
    ("R-Learner", uplift_r),
    ("X-Learner", uplift_x),
    ("DR-Learner", uplift_dr),
]:
    fracs, gains = cumulative_gain_curve(y_test, w_test, scores)
    plt.plot(fracs, gains, label=label)

plt.plot([0, 1], [0, 0], "k--", label="Random")
plt.title("Cumulative Uplift Gain — Criteo Dataset (Subsampled)")
plt.xlabel("Population % Sorted by Predicted Uplift")
plt.ylabel("Cumulative Gain")
plt.legend()
plt.show()

# --- AUUC & Qini ---
for label, scores in [
    ("R-Learner", uplift_r),
    ("S-Learner", uplift_s),
    ("T-Learner", uplift_t),
    ("X-Learner", uplift_x),
    ("DR-Learner", uplift_dr),
]:
    a = auuc(y_test, w_test, scores, normalize=True)
    q = qini_coefficient(y_test, w_test, scores)
    print(f"{label:12s}  AUUC={a:+.4f}  Qini={q:+.4f}")