Uplift Modeling with the Criteo Uplift Dataset
This notebook demonstrates uplift modeling using the Criteo Uplift Dataset. Uplift modeling aims to predict the incremental impact of an action (the “treatment”) on an individual’s behavioral outcome; this incremental effect is known as the Conditional Average Treatment Effect (CATE).
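Before introducing any model, it helps to see the target quantity on synthetic data. The snippet below (illustrative only; the covariate, treatment, and lift values are made up, not taken from Criteo) estimates the CATE for each value of a binary covariate as a simple difference of conversion rates between treated and control units:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.integers(0, 2, n)        # one binary covariate
w = rng.integers(0, 2, n)        # randomized treatment assignment
# True effect: treatment adds +0.10 conversion probability when x == 1
p = 0.05 + 0.10 * w * x
y = rng.random(n) < p            # simulated conversions

def cate(x_val):
    m = x == x_val
    return y[m & (w == 1)].mean() - y[m & (w == 0)].mean()

print(f"estimated CATE(x=0): {cate(0):+.3f}")  # close to +0.000
print(f"estimated CATE(x=1): {cate(1):+.3f}")  # close to +0.100
```

With a randomized treatment, these group-wise rate differences are unbiased estimates of the conditional effects; uplift models generalize this idea to high-dimensional features, where naive binning is impossible.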
We will demonstrate how to use Perpetual’s causal inference tools:
- UpliftBooster (R-Learner)
- Meta-learners: SLearner, TLearner, XLearner, DRLearner (Doubly Robust)
[ ]:
import os
import matplotlib.pyplot as plt
import pandas as pd
import requests
from perpetual.causal_metrics import auuc, cumulative_gain_curve, qini_coefficient
from perpetual.meta_learners import DRLearner, SLearner, TLearner, XLearner
from perpetual.uplift import UpliftBooster
from sklearn.model_selection import train_test_split
# 1. Download Criteo Uplift Dataset
dataset_gz = "criteo-uplift-v2.1.csv.gz"
mirrors = [
    "http://go.criteo.net/criteo-research-uplift-v2.1.csv.gz",
    "https://criteostorage.blob.core.windows.net/criteo-research-datasets/criteo-uplift-v2.1.csv.gz",
]


def download_dataset(url, filename):
    print(f"Attempting to download from: {url}")
    try:
        response = requests.get(url, stream=True, timeout=10)
        response.raise_for_status()
        with open(filename, "wb") as f:
            for chunk in response.iter_content(chunk_size=8192):
                f.write(chunk)
        print("Download complete.")
        return True
    except Exception as e:
        print(f"Download failed from {url}: {e}")
        return False


if not os.path.exists(dataset_gz):
    success = False
    for url in mirrors:
        if download_dataset(url, dataset_gz):
            success = True
            break
    if not success:
        raise RuntimeError(
            "Download failed. Please download manually and place in this folder."
        )
else:
    print(f"Dataset '{dataset_gz}' already exists.")
# 2. Load and Preprocess
print("Loading dataset...")
df = pd.read_csv(dataset_gz, compression="gzip")
# --- PERFORMANCE OPTIMIZATION ---
# The full dataset has ~25M rows. We subsample to 50k rows for this tutorial
# to ensure all models fit in under 10 seconds.
SAMPLE_SIZE = 50_000
print(f"Subsampling to {SAMPLE_SIZE} rows for speed...")
df = df.sample(n=SAMPLE_SIZE, random_state=42).reset_index(drop=True)
# --------------------------------
y = df["conversion"].astype(int)
w = df["treatment"].astype(int)
features = [
    col
    for col in df.columns
    if col not in ["treatment", "conversion", "exposure", "visit"]
]
X = df[features].copy()

X_train, X_test, w_train, w_test, y_train, y_test = train_test_split(
    X, w, y, test_size=0.3, random_state=42
)
print(f"Training set size: {X_train.shape[0]}")
df.head()
2. R-Learner (UpliftBooster)
The UpliftBooster uses the R-Learner meta-algorithm. We use small budgets for its internal models to keep the tutorial fast.
[ ]:
budget = 0.1
[ ]:
# Initialize and fit UpliftBooster
ub = UpliftBooster(
    outcome_budget=budget,
    propensity_budget=budget,
    effect_budget=budget,
)
ub.fit(X_train, w_train, y_train)
# Predicted Treatment Effect
uplift_r = ub.predict(X_test)
print(f"Average Predicted Uplift (R-Learner): {uplift_r.mean():.4f}")
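For intuition, here is the generic R-Learner recipe (residual-on-residual regression, as in Nie and Wager) written out with scikit-learn on synthetic data. This is a conceptual sketch only: Perpetual's internal implementation may differ, and a production version would cross-fit the nuisance estimates m(x) and e(x) rather than fit them in-sample.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 4000
X = rng.normal(size=(n, 3))
w = rng.integers(0, 2, n)                 # randomized treatment
tau_true = 0.5 * X[:, 0]                  # heterogeneous treatment effect
y = X[:, 1] + tau_true * w + rng.normal(scale=0.5, size=n)

# Stage 1: nuisance models, m(x) = E[Y|X] and e(x) = E[W|X]
m_hat = GradientBoostingRegressor(random_state=0).fit(X, y).predict(X)
e_hat = np.full(n, w.mean())              # randomized trial: constant propensity

# Stage 2: the R-loss reduces to a weighted regression of the
# pseudo-outcome (Y - m) / (W - e) on X with weights (W - e)^2
resid_y, resid_w = y - m_hat, w - e_hat
tau_model = GradientBoostingRegressor(random_state=0).fit(
    X, resid_y / resid_w, sample_weight=resid_w**2
)
tau_hat = tau_model.predict(X)
print(f"corr(tau_hat, tau_true) = {np.corrcoef(tau_hat, tau_true)[0, 1]:.2f}")
```

The residualization is what makes the R-Learner robust to confounding in the nuisance models: only the part of the outcome unexplained by X is regressed on the part of the treatment unexplained by X.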
2.1 Interaction Constraints
Perpetual allows you to enforce Interaction Constraints. This is useful when you know that certain features should only interact with each other, or should not interact at all.
[ ]:
# Constrain features 0 and 1 to interact only with each other
interaction_constraints = [[0, 1]]

ub_constrained = UpliftBooster(
    outcome_budget=budget,
    propensity_budget=budget,
    effect_budget=budget,
    interaction_constraints=interaction_constraints,
)
ub_constrained.fit(X_train, w_train, y_train)
uplift_constrained = ub_constrained.predict(X_test)
print(f"Average Uplift (Constrained): {uplift_constrained.mean():.4f}")
3. Comparing with Meta-Learners
Meta-learners are algorithms that decompose the causal problem into one or more supervised learning problems.
[ ]:
# S-Learner: Single model with treatment as feature
sl = SLearner(budget=budget)
sl.fit(X_train, w_train, y_train)
uplift_s = sl.predict(X_test)
# T-Learner: Two models (one per treatment group)
tl = TLearner(budget=budget)
tl.fit(X_train, w_train, y_train)
uplift_t = tl.predict(X_test)
# X-Learner: Two-stage learner with imputation
xl = XLearner(budget=budget)
xl.fit(X_train, w_train, y_train)
uplift_x = xl.predict(X_test)
# DR-Learner: Doubly Robust / AIPW
dr = DRLearner(budget=budget, clip=0.01)
dr.fit(X_train, w_train, y_train)
uplift_dr = dr.predict(X_test)
print(f"Avg Uplift S: {uplift_s.mean():.4f}")
print(f"Avg Uplift T: {uplift_t.mean():.4f}")
print(f"Avg Uplift X: {uplift_x.mean():.4f}")
print(f"Avg Uplift DR: {uplift_dr.mean():.4f}")
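The decomposition is easiest to see for the T-Learner. Written out by hand with scikit-learn on synthetic data (illustrative only, not Perpetual's code), it is just one classifier per treatment arm, with the uplift as the difference in predicted conversion probabilities:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 2))
w = rng.integers(0, 2, n)
# Treatment lifts conversion by 0.15, but only where feature 0 is positive
p = 0.10 + 0.15 * w * (X[:, 0] > 0)
y = (rng.random(n) < p).astype(int)

model_t = GradientBoostingClassifier(random_state=0).fit(X[w == 1], y[w == 1])
model_c = GradientBoostingClassifier(random_state=0).fit(X[w == 0], y[w == 0])
uplift = model_t.predict_proba(X)[:, 1] - model_c.predict_proba(X)[:, 1]

print(f"mean predicted uplift, x0 > 0:  {uplift[X[:, 0] > 0].mean():+.3f}")
print(f"mean predicted uplift, x0 <= 0: {uplift[X[:, 0] <= 0].mean():+.3f}")
```

The other learners refine this template: the S-Learner shares one model across arms, while the X- and DR-Learners add imputation and propensity-weighting stages to reduce bias when the arms are imbalanced.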
4. Evaluation: Uplift Curve
Since we don’t know the “ground truth” individual effect, we use the Cumulative Gain (Uplift) curve to evaluate performance.
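For reference, here is one common way such a curve is constructed, written out with NumPy on synthetic data with a known responsive segment. Exact conventions vary between libraries, so treat this as a sketch of the idea rather than a re-implementation of Perpetual's cumulative_gain_curve:

```python
import numpy as np

def gain_curve(y, w, scores):
    """Cumulative uplift gain when the population is sorted by predicted uplift."""
    order = np.argsort(-np.asarray(scores))
    y, w = np.asarray(y)[order], np.asarray(w)[order]
    n_t, n_c = np.cumsum(w), np.cumsum(1 - w)        # treated / control seen so far
    y_t, y_c = np.cumsum(y * w), np.cumsum(y * (1 - w))
    with np.errstate(divide="ignore", invalid="ignore"):
        lift = np.nan_to_num(y_t / n_t - y_c / n_c)  # rate difference so far
    n = len(y)
    return np.arange(1, n + 1) / n, lift * (n_t + n_c)

# Toy check: an oracle score on a population with a known responsive half
rng = np.random.default_rng(0)
n = 10_000
w = rng.integers(0, 2, n)
responsive = np.arange(n) < n // 2   # convert when treated; 5% baseline for all
y = ((responsive & (w == 1)) | (rng.random(n) < 0.05)).astype(int)
fracs, gains = gain_curve(y, w, responsive.astype(float))
print(f"gain after targeting the top 50%: {gains[n // 2 - 1]:.0f} extra conversions")
```

With an informative score the curve rises steeply and then flattens; a useless score tracks the diagonal toward the same endpoint, which is the overall average effect times the population size.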
[ ]:
# --- Uplift Gain Curves ---
plt.figure(figsize=(10, 6))
for label, scores in [
    ("R-Learner", uplift_r),
    ("X-Learner", uplift_x),
    ("DR-Learner", uplift_dr),
]:
    fracs, gains = cumulative_gain_curve(y_test, w_test, scores)
    plt.plot(fracs, gains, label=label)
plt.plot([0, 1], [0, 0], "k--", label="Random")
plt.title("Cumulative Uplift Gain — Criteo Dataset (Subsampled)")
plt.xlabel("Population % Sorted by Predicted Uplift")
plt.ylabel("Cumulative Gain")
plt.legend()
plt.show()
# --- AUUC & Qini ---
for label, scores in [
    ("R-Learner", uplift_r),
    ("S-Learner", uplift_s),
    ("T-Learner", uplift_t),
    ("X-Learner", uplift_x),
    ("DR-Learner", uplift_dr),
]:
    a = auuc(y_test, w_test, scores, normalize=True)
    q = qini_coefficient(y_test, w_test, scores)
    print(f"{label:12s} AUUC={a:+.4f}  Qini={q:+.4f}")
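The Qini curve is a close relative of the gain curve: at each cutoff it takes treated conversions minus control conversions rescaled to the treated count, Y_t(k) - Y_c(k) * N_t(k) / N_c(k). A hand-rolled sketch on synthetic data (conventions and normalization vary; this is not Perpetual's qini_coefficient):

```python
import numpy as np

def qini_curve(y, w, scores):
    """Qini curve: treated conversions minus rescaled control conversions."""
    order = np.argsort(-np.asarray(scores))
    y, w = np.asarray(y)[order], np.asarray(w)[order]
    n_t, n_c = np.cumsum(w), np.cumsum(1 - w)
    y_t, y_c = np.cumsum(y * w), np.cumsum(y * (1 - w))
    with np.errstate(divide="ignore", invalid="ignore"):
        return y_t - np.nan_to_num(y_c * n_t / n_c)

rng = np.random.default_rng(0)
n = 10_000
w = rng.integers(0, 2, n)
responsive = np.arange(n) < n // 2   # convert when treated; 5% baseline for all
y = ((responsive & (w == 1)) | (rng.random(n) < 0.05)).astype(int)

q_oracle = qini_curve(y, w, responsive.astype(float))  # perfect targeting
q_random = qini_curve(y, w, rng.random(n))             # uninformative scores
# The area under the Qini curve is larger for informative scores
print(f"oracle area / random area: {q_oracle.sum() / q_random.sum():.1f}")
```

A Qini coefficient is typically this area minus the area under the random-targeting diagonal, normalized; positive values mean the model ranks persuadable individuals ahead of the rest.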