Risk, Compliance, and Interpretability

In highly regulated industries like FinTech, models must not only be accurate but also explainable and compliant with legal requirements.

This tutorial demonstrates two Perpetual features that support these requirements:

Adverse Action Codes using PerpetualRiskEngine.
Monotonicity Constraints for fair and logical behavior.

[ ]:

import numpy as np
from perpetual import PerpetualBooster, PerpetualRiskEngine
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split

1. Load the German Credit Dataset

The German Credit dataset classifies people described by a set of attributes as good or bad credit risks.

[ ]:

print("Fetching German Credit dataset...")
data = fetch_openml(data_id=31, as_frame=True, parser="auto")
df = data.frame

# Target: class ("good" / "bad"). We map to 0 = good, 1 = bad, so the model predicts probability of default
y = (df["class"] == "bad").astype(int)
X = df.drop(columns=["class"])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
X_train.head()
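The split above is purely random. When the share of bad credit risks matters for later thresholds, one option is to stratify on the target so both splits keep the same default rate. The following is a minimal sketch on synthetic labels; `X_toy` and `y_toy` are hypothetical stand-ins, not the German Credit data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic labels with a 30% positive rate (hypothetical stand-in for y)
y_toy = np.array([1] * 300 + [0] * 700)
X_toy = np.arange(len(y_toy)).reshape(-1, 1)

# stratify=y_toy keeps the class proportions equal in both splits
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=42, stratify=y_toy
)

print(f"train default rate: {y_tr.mean():.3f}")
print(f"test default rate:  {y_te.mean():.3f}")
```

In the tutorial's own split, the equivalent change would be passing `stratify=y` to `train_test_split`.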

2. Monotonicity Constraints

A monotonicity constraint forces the model’s output to move in only one direction as a feature increases, with all other features held fixed. For example, a longer loan duration should never lower the predicted risk of default.

[ ]:

# 1: Increasing, -1: Decreasing
constraints = {
    "duration": 1,  # Longer duration -> higher risk
    "credit_amount": 1,  # Higher amount -> higher risk
    "age": -1,  # Older age -> lower risk (generally)
}

model = PerpetualBooster(
    objective="LogLoss", budget=0.5, monotone_constraints=constraints
)
model.fit(X_train, y_train)

print("Model fitted with monotonicity constraints.")
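One way to sanity-check that a constraint holds is to sweep a single numeric feature across its range while keeping every other column fixed, then confirm that the scores never move in the wrong direction. The sketch below is model-agnostic and uses toy scoring functions; `is_monotone`, `rising`, and `bumpy` are hypothetical names introduced for illustration, not part of Perpetual.

```python
import numpy as np
import pandas as pd


def is_monotone(predict, X, feature, direction, n_grid=20, tol=1e-9):
    """Sweep `feature` over a grid (other columns fixed) and check the trend.

    predict   : callable mapping a DataFrame to a 1D array of risk scores
    direction : 1 for non-decreasing, -1 for non-increasing
    """
    grid = np.linspace(X[feature].min(), X[feature].max(), n_grid)
    # One column of predictions per grid value: shape (n_rows, n_grid)
    curves = np.column_stack(
        [np.asarray(predict(X.assign(**{feature: v}))) for v in grid]
    )
    # After multiplying by direction, every step should be >= 0
    steps = np.diff(curves, axis=1) * direction
    return bool((steps >= -tol).all())


# Toy check with hypothetical scoring functions (not a fitted booster)
toy = pd.DataFrame({"duration": [6, 12, 24, 48], "age": [22, 35, 50, 64]})
rising = lambda df: 0.01 * df["duration"] - 0.001 * df["age"]
bumpy = lambda df: np.sin(df["duration"] / 5.0)

print(is_monotone(rising, toy, "duration", 1))
print(is_monotone(rising, toy, "age", -1))
print(is_monotone(bumpy, toy, "duration", 1))
```

With the fitted model, `predict` could be something like `lambda df: model.predict_proba(df)[:, 1]`, assuming `predict_proba` returns a two-column array. The sweep only makes sense for numeric features such as `duration`, `credit_amount`, and `age`.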

3. PerpetualRiskEngine (Reason Codes)

When a loan is rejected, we need to provide the “Top Reasons” for rejection. The PerpetualRiskEngine uses the internal tree paths of the booster to find which features contributed most to the high risk score of a specific applicant.

[ ]:

engine = PerpetualRiskEngine(model)

# Calculate probabilities for the test set
probs = np.asarray(model.predict_proba(X_test))
# Probability of default: column 1 if probs is 2D (n_samples, 2), else the 1D array itself
p_default = probs[:, 1] if probs.ndim == 2 else probs

# Find the first high-risk applicant (rejected); fall back to the riskiest one if none exceeds 0.5
high_risk = np.flatnonzero(p_default > 0.5)
high_risk_idx = int(high_risk[0]) if high_risk.size else int(np.argmax(p_default))
applicant = X_test.iloc[[high_risk_idx]]

print(f"Analyzing applicant at test-set position {high_risk_idx}...")
prob_default = float(p_default[high_risk_idx])
print(f"Probability of Default: {prob_default:.4f}")

# Generate reason codes
# threshold: the baseline probability to compare against (e.g., average default rate or approval threshold)
reasons = engine.generate_reason_codes(
    applicant, threshold=0.2, rejection_direction="higher"
)

print("\nAdverse Action Reasons:")
for reason_list in reasons:  # one list of reasons per applicant row
    for r in reason_list:
        print(f"- {r}")
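To build intuition for what a reason code measures, the sketch below ranks features with a simple perturbation approach: reset one feature at a time to a reference value and record how far the risk score falls. This is a swapped-in illustration, not the tree-path method that PerpetualRiskEngine uses. The function `perturbation_reasons`, the toy scoring function, and all numeric values are hypothetical.

```python
import numpy as np
import pandas as pd


def perturbation_reasons(predict, applicant, baseline, top_k=3):
    """Rank features by how much resetting each to a baseline lowers the risk score.

    predict   : callable mapping a DataFrame to a 1D array of risk scores
    applicant : single-row DataFrame
    baseline  : dict of feature -> reference value (e.g. a training median)
    """
    base_score = float(np.asarray(predict(applicant))[0])
    drops = {}
    for feature, ref_value in baseline.items():
        probe = applicant.copy()
        probe[feature] = ref_value  # change only this one feature
        drops[feature] = base_score - float(np.asarray(predict(probe))[0])
    ranked = sorted(drops.items(), key=lambda kv: kv[1], reverse=True)
    # Keep only features whose reset actually reduces the risk score
    return [(f, d) for f, d in ranked if d > 0][:top_k]


# Toy linear scoring function and values, all hypothetical
def score(df):
    return (
        0.01 * df["duration"] + 0.00002 * df["credit_amount"] - 0.002 * df["age"]
    ).to_numpy()


applicant_toy = pd.DataFrame({"duration": [48], "credit_amount": [12000], "age": [23]})
baseline_toy = {"duration": 18, "credit_amount": 2300, "age": 33}

result = perturbation_reasons(score, applicant_toy, baseline_toy)
for feature, drop in result:
    print(f"{feature}: risk score falls by {drop:.3f} at the baseline value")
```

Perturbation rankings depend heavily on the chosen baseline and ignore feature interactions, which is one reason a method tied to the model's own structure can be preferable for adverse action notices.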