{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Risk, Compliance, and Interpretability\n", "\n", "In highly regulated industries such as FinTech, models must be not only accurate but also explainable and compliant with legal requirements.\n", "\n", "This tutorial demonstrates Perpetual's features for these use cases:\n", "* **Adverse Action Codes** using `PerpetualRiskEngine`.\n", "* **Monotonicity Constraints** for fair and logical behavior.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from perpetual import PerpetualBooster, PerpetualRiskEngine\n", "from sklearn.datasets import fetch_openml\n", "from sklearn.model_selection import train_test_split" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Load the German Credit Dataset\n", "\n", "The German Credit dataset (OpenML ID 31) classifies people described by a set of attributes as good or bad credit risks." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"Fetching German Credit dataset...\")\n", "data = fetch_openml(data_id=31, as_frame=True, parser=\"auto\")\n", "df = data.frame\n", "\n", "# Target: the 'class' column is labeled 'good' or 'bad'.\n", "# Map good -> 0 and bad -> 1 so the model predicts the probability of default.\n", "y = (df[\"class\"] == \"bad\").astype(int)\n", "X = df.drop(columns=[\"class\"])\n", "\n", "X_train, X_test, y_train, y_test = train_test_split(\n", "    X, y, test_size=0.2, random_state=42\n", ")\n", "X_train.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Monotonicity Constraints\n", "\n", "A monotonicity constraint forces the model's output to move in only one direction as a feature increases, with all other features held fixed. For example, a longer loan duration should generally increase the predicted risk of default." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# 1: prediction may only increase with the feature; -1: may only decrease\n", "constraints = {\n", "    \"duration\": 1,  # Longer duration -> higher risk\n", "    \"credit_amount\": 1,  # Higher amount -> higher risk\n", "    \"age\": -1,  # Older age -> lower risk (generally)\n", "}\n", "\n", "model = PerpetualBooster(\n", "    objective=\"LogLoss\", budget=0.5, monotone_constraints=constraints\n", ")\n", "model.fit(X_train, y_train)\n", "\n", "print(\"Model fitted with monotonicity constraints.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. PerpetualRiskEngine (Reason Codes)\n", "\n", "When a loan application is rejected, we need to provide the applicant with the \"Top Reasons\" for the rejection. The `PerpetualRiskEngine` uses the booster's internal tree paths to find which features contributed most to a specific applicant's high risk score." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "engine = PerpetualRiskEngine(model)\n", "\n", "# Predicted probability of default for each applicant in the test set\n", "probs = model.predict_proba(X_test)\n", "\n", "# Pick the first applicant whose predicted risk exceeds 0.6 (treated as rejected)\n", "high_risk_idx = np.where(probs > 0.6)[0][0]\n", "applicant = X_test.iloc[[high_risk_idx]]\n", "\n", "print(f\"Analyzing applicant at index {high_risk_idx}...\")\n", "print(f\"Probability of Default: {probs[high_risk_idx]:.4f}\")\n", "\n", "# Generate reason codes\n", "# threshold: the baseline probability to compare against (e.g., average default rate or approval threshold)\n", "reasons = engine.generate_reason_codes(applicant, threshold=0.2)\n", "\n", "print(\"\\nAdverse Action Reasons:\")\n", "# One list of reasons is returned per applicant row\n", "for reason_list in reasons:\n", "    for r in reason_list:\n", "        print(f\"- {r}\")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": 
"text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.9" } }, "nbformat": 4, "nbformat_minor": 2 }