Calibration and Uncertainty Quantification
==========================================

PerpetualBooster provides native, high-performance calibration methods for both regression (Prediction Intervals) and classification (Probability Calibration).

The Fundamental Advantage: Post-Hoc Calibration
-----------------------------------------------

Traditional gradient boosting frameworks often require expensive modifications to the training process to produce well-calibrated outputs. For example:

- **Quantile Regression**: Requires training multiple models (one for each quantile).
- **CV-based Calibration**: Requires K-fold or nested cross-validation, increasing training time by a factor of K.
- **Conformal Prediction Wrappers**: Often require splitting the data and wrapping external models, which adds complexity.

**PerpetualBooster** changes this paradigm by offering **post-hoc calibration**. You train your model *once* with standard settings (ensuring ``save_node_stats=True`` is set). You can then apply various calibration methods to the already-trained model using a small calibration set. This step is near-instantaneous, does not modify the underlying ensemble, and lets you calibrate for **tens or even hundreds of alpha levels** in a single pass without any retraining overhead.

Probability Calibration (Classification)
----------------------------------------

In classification, calibration ensures that the output probabilities reflect true frequencies. Among the transactions to which a well-calibrated model assigns a 90% probability of being "fraudulent", about 90% should actually be fraudulent. Perpetual implements the **Pool Adjacent Violators Algorithm (PAVA)** for **Isotonic Regression** natively in Rust.

Available Methods
~~~~~~~~~~~~~~~~~

Perpetual allows you to drive the Isotonic calibration using different internal uncertainty scores:

1. **Conformal (Default)**: Uses raw probabilities to fit the Isotonic curve. This is the standard approach to probability calibration.
2. **WeightVariance / GRP / MinMax**: These methods use method-specific uncertainty scores (calculated from node statistics) to drive the Isotonic calibration. By weighting probabilities by the model's confidence in specific regions of the feature space, Perpetual can achieve even lower **Expected Calibration Error (ECE)**.

Example
~~~~~~~

.. code-block:: python

    from perpetual import PerpetualBooster

    # 1. Train once (X_train, y_train: your training features and labels)
    model = PerpetualBooster(objective="LogLoss", save_node_stats=True)
    model.fit(X_train, y_train)

    # 2. Calibrate post-hoc on a held-out calibration set
    #    (accepts a single alpha or a list of levels)
    model.calibrate(X_cal, y_cal, alpha=[0.01, 0.05, 0.1])

    # 3. Predict well-calibrated probabilities
    probs = model.predict_proba(X_test, calibrated=True)

    # 4. Get prediction sets (Conformal Sets) for each alpha level
    # Returns a dict keyed by alpha level, e.g. {"0.05": [labels], "0.1": [labels]}
    sets = model.predict_sets(X_test)

Uncertainty Quantification (Regression)
---------------------------------------

For regression, Perpetual provides **Prediction Intervals**. Instead of a point estimate, you receive a range ``[lower, upper]`` that is calibrated to contain the true value with a specified probability (e.g., 90%).

Native Calibration Methods
~~~~~~~~~~~~~~~~~~~~~~~~~~

* **Conformal**: Implements a method similar to Split Conformal Prediction or Conformalized Quantile Regression (CQR). It provides conservative coverage on unseen data distributed similarly to the calibration set. Unlike the other methods, Conformal works even if ``save_node_stats=False``.
* **MinMax**: A proprietary method that uses the range of target values observed in the leaves of the ensemble to drive local uncertainty. (Requires ``save_node_stats=True``).
* **WeightVariance**: Scales intervals based on the standard deviation of fold weights within trees. It is particularly effective for heteroscedastic data, where uncertainty varies across the feature space. (Requires ``save_node_stats=True``).
* **GRP (Generalized Residual Percentiles)**: Uses log-odds percentiles and statistical spreads within trees to generate narrow, efficient intervals that still respect the requested coverage. (Requires ``save_node_stats=True``).

Example
~~~~~~~

.. code-block:: python

    # `model` is a regression booster trained with save_node_stats=True.
    # Define desired coverage (alpha=0.1 means 90% coverage)
    model.calibrate(X_cal, y_cal, alpha=[0.1, 0.2], method="GRP")

    # Get lower/upper bounds for all alpha levels
    intervals = model.predict_intervals(X_test)

Raw Prediction Distributions
----------------------------

If you need the underlying distribution of predictions rather than calibrated intervals, you can use ``predict_distribution()``. This returns the raw, uncalibrated simulation values generated by randomly sampling the 5 internal leaf weights of the ensemble (requires ``save_node_stats=True``).

.. code-block:: python

    # Returns an array of shape (n_samples, 100)
    dist = model.predict_distribution(X_test, n=100)

Why Perpetual is Better
-----------------------

1. **Superior ECE**: In benchmarks against LightGBM and scikit-learn, Perpetual consistently delivers lower Expected Calibration Error, which makes it well suited to risk assessment and financial modeling.
2. **Narrower Intervals**: Perpetual's internal methods (GRP, MinMax) often produce significantly narrower prediction intervals than standard conformal wrappers while maintaining the requested coverage.
3. **Rust Efficiency**: Calibration runs in native Rust code, so thousands of calibration points can be processed in milliseconds.
4. **API Simplicity**: A single ``calibrate()`` method handles everything. The booster automatically detects the task (classification vs. regression) and chooses the most appropriate internal engine. For Conformal calibration, you can also use the explicit ``calibrate_conformal()`` method if you need to provide the training set again.
5. **Multi-level Support**: You can pass a list of alpha levels to ``calibrate()`` and receive all corresponding intervals or sets in a single prediction call. Because the internal methods (GRP, MinMax, WeightVariance) are efficient and do not require retraining, passing tens or even hundreds of alpha levels adds virtually no computational cost.

Tutorials
---------

For deep-dives and performance comparisons:

* :doc:`tutorials/calibration/regression_calibration`: In-depth comparison of GRP vs. Conformal vs. MAPIE.
* :doc:`tutorials/calibration/classification_calibration`: Detailed ECE analysis and Reliability Diagrams comparing Perpetual, scikit-learn, and LightGBM.

Next Steps
----------

Always ensure that your calibration set is independent of your training set to avoid "over-confidence" in your calibration metrics. A 75/25 split of your non-test data is usually a good starting point.
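
As a rough illustration of the 75/25 advice, the sketch below splits row indices of the non-test data into disjoint training and calibration parts. The helper name ``split_train_calibration`` and its parameters are hypothetical and are not part of Perpetual's API; the code uses only the Python standard library and assumes the test set has already been set aside.

```python
import random


def split_train_calibration(n_samples, calibration_fraction=0.25, seed=0):
    """Shuffle row indices and split them into disjoint train / calibration lists.

    Hypothetical helper for illustration only (not part of Perpetual).
    """
    if not 0.0 < calibration_fraction < 1.0:
        raise ValueError("calibration_fraction must be strictly between 0 and 1")

    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible

    # Reserve the requested share of rows (at least one) for calibration.
    n_cal = max(1, round(n_samples * calibration_fraction))
    if n_cal >= n_samples:
        raise ValueError("not enough samples to leave any rows for training")

    cal_idx = sorted(indices[:n_cal])
    train_idx = sorted(indices[n_cal:])
    return train_idx, cal_idx


# 1000 non-test rows -> 750 for training, 250 for calibration
train_idx, cal_idx = split_train_calibration(n_samples=1000)
print(len(train_idx), len(cal_idx))
```

The two index lists can then select the rows passed to ``model.fit`` (training rows) and ``model.calibrate`` (calibration rows), for example ``X[train_idx]`` and ``X[cal_idx]`` when ``X`` is a NumPy array. Because the lists never overlap, the calibration data stays independent of the data the ensemble was trained on.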