API Reference

This page contains the detailed API reference for the Perpetual Python package.

PerpetualBooster

class perpetual.PerpetualBooster(*, objective: str | Tuple[LambdaType, LambdaType, LambdaType] = 'LogLoss', budget: float = 0.5, num_threads: int | None = None, monotone_constraints: Dict[Any, int] | None = None, force_children_to_bound_parent: bool = False, missing: float = nan, allow_missing_splits: bool = True, create_missing_branch: bool = False, terminate_missing_features: Iterable[Any] | None = None, missing_node_treatment: str = 'None', log_iterations: int = 0, feature_importance_method: str = 'Gain', quantile: float | None = None, reset: bool | None = None, categorical_features: Iterable[int] | Iterable[str] | str | None = 'auto', timeout: float | None = None, iteration_limit: int | None = None, memory_limit: float | None = None, stopping_rounds: int | None = None, max_bin: int = 256, max_cat: int = 1000, interaction_constraints: List[List[int]] | None = None, save_node_stats: bool = False)[source]

Bases: object

Self-generalizing gradient boosted decision tree model.

PerpetualBooster automatically determines the best number of boosting rounds using a built-in generalization strategy, eliminating the need for manual early-stopping or cross-validation. It supports regression, binary and multi-class classification, and quantile objectives, as well as custom loss functions.

See also

perpetual.PerpetualBooster.__init__

Constructor with full parameter list.

metadata_attributes: Dict[str, BaseSerializer] = {'cat_mapping': <perpetual.serialize.ObjectSerializer object>, 'classes_': <perpetual.serialize.ObjectSerializer object>, 'feature_importance_method': <perpetual.serialize.ObjectSerializer object>, 'feature_names_in_': <perpetual.serialize.ObjectSerializer object>, 'n_features_': <perpetual.serialize.ObjectSerializer object>}

Metadata attributes that are persisted alongside the model and restored when a saved booster is loaded.

__init__(*, objective: str | Tuple[LambdaType, LambdaType, LambdaType] = 'LogLoss', budget: float = 0.5, num_threads: int | None = None, monotone_constraints: Dict[Any, int] | None = None, force_children_to_bound_parent: bool = False, missing: float = nan, allow_missing_splits: bool = True, create_missing_branch: bool = False, terminate_missing_features: Iterable[Any] | None = None, missing_node_treatment: str = 'None', log_iterations: int = 0, feature_importance_method: str = 'Gain', quantile: float | None = None, reset: bool | None = None, categorical_features: Iterable[int] | Iterable[str] | str | None = 'auto', timeout: float | None = None, iteration_limit: int | None = None, memory_limit: float | None = None, stopping_rounds: int | None = None, max_bin: int = 256, max_cat: int = 1000, interaction_constraints: List[List[int]] | None = None, save_node_stats: bool = False)[source]

Gradient Boosting Machine with Perpetual Learning.

A self-generalizing gradient boosting machine that doesn’t need hyperparameter optimization. It automatically finds the best configuration based on the provided budget. The budget acts as a complexity control: a higher budget allows for more trees and potentially better fit, while a lower budget ensures faster training and better generalization for simpler datasets.

Parameters:
  • objective (str or tuple, default="LogLoss") –

    Learning objective function to be used for optimization. Valid options are:

    • “LogLoss”: logistic loss for binary classification.

    • “BrierLoss”: Brier score loss for probabilistic binary classification.

    • “HingeLoss”: hinge loss for binary classification.

    • “SquaredLoss”: squared error for regression.

    • “QuantileLoss”: quantile error for quantile regression.

    • “HuberLoss”: Huber loss for robust regression.

    • “AdaptiveHuberLoss”: adaptive Huber loss for robust regression.

    • “FairLoss”: Fair loss for robust regression.

    • “AbsoluteLoss”: absolute (L1) error for regression.

    • “SquaredLogLoss”: squared log error for regression.

    • “MapeLoss”: mean absolute percentage error for regression.

    • “PoissonLoss”: Poisson regression for count data.

    • “GammaLoss”: Gamma regression with log-link.

    • “TweedieLoss”: Tweedie regression with log-link.

    • “CrossEntropyLoss”: cross-entropy loss for targets in [0, 1].

    • “CrossEntropyLambdaLoss”: alternative weighted cross-entropy.

    • “ListNetLoss”: ListNet loss for ranking.

    • custom objective: a tuple of (loss, gradient, initial_value) functions with the following signatures:

      • loss(y, pred, weight, group): returns the loss value for each sample.

      • gradient(y, pred, weight, group): returns a tuple of (gradient, hessian). If the hessian is constant (e.g., 1.0 for SquaredLoss), return None in its place to improve performance.

      • initial_value(y, weight, group): returns the initial prediction value for the booster.

  • budget (float, default=0.5) – A positive number for fitting budget. Increasing this number will more likely result in more boosting rounds and increased predictive power.

  • num_threads (int, optional) – Number of threads to be used during training and prediction.

  • monotone_constraints (dict, optional) – Keys are feature indices or names, values are -1, 1, or 0.

  • force_children_to_bound_parent (bool, default=False) – Whether to restrict children nodes to be within the parent’s range.

  • save_node_stats (bool, default=False) – Whether to save node statistics (required for calibration).

  • missing (float, default=np.nan) – Value to consider as missing data.

  • allow_missing_splits (bool, default=True) – Whether to allow splits that separate missing from non-missing values.

  • create_missing_branch (bool, default=False) – Whether to create a separate branch for missing values (ternary trees).

  • terminate_missing_features (iterable, optional) – Features for which missing branches will always be terminated if create_missing_branch is True.

  • missing_node_treatment (str, default="None") – How to handle weights for missing nodes if create_missing_branch is True. Options: “None”, “AssignToParent”, “AverageLeafWeight”, “AverageNodeWeight”.

  • log_iterations (int, default=0) – Logging frequency (every N iterations). 0 disables logging.

  • feature_importance_method (str, default="Gain") – Method for calculating feature importance. Options: “Gain”, “Weight”, “Cover”, “TotalGain”, “TotalCover”.

  • quantile (float, optional) – Target quantile for quantile regression (objective=“QuantileLoss”).

  • reset (bool, optional) – Whether to reset the model or continue training on subsequent calls to fit.

  • categorical_features (str or iterable, default="auto") – Feature indices or names to treat as categorical.

  • timeout (float, optional) – Time limit for fitting in seconds.

  • iteration_limit (int, optional) – Maximum number of boosting iterations.

  • memory_limit (float, optional) – Memory limit for training in GB.

  • stopping_rounds (int, optional) – Early stopping rounds.

  • max_bin (int, default=256) – Maximum number of bins for feature discretization.

  • max_cat (int, default=1000) – Maximum unique categories before a feature is treated as numerical.

  • interaction_constraints (list of list of int, optional) – Interaction constraints.

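
The custom-objective contract above can be sketched with plain functions. This is a minimal squared-error objective (only NumPy is assumed; the data is illustrative):

```python
import numpy as np

def loss(y, pred, weight, group):
    # Per-sample squared error (the 0.5 factor makes the gradient pred - y).
    return 0.5 * (y - pred) ** 2

def gradient(y, pred, weight, group):
    # Gradient of the loss w.r.t. pred. The hessian is the constant 1.0,
    # so None is returned in its place to improve performance.
    return pred - y, None

def initial_value(y, weight, group):
    # Best constant prediction under squared error: the mean of y.
    return float(np.mean(y))

y = np.array([1.0, 2.0, 3.0])
pred = np.zeros(3)
grad, hess = gradient(y, pred, None, None)
```

The tuple would then be passed to the constructor as objective=(loss, gradient, initial_value).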

fit(X, y, sample_weight=None, group=None) Self[source]

Fit the gradient booster on a provided dataset.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Training data. Can be a Polars or Pandas DataFrame, or a 2D Numpy array. Polars DataFrames use a zero-copy columnar path for efficiency.

  • y (array-like of shape (n_samples,) or (n_samples, n_targets)) – Target values.

  • sample_weight (array-like of shape (n_samples,), optional) – Individual weights for each sample. If None, all samples are weighted equally.

  • group (array-like, optional) – Group labels for ranking objectives.

Returns:

self – Returns self.

Return type:

object

prune(X, y, sample_weight=None, group=None) Self[source]

Prune the gradient booster on a provided dataset.

This removes nodes that do not contribute to a reduction in loss on the provided validation set.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Validation data.

  • y (array-like of shape (n_samples,)) – Validation targets.

  • sample_weight (array-like of shape (n_samples,), optional) – Weights for validation samples.

  • group (array-like, optional) – Group labels for ranking objectives.

Returns:

self – Returns self.

Return type:

object

calibrate(X_cal: Any, y_cal: Any, alpha: float | Iterable[float], method: str | None = None) Self[source]

Calibrate the gradient booster for prediction intervals using a selected method.

Parameters:
  • X_cal (array-like) – Independent calibration dataset.

  • y_cal (array-like) – Targets for calibration data.

  • alpha (float or array-like) – Significance level(s) for the intervals (1 - coverage).

  • method (str, optional) – Calibration method to use. Options are “MinMax”, “GRP”, “WeightVariance”. If None, defaults to “WeightVariance”.

Returns:

self – Returns self.

Return type:

object

calibrate_conformal(X: Any, y: Any, X_cal: Any, y_cal: Any, alpha: float | Iterable[float], sample_weight: Any | None = None, group: Any | None = None) Self[source]

Calibrate the gradient booster using Conformalized Quantile Regression (CQR).

Parameters:
  • X (array-like) – Independent training dataset.

  • y (array-like) – Targets for training data.

  • X_cal (array-like) – Independent calibration dataset.

  • y_cal (array-like) – Targets for calibration data.

  • alpha (float or array-like) – Significance level(s) for the intervals (1 - coverage).

  • sample_weight (array-like, optional) – Weights for training data.

  • group (array-like, optional) – Group IDs for training data.

Returns:

self – Returns self.

Return type:

object

predict_intervals(X, parallel: bool | None = None) dict[source]

Predict intervals with the fitted booster on new data.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – New data for prediction.

  • parallel (bool, optional) – Whether to run prediction in parallel. If None, uses class default.

Returns:

intervals – A dictionary containing lower and upper bounds for the specified alpha levels.

Return type:

dict

predict_sets(X, parallel: bool | None = None) dict[source]

Predict sets with the fitted booster on new data.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – New data for prediction.

  • parallel (bool, optional) – Whether to run prediction in parallel. If None, uses class default.

Returns:

sets – A dictionary containing prediction sets for the specified alpha levels. Each set is a list of labels (e.g., [1.0], [0.0], or [0.0, 1.0]).

Return type:

dict

predict_distribution(X, n: int = 100, parallel: bool | None = None) ndarray[source]

Predict a distribution using uncalibrated leaf weights from internal nodes.

Generates n predictions for each sample by randomly sampling one of the 5 weights stored in each leaf node. This returns a raw, uncalibrated distribution of predictions.

Note: This method is only available if the booster was fitted with save_node_stats=True.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – New data for prediction.

  • n (int, default=100) – Number of simulations/predictions to generate for each sample.

  • parallel (bool, optional) – Whether to run prediction in parallel. If None, uses class default.

Returns:

distribution – A 2D array where each row contains n predictions for the corresponding sample.

Return type:

ndarray of shape (n_samples, n)

predict(X, parallel: bool | None = None) ndarray[source]

Predict with the fitted booster on new data.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Input features.

  • parallel (bool, optional) – Whether to run prediction in parallel.

Returns:

predictions – The predicted values (log-odds for classification, raw values for regression).

Return type:

ndarray of shape (n_samples,)

predict_proba(X, parallel: bool | None = None, calibrated: bool = False) ndarray[source]

Predict class probabilities with the fitted booster on new data.

Only valid for classification tasks.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Input features.

  • parallel (bool, optional) – Whether to run prediction in parallel.

  • calibrated (bool, default=False) – Whether to return calibrated probabilities (requires calibration).

Returns:

probabilities – The class probabilities.

Return type:

ndarray of shape (n_samples, n_classes)

calculate_drift(X, drift_type: str = 'data', parallel: bool | None = None) float[source]

Calculate drift metrics (data or concept) for the model.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – New data to evaluate for drift.

  • drift_type (str, default="data") –

    Type of drift to calculate. Options:

    • “data”: Multivariate data drift across all tree nodes.

    • “concept”: Concept drift focusing on nodes that are parents of leaves.

  • parallel (bool, optional) – Whether to run prediction in parallel. If None, uses class default.

Returns:

drift_score – The calculated drift score (average Chi-squared statistic).

Return type:

float

predict_log_proba(X, parallel: bool | None = None) ndarray[source]

Predict class log-probabilities with the fitted booster on new data.

Only valid for classification tasks.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Input features.

  • parallel (bool, optional) – Whether to run prediction in parallel.

Returns:

log_probabilities – The log-probabilities of each class.

Return type:

ndarray of shape (n_samples, n_classes)

predict_nodes(X, parallel: bool | None = None) List[source]

Predict leaf node indices with the fitted booster on new data.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Input features.

  • parallel (bool, optional) – Whether to run prediction in parallel.

Returns:

node_indices – A list where each element corresponds to a tree and contains node indices for each sample.

Return type:

list of ndarray

property feature_importances_: ndarray

Feature importance scores of the fitted model, calculated with the method specified by feature_importance_method.

predict_contributions(X, method: str = 'Average', parallel: bool | None = None) ndarray[source]

Predict feature contributions (SHAP-like values) for new data.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Input features.

  • method (str, default="Average") –

    Method to calculate contributions. Options:

    • “Average”: Internal node averages.

    • “Shapley”: Exact tree SHAP values.

    • “Weight”: Saabas-style leaf weights.

    • “BranchDifference”: Difference between chosen and other branch.

    • “MidpointDifference”: Weighted difference between branches.

    • “ModeDifference”: Difference from the most frequent node.

    • “ProbabilityChange”: Change in probability (LogLoss only).

  • parallel (bool, optional) – Whether to run prediction in parallel.

Returns:

contributions – The contribution of each feature to the prediction. The last column is the bias term.

Return type:

ndarray of shape (n_samples, n_features + 1)

partial_dependence(X, feature: str | int, samples: int | None = 100, exclude_missing: bool = True, percentile_bounds: Tuple[float, float] = (0.2, 0.98)) ndarray[source]

Calculate the partial dependence values of a feature.

For each unique value of the feature, this estimates the predicted value at that feature value, with the effects of all other features averaged out.

Parameters:
  • X (array-like) – Data used to calculate partial dependence. Should be the same format as passed to fit().

  • feature (str or int) – The feature for which to calculate partial dependence.

  • samples (int, optional, default=100) – Number of evenly spaced samples to select. If None, all unique values are used.

  • exclude_missing (bool, optional, default=True) – Whether to exclude missing values from the calculation.

  • percentile_bounds (tuple of float, optional, default=(0.2, 0.98)) – Lower and upper percentiles for sample selection.

Returns:

pd_values – The first column contains the feature values, and the second column contains the partial dependence values.

Return type:

ndarray of shape (n_samples, 2)

Examples

>>> import matplotlib.pyplot as plt
>>> pd_values = model.partial_dependence(X, feature="age")
>>> plt.plot(pd_values[:, 0], pd_values[:, 1])

calculate_feature_importance(method: str = 'Gain', normalize: bool = True) Dict[int, float] | Dict[str, float][source]

Calculate feature importance for the model.

Parameters:
  • method (str, optional, default="Gain") –

    Importance method. Options:

    • “Weight”: Number of times a feature is used in splits.

    • “Gain”: Average improvement in loss brought by a feature.

    • “Cover”: Average number of samples affected by splits on a feature.

    • “TotalGain”: Total improvement in loss brought by a feature.

    • “TotalCover”: Total number of samples affected by splits on a feature.

  • normalize (bool, optional, default=True) – Whether to normalize importance scores to sum to 1.

Returns:

importance – A dictionary mapping feature names (or indices) to importance scores.

Return type:

dict

text_dump() List[str][source]

Return the booster model in a human-readable text format.

Returns:

dump – A list where each string represents a tree in the ensemble.

Return type:

list of str

json_dump() str[source]

Return the booster model in JSON format.

Returns:

dump – The JSON representation of the model.

Return type:

str

classmethod load_booster(path: str) Self[source]

Load a booster model from a file.

Parameters:

path (str) – Path to the saved booster (JSON format).

Returns:

model – The loaded booster object.

Return type:

PerpetualBooster

save_booster(path: str)[source]

Save the booster model to a file.

The model is saved in a JSON-based format.

Parameters:

path (str) – Path where the model will be saved.

save_model() bytes[source]

Save the model to a bytes object.

Returns:

data – The serialized model data.

Return type:

bytes

classmethod load_model(data: bytes) Self[source]

Load a model from a bytes object.

Parameters:

data (bytes) – The serialized model data.

Returns:

model – The loaded booster object.

Return type:

PerpetualBooster

classmethod from_json(json_str: str) Self[source]

Load a booster model from a JSON string.

Parameters:

json_str (str) – The JSON representation of the model.

Returns:

model – The loaded booster object.

Return type:

PerpetualBooster

property is_fitted: bool

Whether the booster has been fitted.

Returns:

fitted – True if the booster is fitted, False otherwise.

Return type:

bool

insert_metadata(key: str, value: str)[source]

Insert metadata into the model.

Metadata is saved alongside the model and can be retrieved later.

Parameters:
  • key (str) – The key for the metadata item.

  • value (str) – The value for the metadata item.

get_metadata(key: str) str[source]

Get metadata associated with a given key.

Parameters:

key (str) – The key to look up in the metadata.

Returns:

value – The value associated with the key.

Return type:

str

property base_score: float | Iterable[float]

The base score(s) of the model.

Returns:

score – The initial prediction value(s) of the model.

Return type:

float or iterable of float

property number_of_trees: int | Iterable[int]

The number of trees in the ensemble.

Returns:

n_trees – Total number of trees.

Return type:

int or iterable of int

get_params(deep=True) Dict[str, Any][source]

Get parameters for this booster.

Parameters:

deep (bool, default=True) – Currently ignored, exists for scikit-learn compatibility.

Returns:

params – Parameter names mapped to their values.

Return type:

dict

set_params(**params: Any) Self[source]

Set parameters for this booster.

Parameters:

**params (dict) – Booster parameters.

Returns:

self – Returns self.

Return type:

object

get_node_lists(map_features_names: bool = True) List[List[Node]][source]

Return tree structures as lists of node objects.

Parameters:

map_features_names (bool, default=True) – Whether to use feature names instead of indices.

Returns:

trees – Each inner list represents a tree.

Return type:

list of list of Node

trees_to_dataframe() Any[source]

Return the tree structures as a DataFrame.

Returns:

df – A Polars or Pandas DataFrame containing tree information.

Return type:

DataFrame

save_as_xgboost(path: str)[source]

Save the model in XGBoost JSON format.

Parameters:

path (str) – The path where the XGBoost-compatible model will be saved.

save_as_onnx(path: str, name: str = 'perpetual_model')[source]

Save the model in ONNX format.

Parameters:
  • path (str) – The path where the ONNX model will be saved.

  • name (str, optional, default="perpetual_model") – The name of the graph in the exported model.

Sklearn Interface

class perpetual.sklearn.PerpetualClassifier(*, objective: str | Tuple[LambdaType, LambdaType, LambdaType] = 'LogLoss', budget: float = 0.5, num_threads: int | None = None, monotone_constraints: Dict[Any, int] | None = None, max_bin: int = 256, max_cat: int = 1000, save_node_stats: bool = False, **kwargs)[source]

Bases: PerpetualBooster, ClassifierMixin

A scikit-learn compatible classifier based on PerpetualBooster. Uses ‘LogLoss’ as the default objective.

__init__(*, objective: str | Tuple[LambdaType, LambdaType, LambdaType] = 'LogLoss', budget: float = 0.5, num_threads: int | None = None, monotone_constraints: Dict[Any, int] | None = None, max_bin: int = 256, max_cat: int = 1000, save_node_stats: bool = False, **kwargs)[source]

Gradient Boosting Machine with Perpetual Learning.

A self-generalizing gradient boosting machine that doesn’t need hyperparameter optimization. It automatically finds the best configuration based on the provided budget.

Parameters:
  • objective (str or tuple, default="LogLoss") –

    Learning objective function to be used for optimization. Valid options are:

    • “LogLoss”: logistic loss for binary classification.

    • custom objective: a tuple of (loss, gradient, initial_value) functions with the following signatures:

      • loss(y, pred, weight, group): returns the loss value for each sample.

      • gradient(y, pred, weight, group): returns a tuple of (gradient, hessian). If the hessian is constant (e.g., 1.0 for SquaredLoss), return None in its place to improve performance.

      • initial_value(y, weight, group): returns the initial prediction value for the booster.

  • budget (float, default=0.5) – A positive number for fitting budget. Increasing this number will more likely result in more boosting rounds and increased predictive power.

  • num_threads (int, optional) – Number of threads to be used during training and prediction.

  • monotone_constraints (dict, optional) – Constraints to enforce a specific relationship between features and target. Keys are feature indices or names, values are -1, 1, or 0.

  • force_children_to_bound_parent (bool, default=False) – Whether to restrict children nodes to be within the parent’s range.

  • missing (float, default=np.nan) – Value to consider as missing data.

  • allow_missing_splits (bool, default=True) – Whether to allow splits that separate missing from non-missing values.

  • create_missing_branch (bool, default=False) – Whether to create a separate branch for missing values (ternary trees).

  • terminate_missing_features (iterable, optional) – Features for which missing branches will always be terminated if create_missing_branch is True.

  • missing_node_treatment (str, default="None") – How to handle weights for missing nodes if create_missing_branch is True. Options: “None”, “AssignToParent”, “AverageLeafWeight”, “AverageNodeWeight”.

  • log_iterations (int, default=0) – Logging frequency (every N iterations). 0 disables logging.

  • feature_importance_method (str, default="Gain") – Method for calculating feature importance. Options: “Gain”, “Weight”, “Cover”, “TotalGain”, “TotalCover”.

  • quantile (float, optional) – Target quantile for quantile regression (objective=“QuantileLoss”).

  • reset (bool, optional) – Whether to reset the model or continue training on subsequent calls to fit.

  • categorical_features (str or iterable, default="auto") – Feature indices or names to treat as categorical.

  • timeout (float, optional) – Time limit for fitting in seconds.

  • iteration_limit (int, optional) – Maximum number of boosting iterations.

  • memory_limit (float, optional) – Memory limit for training in GB.

  • stopping_rounds (int, optional) – Early stopping rounds.

  • max_bin (int, default=256) – Maximum number of bins for feature discretization.

  • max_cat (int, default=1000) – Maximum unique categories before a feature is treated as numerical.

  • interaction_constraints (list of list of int, optional) – Interaction constraints.

  • **kwargs – Arbitrary keyword arguments to be passed to the base class.

score(X, y, sample_weight=None)[source]

Return the mean accuracy on the given test data and labels.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Test samples.

  • y (array-like of shape (n_samples,)) – True labels.

  • sample_weight (array-like of shape (n_samples,), optional) – Sample weights.

Returns:

Mean accuracy of self.predict(X) w.r.t. y.

Return type:

float

fit(X, y, sample_weight=None, **fit_params) Self[source]

Fit the classifier on training data.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Training input samples.

  • y (array-like of shape (n_samples,)) – Target class labels.

  • sample_weight (array-like of shape (n_samples,), optional) – Individual weights for each sample.

  • **fit_params – Additional keyword arguments forwarded to the base fit.

Returns:

Fitted estimator.

Return type:

self

class perpetual.sklearn.PerpetualRegressor(*, objective: str | Tuple[LambdaType, LambdaType, LambdaType] = 'SquaredLoss', budget: float = 0.5, num_threads: int | None = None, monotone_constraints: Dict[Any, int] | None = None, max_bin: int = 256, max_cat: int = 1000, save_node_stats: bool = False, **kwargs)[source]

Bases: PerpetualBooster, RegressorMixin

A scikit-learn compatible regressor based on PerpetualBooster. Uses ‘SquaredLoss’ as the default objective.

__init__(*, objective: str | Tuple[LambdaType, LambdaType, LambdaType] = 'SquaredLoss', budget: float = 0.5, num_threads: int | None = None, monotone_constraints: Dict[Any, int] | None = None, max_bin: int = 256, max_cat: int = 1000, save_node_stats: bool = False, **kwargs)[source]

Gradient Boosting Machine with Perpetual Learning.

A self-generalizing gradient boosting machine that doesn’t need hyperparameter optimization. It automatically finds the best configuration based on the provided budget.

Parameters:
  • objective (str or tuple, default="SquaredLoss") –

    Learning objective function to be used for optimization. Valid options are:

    • “SquaredLoss”: squared error for regression.

    • “QuantileLoss”: quantile error for quantile regression.

    • “HuberLoss”: Huber loss for robust regression.

    • “AdaptiveHuberLoss”: adaptive Huber loss for robust regression.

    • custom objective: a tuple of (loss, gradient, initial_value) functions with the following signatures:

      • loss(y, pred, weight, group): returns the loss value for each sample.

      • gradient(y, pred, weight, group): returns a tuple of (gradient, hessian). If the hessian is constant (e.g., 1.0 for SquaredLoss), return None in its place to improve performance.

      • initial_value(y, weight, group): returns the initial prediction value for the booster.

  • budget (float, default=0.5) – A positive number for fitting budget. Increasing this number will more likely result in more boosting rounds and increased predictive power.

  • num_threads (int, optional) – Number of threads to be used during training and prediction.

  • monotone_constraints (dict, optional) – Constraints to enforce a specific relationship between features and target. Keys are feature indices or names, values are -1, 1, or 0.

  • force_children_to_bound_parent (bool, default=False) – Whether to restrict children nodes to be within the parent’s range.

  • missing (float, default=np.nan) – Value to consider as missing data.

  • allow_missing_splits (bool, default=True) – Whether to allow splits that separate missing from non-missing values.

  • create_missing_branch (bool, default=False) – Whether to create a separate branch for missing values (ternary trees).

  • terminate_missing_features (iterable, optional) – Features for which missing branches will always be terminated if create_missing_branch is True.

  • missing_node_treatment (str, default="None") – How to handle weights for missing nodes if create_missing_branch is True. Options: “None”, “AssignToParent”, “AverageLeafWeight”, “AverageNodeWeight”.

  • log_iterations (int, default=0) – Logging frequency (every N iterations). 0 disables logging.

  • feature_importance_method (str, default="Gain") – Method for calculating feature importance. Options: “Gain”, “Weight”, “Cover”, “TotalGain”, “TotalCover”.

  • quantile (float, optional) – Target quantile for quantile regression (objective=“QuantileLoss”).

  • reset (bool, optional) – Whether to reset the model or continue training on subsequent calls to fit.

  • categorical_features (str or iterable, default="auto") – Feature indices or names to treat as categorical.

  • timeout (float, optional) – Time limit for fitting in seconds.

  • iteration_limit (int, optional) – Maximum number of boosting iterations.

  • memory_limit (float, optional) – Memory limit for training in GB.

  • stopping_rounds (int, optional) – Early stopping rounds.

  • max_bin (int, default=256) – Maximum number of bins for feature discretization.

  • max_cat (int, default=1000) – Maximum unique categories before a feature is treated as numerical.

  • interaction_constraints (list of list of int, optional) – Interaction constraints.

  • save_node_stats (bool, default=False) – Whether to save node statistics (required for calibration).

  • **kwargs – Arbitrary keyword arguments to be passed to the base class.

fit(X, y, sample_weight=None, **fit_params) Self[source]

Fit the regressor on training data.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Training input samples.

  • y (array-like of shape (n_samples,)) – Target values.

  • sample_weight (array-like of shape (n_samples,), optional) – Individual weights for each sample.

  • **fit_params – Additional keyword arguments forwarded to the base fit.

Returns:

Fitted estimator.

Return type:

self

score(X, y, sample_weight=None)[source]

Return the coefficient of determination ($R^2$) of the prediction.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Test samples.

  • y (array-like of shape (n_samples,)) – True target values.

  • sample_weight (array-like of shape (n_samples,), optional) – Sample weights.

Returns:

$R^2$ score of self.predict(X) w.r.t. y.

Return type:

float

class perpetual.sklearn.PerpetualRanker(*, objective: str | Tuple[LambdaType, LambdaType, LambdaType] = 'ListNetLoss', budget: float = 0.5, num_threads: int | None = None, monotone_constraints: Dict[Any, int] | None = None, max_bin: int = 256, max_cat: int = 1000, **kwargs)[source]

Bases: PerpetualBooster, RegressorMixin

A scikit-learn compatible ranker based on PerpetualBooster. Uses ‘ListNetLoss’ as the default objective. Requires the ‘group’ parameter to be passed to fit.

__init__(*, objective: str | Tuple[LambdaType, LambdaType, LambdaType] = 'ListNetLoss', budget: float = 0.5, num_threads: int | None = None, monotone_constraints: Dict[Any, int] | None = None, max_bin: int = 256, max_cat: int = 1000, **kwargs)[source]

Gradient Boosting Machine with Perpetual Learning.

A self-generalizing gradient boosting machine that doesn’t need hyperparameter optimization. It automatically finds the best configuration based on the provided budget.

Parameters:
  • objective (str or tuple, default="ListNetLoss") –

    Learning objective function to be used for optimization. Valid options are:

    • ”ListNetLoss”: ListNet loss for ranking.

    • custom objective: a tuple of (loss, gradient, initial_value) functions. Each function should have the following signature:

      • loss(y, pred, weight, group) : returns the loss value for each sample.

      • gradient(y, pred, weight, group) : returns a tuple of (gradient, hessian). If the hessian is constant (e.g., 1.0 for SquaredLoss), return None to improve performance.

      • initial_value(y, weight, group) : returns the initial value for the booster.

  • budget (float, default=0.5) – A positive number for fitting budget. Increasing this number will more likely result in more boosting rounds and increased predictive power.

  • num_threads (int, optional) – Number of threads to be used during training and prediction.

  • monotone_constraints (dict, optional) – Constraints to enforce a specific relationship between features and target. Keys are feature indices or names, values are -1, 1, or 0.

  • force_children_to_bound_parent (bool, default=False) – Whether to restrict children nodes to be within the parent’s range.

  • missing (float, default=np.nan) – Value to consider as missing data.

  • allow_missing_splits (bool, default=True) – Whether to allow splits that separate missing from non-missing values.

  • create_missing_branch (bool, default=False) – Whether to create a separate branch for missing values (ternary trees).

  • terminate_missing_features (iterable, optional) – Features for which missing branches will always be terminated if create_missing_branch is True.

  • missing_node_treatment (str, default="None") – How to handle weights for missing nodes if create_missing_branch is True. Options: “None”, “AssignToParent”, “AverageLeafWeight”, “AverageNodeWeight”.

  • log_iterations (int, default=0) – Logging frequency (every N iterations). 0 disables logging.

  • feature_importance_method (str, default="Gain") – Method for calculating feature importance. Options: “Gain”, “Weight”, “Cover”, “TotalGain”, “TotalCover”.

  • quantile (float, optional) – Target quantile for quantile regression (objective=”QuantileLoss”).

  • reset (bool, optional) – Whether to reset the model or continue training on subsequent calls to fit.

  • categorical_features (str or iterable, default="auto") – Feature indices or names to treat as categorical.

  • timeout (float, optional) – Time limit for fitting in seconds.

  • iteration_limit (int, optional) – Maximum number of boosting iterations.

  • memory_limit (float, optional) – Memory limit for training in GB.

  • stopping_rounds (int, optional) – Early stopping rounds.

  • max_bin (int, default=256) – Maximum number of bins for feature discretization.

  • max_cat (int, default=1000) – Maximum unique categories before a feature is treated as numerical.

  • interaction_constraints (list of list of int, optional) – Interaction constraints.

  • **kwargs – Arbitrary keyword arguments to be passed to the base class.

fit(X, y, group=None, sample_weight=None, **fit_params) Self[source]

Fit the ranker on training data.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Training input samples.

  • y (array-like of shape (n_samples,)) – Target relevance scores.

  • group (array-like of int, optional) – Group lengths used by the ranking objective. Required when objective="ListNetLoss".

  • sample_weight (array-like of shape (n_samples,), optional) – Individual weights for each sample.

  • **fit_params – Additional keyword arguments forwarded to the base fit.

Returns:

Fitted estimator.

Return type:

self

Raises:

ValueError – If group is None and the objective is "ListNetLoss".
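The default objective corresponds to the textbook top-one ListNet loss: within each query group, the cross-entropy between the softmax of the true relevances and the softmax of the predicted scores. A minimal numpy sketch of that idea (an illustration of the loss family, not Perpetual's exact implementation):

```python
import numpy as np

def softmax(x):
    z = np.exp(x - np.max(x))  # shift for numerical stability
    return z / z.sum()

def listnet_loss(relevance, scores):
    """Top-one ListNet loss for a single query group:
    cross-entropy between softmax(relevance) and softmax(scores)."""
    p = softmax(np.asarray(relevance, dtype=float))
    q = softmax(np.asarray(scores, dtype=float))
    return -np.sum(p * np.log(q))

rel = np.array([3.0, 2.0, 1.0])
print(listnet_loss(rel, np.array([3.0, 2.0, 1.0])))  # lower: ranking agrees
print(listnet_loss(rel, np.array([1.0, 2.0, 3.0])))  # higher: ranking reversed
```

The group argument to fit marks where each such query group starts and ends in the concatenated training arrays.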

Causal ML

class perpetual.iv.BraidedBooster(treatment_objective: str = 'SquaredLoss', outcome_objective: str = 'SquaredLoss', stage1_budget: float = 0.5, stage2_budget: float = 0.5, num_threads: int | None = None, monotone_constraints: Dict[Any, int] | None = None, force_children_to_bound_parent: bool = False, missing: float = nan, allow_missing_splits: bool = True, create_missing_branch: bool = False, terminate_missing_features: Iterable[Any] | None = None, missing_node_treatment: str = 'None', log_iterations: int = 0, quantile: float | None = None, reset: bool | None = None, categorical_features: Iterable[int] | Iterable[str] | str | None = 'auto', timeout: float | None = None, iteration_limit: int | None = None, memory_limit: float | None = None, stopping_rounds: int | None = None, max_bin: int = 256, max_cat: int = 1000, interaction_constraints: list[list[int]] | None = None)[source]

Bases: object

Two-stage instrumental-variable estimator powered by gradient boosting.

Stage 1 regresses the treatment on the instruments, and Stage 2 regresses the outcome on the predicted treatment and covariates. Both stages are fitted using Perpetual’s self-generalizing boosting.

__init__(treatment_objective: str = 'SquaredLoss', outcome_objective: str = 'SquaredLoss', stage1_budget: float = 0.5, stage2_budget: float = 0.5, num_threads: int | None = None, monotone_constraints: Dict[Any, int] | None = None, force_children_to_bound_parent: bool = False, missing: float = nan, allow_missing_splits: bool = True, create_missing_branch: bool = False, terminate_missing_features: Iterable[Any] | None = None, missing_node_treatment: str = 'None', log_iterations: int = 0, quantile: float | None = None, reset: bool | None = None, categorical_features: Iterable[int] | Iterable[str] | str | None = 'auto', timeout: float | None = None, iteration_limit: int | None = None, memory_limit: float | None = None, stopping_rounds: int | None = None, max_bin: int = 256, max_cat: int = 1000, interaction_constraints: list[list[int]] | None = None)[source]

Boosted Instrumental Variable (BoostIV) Estimator.

Implements a 2-Stage Least Squares (2SLS) approach using Gradient Boosting.
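The two-stage logic can be sketched with ordinary least squares standing in for the boosted stages (a hypothetical illustration of 2SLS on synthetic data, not the BraidedBooster internals):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
u = rng.normal(size=n)                       # unobserved confounder
z = rng.normal(size=n)                       # instrument: moves w, not y directly
w = 0.8 * z + u + 0.3 * rng.normal(size=n)   # treatment
y = 2.0 * w + u + 0.3 * rng.normal(size=n)   # true causal effect = 2.0

def ols(A, b):
    return np.linalg.lstsq(A, b, rcond=None)[0]

ones = np.ones((n, 1))
# Stage 1: regress the treatment on the instrument.
b1 = ols(np.hstack([ones, z[:, None]]), w)
w_hat = b1[0] + z * b1[1]
# Stage 2: regress the outcome on the predicted treatment.
b2 = ols(np.hstack([ones, w_hat[:, None]]), y)
print(b2[1])  # close to 2.0; naive OLS of y on w is biased upward by u
```

Replacing the linear stages with boosted models lets each stage capture nonlinear relationships while keeping the same identification logic.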

Parameters:
  • treatment_objective (str, default="SquaredLoss") – Objective for Stage 1 (Treatment Model). e.g., "SquaredLoss" or "LogLoss".

  • outcome_objective (str, default="SquaredLoss") – Objective for Stage 2 (Outcome Model). e.g., "SquaredLoss".

  • stage1_budget (float, default=0.5) – Fitting budget for Stage 1. Higher values allow more boosting rounds.

  • stage2_budget (float, default=0.5) – Fitting budget for Stage 2. Higher values allow more boosting rounds.

  • num_threads (int, optional) – Number of threads to use during training and prediction.

  • monotone_constraints (dict, optional) – Constraints mapping feature indices/names to -1, 1, or 0.

  • force_children_to_bound_parent (bool, default=False) – Whether to restrict children nodes to be within the parent’s range.

  • missing (float, default=np.nan) – Value to consider as missing data.

  • allow_missing_splits (bool, default=True) – Whether to allow splits that separate missing from non-missing values.

  • create_missing_branch (bool, default=False) – Whether to create a separate branch for missing values (ternary trees).

  • terminate_missing_features (iterable, optional) – Features for which missing branches are always terminated when create_missing_branch is True.

  • missing_node_treatment (str, default="None") – How to handle weights for missing nodes. Options: "None", "AssignToParent", "AverageLeafWeight", "AverageNodeWeight".

  • log_iterations (int, default=0) – Logging frequency (every N iterations). 0 disables logging.

  • quantile (float, optional) – Target quantile when using "QuantileLoss".

  • reset (bool, optional) – Whether to reset the model or continue training on subsequent fits.

  • categorical_features (iterable or str, default="auto") – Feature indices or names to treat as categorical.

  • timeout (float, optional) – Time limit for fitting in seconds.

  • iteration_limit (int, optional) – Maximum number of boosting iterations.

  • memory_limit (float, optional) – Memory limit for training in GB.

  • stopping_rounds (int, optional) – Number of rounds without improvement before stopping.

  • max_bin (int, default=256) – Maximum number of bins for feature discretization.

  • max_cat (int, default=1000) – Maximum unique categories before a feature is treated as numerical.

  • interaction_constraints (list of list of int, optional) – Groups of feature indices allowed to interact.

fit(X, Z, y, w) Self[source]

Fit the IV model.

Parameters:
  • X (array-like) – Covariates (Controls).

  • Z (array-like) – Instruments.

  • y (array-like) – Outcome variable.

  • w (array-like) – Treatment received.

predict(X, w_counterfactual) ndarray[source]

Predict Outcome given X and a counterfactual W.

Parameters:
  • X (array-like) – Covariates.

  • w_counterfactual (array-like) – Treatment value to simulate.

Returns:

preds – Predicted Outcome.

Return type:

ndarray

to_json() str[source]

Serialize model to JSON string.

classmethod from_json(json_str: str) BraidedBooster[source]

Deserialize model from JSON string.

class perpetual.meta_learners.SLearner(budget: float = 0.5, **kwargs)[source]

Bases: object

S-Learner (Single Learner) for Heterogeneous Treatment Effect (HTE) estimation.

Uses a single model to estimate the outcome: Y ~ M(X, W).

The CATE is obtained by contrasting predictions under treatment and control: CATE = M(X, 1) - M(X, 0).
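The contrast can be sketched with a plain linear model standing in for PerpetualBooster (a hypothetical illustration on synthetic data):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
X = rng.normal(size=(n, 2))
w = rng.integers(0, 2, size=n).astype(float)
y = 0.5 * X[:, 0] + 2.0 * w + 0.1 * rng.normal(size=n)  # true effect = 2.0

# S-learner: one model on covariates augmented with the treatment column.
A = np.column_stack([np.ones(n), X, w])
beta = np.linalg.lstsq(A, y, rcond=None)[0]

def predict(Xq, wq):
    return np.column_stack([np.ones(len(Xq)), Xq, wq]).dot(beta)

cate = predict(X, np.ones(n)) - predict(X, np.zeros(n))
print(cate.mean())  # close to 2.0
```

A linear stand-in makes the estimated effect homogeneous; a boosted model can express heterogeneous effects through splits that interact X with the treatment column.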

__init__(budget: float = 0.5, **kwargs)[source]

Create an S-Learner.

Parameters:
  • budget (float, default=0.5) – Fitting budget forwarded to the Rust backend.

  • **kwargs – Additional keyword arguments forwarded to PerpetualBooster.

fit(X, w, y) Self[source]

Fit the single model on covariates augmented with treatment.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Covariate matrix.

  • w (array-like of shape (n_samples,)) – Binary treatment indicator (0 or 1).

  • y (array-like of shape (n_samples,)) – Observed outcome.

Returns:

Fitted estimator.

Return type:

self

property feature_importances_: ndarray

Feature importance of the single model (excluding treatment feature).

predict(X) ndarray[source]

Estimate the CATE as M(X, 1) - M(X, 0).

Parameters:

X (array-like of shape (n_samples, n_features)) – Covariate matrix.

Returns:

Estimated treatment effect for each sample.

Return type:

ndarray of shape (n_samples,)

class perpetual.meta_learners.TLearner(budget: float = 0.5, **kwargs)[source]

Bases: object

T-Learner (Two Learners) for Heterogeneous Treatment Effect (HTE) estimation.

Uses two separate models:

M0(X) ~ Y[W=0]
M1(X) ~ Y[W=1]

The CATE is M1(X) - M0(X).
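With one model per arm, even simple base learners can recover a heterogeneous effect; a linear stand-in sketch (hypothetical illustration, not the library's internals):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4000
X = rng.normal(size=(n, 1))
w = rng.integers(0, 2, size=n).astype(float)
tau = 1.0 + X[:, 0]                              # heterogeneous effect
y = 0.5 * X[:, 0] + tau * w + 0.1 * rng.normal(size=n)

def fit_linear(Xg, yg):
    A = np.column_stack([np.ones(len(Xg)), Xg])
    return np.linalg.lstsq(A, yg, rcond=None)[0]

# Two separate models, one per treatment arm.
b0 = fit_linear(X[w == 0], y[w == 0])   # M0(X) on controls
b1 = fit_linear(X[w == 1], y[w == 1])   # M1(X) on treated
A = np.column_stack([np.ones(n), X])
cate = A.dot(b1) - A.dot(b0)
print(np.abs(cate - tau).mean())  # small: the fits recover tau(x) = 1 + x0
```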

__init__(budget: float = 0.5, **kwargs)[source]

Create a T-Learner.

Parameters:
  • budget (float, default=0.5) – Fitting budget forwarded to the Rust backend.

  • **kwargs – Additional keyword arguments forwarded to PerpetualBooster.

fit(X, w, y) Self[source]
property feature_importances_: ndarray

Aggregated feature importance from mu0 and mu1.

predict(X) ndarray[source]
class perpetual.meta_learners.XLearner(budget: float = 0.5, propensity_budget: float | None = None, **kwargs)[source]

Bases: object

X-Learner for HTE estimation (typically better for imbalanced treatment groups).

__init__(budget: float = 0.5, propensity_budget: float | None = None, **kwargs)[source]
fit(X, w, y) Self[source]
property feature_importances_: ndarray

Aggregated feature importance from the second-stage effect models (tau0, tau1).

predict(X) ndarray[source]
class perpetual.meta_learners.DRLearner(budget: float = 0.5, propensity_budget: float | None = None, **kwargs)[source]

Bases: object

Doubly Robust (DR) Learner for heterogeneous treatment effect estimation.

__init__(budget: float = 0.5, propensity_budget: float | None = None, **kwargs)[source]
fit(X, w, y) Self[source]
property feature_importances_: ndarray

Feature importance from the outcome model fitted on pseudo-outcomes.

predict(X) ndarray[source]

Double Machine Learning

class perpetual.dml.DMLEstimator(budget: float = 0.5, n_folds: int = 2, clip: float = 0.01, **kwargs)[source]

Bases: object

Double Machine Learning (DML) estimator for heterogeneous treatment effects.

Uses three gradient boosting stages with K-fold cross-fitting to learn \(\theta(X)\) — the Conditional Average Treatment Effect (CATE).
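The cross-fitting idea, shown for a constant effect \(\theta\) with linear nuisance models as stand-ins for the boosted stages (a hypothetical sketch, not the estimator's internals):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4000
X = rng.normal(size=(n, 2))
w = X[:, 0] + 0.5 * rng.normal(size=n)            # treatment depends on X
y = 1.5 * w + X[:, 0] + 0.5 * rng.normal(size=n)  # true theta = 1.5

def fit_predict(Xtr, ttr, Xte):
    A = np.column_stack([np.ones(len(Xtr)), Xtr])
    b = np.linalg.lstsq(A, ttr, rcond=None)[0]
    return np.column_stack([np.ones(len(Xte)), Xte]).dot(b)

# 2-fold cross-fitting: residualize y and w using models fit on the other fold.
idx = rng.permutation(n)
folds = [idx[: n // 2], idx[n // 2 :]]
e_y, e_w = np.empty(n), np.empty(n)
for tr, te in [(folds[0], folds[1]), (folds[1], folds[0])]:
    e_y[te] = y[te] - fit_predict(X[tr], y[tr], X[te])
    e_w[te] = w[te] - fit_predict(X[tr], w[tr], X[te])

theta = np.sum(e_w * e_y) / np.sum(e_w ** 2)      # residual-on-residual
print(theta)  # close to 1.5
```

Cross-fitting keeps each sample's residuals out of the data used to fit its own nuisance models, which is what removes the overfitting bias.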

__init__(budget: float = 0.5, n_folds: int = 2, clip: float = 0.01, **kwargs)[source]
fit(X, w, y) Self[source]

Fit the DML estimator with cross-fitting.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Covariate matrix.

  • w (array-like of shape (n_samples,)) – Treatment variable (continuous or binary).

  • y (array-like of shape (n_samples,)) – Outcome variable.

Returns:

Fitted estimator.

Return type:

self

predict(X) ndarray[source]

Predict CATE (heterogeneous treatment effect).

Parameters:

X (array-like of shape (n_samples, n_features)) – Covariate matrix.

Returns:

Estimated \(\theta(X)\) for each sample.

Return type:

ndarray of shape (n_samples,)

Uplift Modeling

class perpetual.uplift.UpliftBooster(outcome_budget: float = 0.5, propensity_budget: float = 0.5, effect_budget: float = 0.5, num_threads: int | None = None, monotone_constraints: Dict[Any, int] | None = None, force_children_to_bound_parent: bool = False, missing: float = nan, allow_missing_splits: bool = True, create_missing_branch: bool = False, terminate_missing_features: Iterable[Any] | None = None, missing_node_treatment: str = 'None', log_iterations: int = 0, quantile: float | None = None, reset: bool | None = None, categorical_features: Iterable[int] | Iterable[str] | str | None = 'auto', timeout: float | None = None, iteration_limit: int | None = None, memory_limit: float | None = None, stopping_rounds: int | None = None, max_bin: int = 256, max_cat: int = 1000, interaction_constraints: list[list[int]] | None = None)[source]

Bases: object

R-Learner uplift model for estimating heterogeneous treatment effects.

Learns the Conditional Average Treatment Effect (CATE) tau(x) = E[Y | X, W=1] - E[Y | X, W=0] using three sequentially fitted gradient boosting models: an outcome model, a propensity model, and an effect model.

__init__(outcome_budget: float = 0.5, propensity_budget: float = 0.5, effect_budget: float = 0.5, num_threads: int | None = None, monotone_constraints: Dict[Any, int] | None = None, force_children_to_bound_parent: bool = False, missing: float = nan, allow_missing_splits: bool = True, create_missing_branch: bool = False, terminate_missing_features: Iterable[Any] | None = None, missing_node_treatment: str = 'None', log_iterations: int = 0, quantile: float | None = None, reset: bool | None = None, categorical_features: Iterable[int] | Iterable[str] | str | None = 'auto', timeout: float | None = None, iteration_limit: int | None = None, memory_limit: float | None = None, stopping_rounds: int | None = None, max_bin: int = 256, max_cat: int = 1000, interaction_constraints: list[list[int]] | None = None)[source]

Uplift Boosting Machine (R-Learner).

Estimates the Conditional Average Treatment Effect (CATE): tau(x) = E[Y | X, W=1] - E[Y | X, W=0].

Parameters:
  • outcome_budget (float, default=0.5) – Fitting budget for the outcome model mu(x). Higher values allow more boosting rounds.

  • propensity_budget (float, default=0.5) – Fitting budget for the propensity model p(x). Higher values allow more boosting rounds.

  • effect_budget (float, default=0.5) – Fitting budget for the effect model tau(x). Higher values allow more boosting rounds.

  • num_threads (int, optional) – Number of threads to use during training and prediction.

  • monotone_constraints (dict, optional) – Constraints mapping feature indices/names to -1, 1, or 0.

  • force_children_to_bound_parent (bool, default=False) – Whether to restrict children nodes to be within the parent’s range.

  • missing (float, default=np.nan) – Value to consider as missing data.

  • allow_missing_splits (bool, default=True) – Whether to allow splits that separate missing from non-missing values.

  • create_missing_branch (bool, default=False) – Whether to create a separate branch for missing values (ternary trees).

  • terminate_missing_features (iterable, optional) – Features for which missing branches are always terminated when create_missing_branch is True.

  • missing_node_treatment (str, default="None") – How to handle weights for missing nodes. Options: "None", "AssignToParent", "AverageLeafWeight", "AverageNodeWeight".

  • log_iterations (int, default=0) – Logging frequency (every N iterations). 0 disables logging.

  • quantile (float, optional) – Target quantile when using "QuantileLoss".

  • reset (bool, optional) – Whether to reset the model or continue training on subsequent fits.

  • categorical_features (iterable or str, default="auto") – Feature indices or names to treat as categorical.

  • timeout (float, optional) – Time limit for fitting in seconds.

  • iteration_limit (int, optional) – Maximum number of boosting iterations.

  • memory_limit (float, optional) – Memory limit for training in GB.

  • stopping_rounds (int, optional) – Number of rounds without improvement before stopping.

  • max_bin (int, default=256) – Maximum number of bins for feature discretization.

  • max_cat (int, default=1000) – Maximum unique categories before a feature is treated as numerical.

  • interaction_constraints (list of list of int, optional) – Groups of feature indices allowed to interact.

fit(X, w, y) Self[source]

Fit the Uplift model.

Parameters:
  • X (array-like) – Covariates.

  • w (array-like) – Treatment indicator (0 or 1).

  • y (array-like) – Outcome variable.

predict(X) ndarray[source]

Predict CATE.

Parameters:

X (array-like) – Covariates.

Returns:

cate – Predicted Conditional Average Treatment Effect.

Return type:

ndarray

to_json() str[source]

Serialize model to JSON string.

classmethod from_json(json_str: str) UpliftBooster[source]

Deserialize model from JSON string.

Policy Learning

class perpetual.policy.PolicyLearner(budget: float = 0.5, mode: str = 'ipw', propensity_budget: float | None = None, **kwargs)[source]

Bases: object

Policy learner via Inverse Propensity Weighting.

Learns a treatment-assignment policy \(\pi(X)\) that maximizes expected reward using the Athey & Wager (2021) policy-learning framework.

The learned policy assigns \(W = 1\) when the boosted score \(F(X) > 0\).

Parameters:
  • budget (float, default=0.5) – Fitting budget forwarded to PerpetualBooster.

  • mode (str, default="ipw") – "ipw" for standard Inverse Propensity Weighting or "aipw" for Augmented (Doubly Robust) IPW.

  • propensity_budget (float, optional) – Separate budget for the propensity model. If None, defaults to budget. Only used when propensity is not supplied to fit().

  • **kwargs – Additional keyword arguments forwarded to PerpetualBooster.

feature_importances_

Feature importances from the policy model.

Type:

ndarray of shape (n_features,)

Examples

>>> from perpetual.policy import PolicyLearner
>>> import numpy as np
>>> n = 500
>>> X = np.random.randn(n, 5)
>>> w = np.random.binomial(1, 0.5, n)
>>> y = X[:, 0] * w + np.random.randn(n) * 0.5
>>> pl = PolicyLearner(budget=0.3)
>>> pl.fit(X, w, y)
>>> policy = pl.predict(X)

References

Athey, S., & Wager, S. (2021). Policy learning with observational data. Econometrica, 89(1), 133-161.

__init__(budget: float = 0.5, mode: str = 'ipw', propensity_budget: float | None = None, **kwargs)[source]
fit(X, w, y, propensity: ndarray | None = None, mu_hat_1: ndarray | None = None, mu_hat_0: ndarray | None = None) Self[source]

Fit the policy learner.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Covariate matrix.

  • w (array-like of shape (n_samples,)) – Observed binary treatment assignment (0 or 1).

  • y (array-like of shape (n_samples,)) – Observed outcome.

  • propensity (array-like of shape (n_samples,), optional) – Estimated \(P(W=1|X)\). If None, a propensity model is fitted internally.

  • mu_hat_1 (array-like of shape (n_samples,), optional) – Predicted outcome under treatment \(\hat{\mu}_1(X)\). Required when mode="aipw".

  • mu_hat_0 (array-like of shape (n_samples,), optional) – Predicted outcome under control \(\hat{\mu}_0(X)\). Required when mode="aipw".

Returns:

Fitted estimator.

Return type:

self
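The doubly robust score underlying mode="aipw" can be sketched directly; this is the standard AIPW pseudo-outcome built from the quantities fit consumes (an illustration, not the exact internals):

```python
import numpy as np

def aipw_score(y, w, e, mu1, mu0):
    """AIPW pseudo-outcome: its mean estimates the average treatment effect."""
    return mu1 - mu0 + w * (y - mu1) / e - (1 - w) * (y - mu0) / (1 - e)

# With exact nuisances, the correction terms vanish and the score
# reduces to the per-sample effect mu1 - mu0.
y   = np.array([2.0, 1.0, 2.0, 1.0])
w   = np.array([1.0, 0.0, 1.0, 0.0])
e   = np.full(4, 0.5)    # known propensity P(W=1|X)
mu1 = np.full(4, 2.0)    # outcome model under treatment
mu0 = np.full(4, 1.0)    # outcome model under control
print(aipw_score(y, w, e, mu1, mu0).mean())  # 1.0 on this toy data
```

The score stays consistent if either the propensity model or the outcome models are correct, which is what "doubly robust" refers to.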

predict(X) ndarray[source]

Predict the optimal treatment assignment.

Parameters:

X (array-like of shape (n_samples, n_features)) – Covariate matrix.

Returns:

Binary treatment policy (1 = treat, 0 = do not treat).

Return type:

ndarray of shape (n_samples,)

decision_function(X) ndarray[source]

Return raw policy scores.

Positive values indicate treatment is beneficial.

Parameters:

X (array-like of shape (n_samples, n_features)) – Covariate matrix.

Returns:

Raw boosted policy scores.

Return type:

ndarray of shape (n_samples,)

predict_proba(X) ndarray[source]

Predict probability that treatment is beneficial.

Parameters:

X (array-like of shape (n_samples, n_features)) – Covariate matrix.

Returns:

Probability of treatment being beneficial (sigmoid of score).

Return type:

ndarray of shape (n_samples,)

Causal Metrics

perpetual.causal_metrics.cumulative_gain_curve(y_true: ndarray, w_true: ndarray, uplift_score: ndarray) Tuple[ndarray, ndarray][source]

Compute the cumulative gain (uplift) curve.

Samples are sorted by uplift_score in descending order. At each fraction of the population, the curve reports the observed uplift (the difference in response rates between the treated and control samples seen so far) multiplied by that fraction.

Parameters:
  • y_true (array-like of shape (n_samples,)) – Observed binary outcome (0 or 1).

  • w_true (array-like of shape (n_samples,)) – Observed binary treatment (0 or 1).

  • uplift_score (array-like of shape (n_samples,)) – Predicted CATE / uplift score (higher ⇒ more benefit from treatment).

Returns:

  • fractions (ndarray of shape (n_samples,)) – Fraction of population from 0 to 1.

  • gains (ndarray of shape (n_samples,)) – Cumulative gain at each fraction.

perpetual.causal_metrics.auuc(y_true: ndarray, w_true: ndarray, uplift_score: ndarray, normalize: bool = True) float[source]

Area Under the Uplift Curve (AUUC).

Parameters:
  • y_true (array-like of shape (n_samples,)) – Observed binary outcome.

  • w_true (array-like of shape (n_samples,)) – Observed binary treatment indicator.

  • uplift_score (array-like of shape (n_samples,)) – Predicted CATE / uplift score.

  • normalize (bool, default=True) – If True, subtract the area of a random model (diagonal) so that a random model scores 0.

Returns:

AUUC value.

Return type:

float
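A numpy sketch of both functions following the documented definitions (illustrative; the library's exact tie-handling may differ):

```python
import numpy as np

def cumulative_gain_curve(y, w, score):
    order = np.argsort(-score)                   # descending by uplift score
    y, w = y[order], w[order]
    n_t = np.cumsum(w)                           # treated seen so far
    n_c = np.cumsum(1 - w)                       # control seen so far
    r_t = np.cumsum(y * w) / np.maximum(n_t, 1)  # treated response rate
    r_c = np.cumsum(y * (1 - w)) / np.maximum(n_c, 1)
    frac = np.arange(1, len(y) + 1) / len(y)
    return frac, (r_t - r_c) * frac              # uplift scaled by fraction

def auuc(y, w, score, normalize=True):
    frac, gains = cumulative_gain_curve(y, w, score)
    f = np.concatenate([[0.0], frac])            # start the curve at the origin
    g = np.concatenate([[0.0], gains])
    area = np.sum((g[1:] + g[:-1]) / 2 * np.diff(f))  # trapezoid rule
    if normalize:
        area -= gains[-1] * 0.5                  # area under the random diagonal
    return area

y = np.array([1.0, 1.0, 0.0, 0.0])
w = np.array([1.0, 1.0, 0.0, 0.0])
s = np.array([4.0, 3.0, 2.0, 1.0])
f, g = cumulative_gain_curve(y, w, s)
print(g)  # [0.25, 0.5, 0.75, 1.0]: observed uplift of 1.0 at every fraction
```

On this toy data the gain curve is a straight line, so the normalized AUUC is 0: a uniform uplift cannot be improved by targeting.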

perpetual.causal_metrics.qini_curve(y_true: ndarray, w_true: ndarray, uplift_score: ndarray) Tuple[ndarray, ndarray][source]

Compute the Qini curve.

The Qini curve counts the incremental number of positive outcomes attributable to treatment as a function of the population fraction targeted.

Parameters:
  • y_true (array-like of shape (n_samples,)) – Observed binary outcome.

  • w_true (array-like of shape (n_samples,)) – Observed binary treatment indicator.

  • uplift_score (array-like of shape (n_samples,)) – Predicted CATE / uplift score.

Returns:

  • fractions (ndarray of shape (n_samples + 1,)) – Population fraction (starts at 0).

  • qini (ndarray of shape (n_samples + 1,)) – Qini value at each fraction (starts at 0).

perpetual.causal_metrics.qini_coefficient(y_true: ndarray, w_true: ndarray, uplift_score: ndarray) float[source]

Qini coefficient: area between the Qini curve and the random diagonal.

Parameters:
  • y_true (array-like of shape (n_samples,)) – Observed binary outcome.

  • w_true (array-like of shape (n_samples,)) – Observed binary treatment indicator.

  • uplift_score (array-like of shape (n_samples,)) – Predicted CATE / uplift score.

Returns:

Qini coefficient value.

Return type:

float
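A numpy sketch of the Qini curve and coefficient as documented (illustrative; the library's exact edge-case handling may differ):

```python
import numpy as np

def qini_curve(y, w, score):
    order = np.argsort(-score)
    y, w = y[order], w[order]
    n = len(y)
    n_t = np.cumsum(w)
    n_c = np.cumsum(1 - w)
    r_t = np.cumsum(y * w)         # positive outcomes among treated so far
    r_c = np.cumsum(y * (1 - w))   # positive outcomes among control so far
    # Incremental positives: treated responders minus scaled control responders.
    qini = r_t - r_c * np.divide(n_t, np.maximum(n_c, 1))
    frac = np.arange(n + 1) / n
    return frac, np.concatenate([[0.0], qini])

def qini_coefficient(y, w, score):
    frac, qini = qini_curve(y, w, score)
    area = np.sum((qini[1:] + qini[:-1]) / 2 * np.diff(frac))  # trapezoid rule
    diagonal = qini[-1] * 0.5      # area under the random-targeting line
    return area - diagonal

y = np.array([1.0, 1.0, 0.0, 0.0])
w = np.array([1.0, 1.0, 0.0, 0.0])
s = np.array([4.0, 3.0, 2.0, 1.0])
print(qini_coefficient(y, w, s))  # 0.5 on this toy data
```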

Fairness

class perpetual.fairness.FairClassifier(sensitive_feature: int, fairness_type: str = 'demographic_parity', lam: float = 1.0, budget: float = 0.5, **kwargs)[source]

Bases: object

Fairness-aware gradient boosting classifier.

Wraps a PerpetualBooster with an in-processing fairness penalty that regularizes the log-loss gradient to reduce dependence of predictions on a sensitive attribute.

Parameters:
  • sensitive_feature (int) – Column index of the sensitive attribute in X. The column must be binary (0 or 1).

  • fairness_type (str, default="demographic_parity") –

    Fairness criterion. One of:

    • "demographic_parity" — penalize overall disparity.

    • "equalized_odds" — penalize disparity within each label class.

  • lam (float, default=1.0) – Strength of the fairness penalty (\(\lambda\)).

  • budget (float, default=0.5) – Fitting budget forwarded to PerpetualBooster.

  • **kwargs – Additional keyword arguments forwarded to PerpetualBooster.

feature_importances_

Feature importances from the fitted model.

Type:

ndarray of shape (n_features,)

Examples

>>> from perpetual.fairness import FairClassifier
>>> import numpy as np
>>> X = np.column_stack([np.random.randn(200, 3),
...                      np.random.binomial(1, 0.5, 200)])
>>> y = (X[:, 0] > 0).astype(float)
>>> clf = FairClassifier(sensitive_feature=3, lam=2.0)
>>> clf.fit(X, y)
>>> probs = clf.predict_proba(X)

Notes

The fairness penalty is applied only through the gradient; the reported loss is standard log-loss. This mirrors the Rust FairnessObjective implementation.

__init__(sensitive_feature: int, fairness_type: str = 'demographic_parity', lam: float = 1.0, budget: float = 0.5, **kwargs)[source]
fit(X, y) Self[source]

Fit the fair classifier.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Feature matrix. The column at index sensitive_feature must contain binary (0/1) values.

  • y (array-like of shape (n_samples,)) – Binary target variable (0 or 1).

Returns:

Fitted estimator.

Return type:

self

predict(X) ndarray[source]

Predict class labels.

Parameters:

X (array-like of shape (n_samples, n_features)) – Feature matrix.

Returns:

Predicted class labels (0 or 1).

Return type:

ndarray of shape (n_samples,)

predict_proba(X) ndarray[source]

Predict class probabilities.

Parameters:

X (array-like of shape (n_samples, n_features)) – Feature matrix.

Returns:

Predicted probabilities for class 0 and class 1.

Return type:

ndarray of shape (n_samples, 2)

predict_contributions(X, method: str = 'Average', parallel: bool | None = None) ndarray[source]

Predict feature contributions (SHAP-like values) for new data.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Input features.

  • method (str, default="Average") – Method to calculate contributions.

  • parallel (bool, optional) – Whether to run prediction in parallel.

Returns:

Feature contributions.

Return type:

ndarray of shape (n_samples, n_features + 1)

decision_function(X) ndarray[source]

Return raw log-odds scores.

Parameters:

X (array-like of shape (n_samples, n_features)) – Feature matrix.

Returns:

Raw boosted scores (log-odds).

Return type:

ndarray of shape (n_samples,)

Regulatory Risk

class perpetual.risk.PerpetualRiskEngine(model: PerpetualBooster)[source]

Bases: object

Risk Engine for generating Adverse Action (Reason) Codes.

This engine wraps a fitted PerpetualBooster model and provides functionality to explain rejections (Adverse Actions) by attributing the negative decision to specific features.

__init__(model: PerpetualBooster)[source]

Wrap a fitted booster for reason-code generation.

Parameters:

model (PerpetualBooster) – A fitted PerpetualBooster instance.

generate_reason_codes(X, threshold: float, n_codes: int = 3, method: str = 'Average', rejection_direction: str = 'lower') List[List[str]][source]

Generate reason codes for samples that fall below/above the approval threshold.

Logic:

  1. Predict scores for X.

  2. Identify rejected samples based on rejection_direction:

     • “lower”: score < threshold (e.g. a FICO score)

     • “higher”: score > threshold (e.g. a default probability)

  3. Identify the top N features dragging each rejected score in the rejection direction.

Parameters:
  • X (array-like) – Applicant data.

  • threshold (float) – Approval threshold.

  • n_codes (int, default=3) – Number of reason codes to return per applicant.

  • method (str, default="Average") – Contribution method.

  • rejection_direction ({"lower", "higher"}, default="lower") – Direction of rejection. If “lower”, scores below threshold are rejected. If “higher”, scores above threshold are rejected.

Returns:

reasons – For each sample, a list of reason-code strings. Approved samples get an empty list.

Return type:

list of list of str
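Given a contributions matrix, the selection step can be sketched as follows (a hypothetical helper with illustrative feature names; the real engine obtains contributions from the wrapped booster's predict_contributions):

```python
import numpy as np

def reason_codes(contribs, names, scores, threshold, n_codes=3,
                 rejection_direction="lower"):
    """Pick the n_codes features pushing each rejected score the wrong way."""
    reasons = []
    for row, score in zip(contribs, scores):
        if rejection_direction == "lower":
            rejected = score < threshold
            order = np.argsort(row[:-1])    # most negative contributions first
        else:
            rejected = score > threshold
            order = np.argsort(-row[:-1])   # most positive contributions first
        reasons.append([names[i] for i in order[:n_codes]] if rejected else [])
    return reasons

# Toy contributions: one row per applicant, last column is the bias term.
contribs = np.array([[-2.0, 0.5, -1.0, 0.1],
                     [ 1.0, 0.8,  0.2, 0.1]])
names = ["utilization", "income", "history"]
scores = contribs.sum(axis=1)               # contributions sum to the score
print(reason_codes(contribs, names, scores, threshold=0.0, n_codes=2))
# [['utilization', 'history'], []] -- second applicant is approved
```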