RUGClassifier

RUGClassifier is a machine learning model designed for classification tasks, emphasizing the generation and optimization of rules.

The distinguishing feature of RUGClassifier is its iterative refinement of the rule set. It starts with a basic decision tree fitted on the original dataset. In later iterations, more trees are fitted on the data with sample weights taken from the solutions of linear programming problems. This weighting concentrates learning on the areas where the model currently underperforms, so each new tree targets those weak spots.

This classifier operates by solving Restricted Master Problems (RMPs) to iteratively enhance its objective function, which aims to find a balance between model accuracy and the simplicity of the rules it generates. This balance is crucial for maintaining the interpretability of the model while striving for high performance.

The model can be tuned through several parameters: penalty parameters that control the balance between model complexity and accuracy, rule costs that manage rule complexity, and thresholds that determine which rules are significant enough to remain in the final model.

In essence, RUGClassifier is built to offer a blend of accuracy and interpretability in classification tasks, making it suitable for applications where understanding the model’s decision-making process is as important as the accuracy of its predictions.
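A minimal usage sketch, assuming ruleopt and NumPy are installed and that RUGClassifier is importable from the top-level ruleopt package, as the class path on this page suggests. The toy data generator and the helper name fit_rug are hypothetical and exist only for illustration; the constructor arguments are taken from the parameter list on this page.

```python
import random


def make_toy_data(n_samples=200, seed=0):
    """Hypothetical helper: two features, label 1 when their sum exceeds 1."""
    rng = random.Random(seed)
    x = [[rng.random(), rng.random()] for _ in range(n_samples)]
    y = [1 if a + b > 1.0 else 0 for a, b in x]
    return x, y


def fit_rug(x, y):
    """Fit a RUGClassifier on the given data and return the fitted model.

    The imports are local so that this snippet can be loaded even when
    ruleopt is not installed; calling the function does require it.
    """
    import numpy as np
    from ruleopt import RUGClassifier

    # Shallow trees keep individual rules short; both values are illustrative.
    model = RUGClassifier(max_depth=3, max_rmp_calls=10, random_state=0)
    model.fit(np.asarray(x, dtype=float), np.asarray(y))
    return model
```

With ruleopt available, calling fit_rug(*make_toy_data()) should return a fitted model whose predict method yields one label per input row.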

Oblique Splits

RUGClassifier supports oblique (multi-feature) splits through the use_oblique and n_pair parameters. When enabled, decision tree nodes can split on linear combinations of features (e.g., 0.73*x1 + -1.00*x2 < 0.15) instead of single-feature thresholds only.

Note that enabling oblique splits reduces interpretability, since each clause involves a weighted combination of multiple features rather than a simple threshold on a single feature. To minimize this trade-off, we recommend setting n_pair=2 so that each oblique clause combines at most two features.
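To make the form of an oblique clause concrete, the sketch below evaluates the example clause 0.73*x1 + -1.00*x2 < 0.15 on two samples. It is an illustration in plain Python, not ruleopt code: the function oblique_clause and the index-to-coefficient dictionary are hypothetical, and x1 and x2 are assumed to be the features at positions 0 and 1.

```python
def oblique_clause(weights, threshold):
    """Build a predicate that tests sum(w_i * x_i) < threshold.

    `weights` maps a feature index to its coefficient (a hypothetical
    representation chosen for this example).
    """
    def holds(sample):
        value = sum(w * sample[i] for i, w in weights.items())
        return value < threshold

    return holds


# The clause from the text, combining two features (as with n_pair=2).
clause = oblique_clause({0: 0.73, 1: -1.00}, 0.15)

print(clause([0.2, 0.5]))  # 0.146 - 0.5 = -0.354, below 0.15 -> True
print(clause([1.0, 0.5]))  # 0.73 - 0.5 = 0.23, not below 0.15 -> False
```

A single-feature split is the special case where the dictionary holds one index with coefficient 1, which is why clauses with two features are the smallest step away from ordinary threshold rules.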

class ruleopt.RUGClassifier

Rule Generation algorithm for multi-class classification. This algorithm aims at producing a compact and interpretable model by employing optimization-based rule learning.

__init__(solver=<ruleopt.solver.highs_solver.HiGHSSolver object>, rule_cost=<ruleopt.rule_cost.rule_cost.Gini object>, max_rmp_calls=10, threshold=1e-06, random_state=None, class_weight=None, criterion='gini', max_depth=None, min_samples_split=2, min_samples_leaf=1, ccp_alpha=0.0, categories=None, use_oblique=False, n_pair=2)

Parameters:

solver (OptimizationSolver, default=HiGHSSolver()) – An instance of a class derived from the OptimizationSolver base class.
rule_cost (RuleCost or int, default=Gini()) – Defines the cost of rules, either as a specific calculation method (a RuleCost instance) or as a fixed cost (an int).
max_rmp_calls (int, default=10) – Maximum number of Restricted Master Problem (RMP) iterations allowed during fitting.
class_weight (dict, "balanced" or None, default=None) – A dictionary mapping class labels to their respective weights, the string “balanced” to automatically adjust weights inversely proportional to class frequencies, or None for no weights.
threshold (float, default=1.0e-6) – The minimum weight threshold for including a rule in the final model.
random_state (int or None, default=None) – Seed for the random number generator to ensure reproducible results.
criterion ({"gini", "entropy"}, default="gini") – The function to measure the quality of a split.
max_depth (int or None, default=None) – The maximum depth of the tree. If None, nodes are expanded until all leaves are pure or contain fewer than min_samples_split samples.
min_samples_split (int, default=2) – The minimum number of samples required to split an internal node.
min_samples_leaf (int, default=1) – The minimum number of samples required to be at a leaf node.
ccp_alpha (non-negative float, default=0.0) – Complexity parameter used for Minimal Cost-Complexity Pruning.
categories (list or None, default=None) – List of column indices representing categorical features.
use_oblique (bool, default=False) – Whether to use oblique (multi-feature) splits in the decision trees.
n_pair (int, default=2) – Number of features to combine per oblique split. Only used when use_oblique=True.
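The "balanced" option of class_weight is described as weighting classes inversely proportional to their frequencies. The sketch below shows one common way to compute such weights, using the n_samples / (n_classes * count) convention familiar from scikit-learn. Whether ruleopt uses exactly this formula is an assumption, and the function name balanced_class_weights is hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class_label: weight} with weights inversely proportional
    to class frequency (assumed convention, see the note above)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Class 0 appears three times, class 1 once, so class 1 gets the larger weight.
weights = balanced_class_weights([0, 0, 0, 1])
print(weights)
```

A dictionary of this shape is also what the class_weight parameter accepts directly when the weights are chosen by hand.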

property classes

Returns unique class labels in the dataset.

Returns: An array containing the unique class labels of the dataset.
Return type: np.ndarray

property coefficients

Stores coefficients associated with the rules during optimization.

Returns: An object or array-like structure storing coefficients related to each rule.
Return type: Coefficients

property decision_rules

Returns the rules extracted from the decision trees, after optimization.

Returns: A dictionary where keys are rule indices and values are Rule objects.
Return type: Dict[int, Rule]

property decision_trees

Returns the dictionary that stores the decision tree models.

Returns: A dictionary containing decision tree models, with identifiers as keys and decision tree instances as values.
Return type: Dict[int, Any]

fit(x, y, sample_weight=None)

property is_fitted

Indicates whether the model is fitted.

Returns: True if the model is fitted, False otherwise.
Return type: bool

property k

Returns the total number of unique classes in the dataset.

Returns: The total number of unique classes.
Return type: float

property majority_class

Returns the class label of the majority class in the dataset.

Returns: The label of the majority class.
Return type: int

property majority_probability

Returns the probability of the majority class in the dataset.

Returns: The probability of encountering the majority class in the dataset.
Return type: float
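As an illustration of what majority_class and majority_probability describe, the sketch below derives both values from a list of labels. It is a plain-Python illustration rather than the library's code: the function name is hypothetical, and the probability is assumed to be the unweighted relative frequency of the most common label.

```python
from collections import Counter


def majority_class_and_probability(labels):
    """Return the most frequent label and its relative frequency."""
    counts = Counter(labels)
    label, count = counts.most_common(1)[0]
    return label, count / len(labels)


# Label 2 occurs in 3 of 5 samples.
label, probability = majority_class_and_probability([2, 0, 2, 1, 2])
print(label, probability)
```

If sample or class weights are in play, the fitted model's value could differ from this unweighted frequency.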

predict(x, indices=None, threshold=0.0, *, predict_info=False)

Predicts class labels for the given data, optionally returning additional prediction info.

Parameters:

x (array-like of shape (n_samples, n_features)) – The input samples to predict. Internally, they are converted to dtype=np.float32.
indices (list or None, default=None) – Specific indices of rules to use for prediction. If None, all rules are used.
threshold (float, default=0) – The threshold for selecting rules based on their weights.
predict_info (bool, default=False) – If True, returns additional information about the prediction process including indices of samples with missed values, number of rules applied per sample, and average rule length per sample. Otherwise, returns only the predicted class labels.

Returns:

An array of predicted class labels for each instance in x. If predict_info is True, also returns arrays containing indices of samples with missed values, number of rules applied per sample, and average rule length per sample.

Return type:

np.ndarray

predict_proba(x, indices=None, threshold=0.0, *, predict_info=False)

Predicts class probabilities for the given data, optionally returning additional prediction info.

Parameters:

x (array-like of shape (n_samples, n_features)) – The input samples for which to compute class probabilities. Internally, they are converted to dtype=np.float32.
indices (list or None, default=None) – Specific indices of rules to use for calculating probabilities. If None, all rules are used.
threshold (float, default=0) – The threshold for selecting rules based on their weights.
predict_info (bool, default=False) – If True, returns additional information about the prediction process including indices of samples with missed values, number of rules applied per sample, and average rule length per sample. Otherwise, returns only the probabilities of each class for each sample.

Returns:

An array where each row corresponds to a sample in x and each column to a class, containing the probability of each class for each sample. If predict_info is True, also returns arrays containing indices of samples with missed values, number of rules applied per sample, and average rule length per sample.

Return type:

np.ndarray
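The roles of threshold and of the per-sample rule count can be pictured with a small weighted-voting sketch. This is not the library's implementation, only one plausible reading of the description above. It assumes that rules whose weight does not exceed the threshold are skipped, that each remaining rule covering a sample adds its weight times a class-score vector, and that a sample covered by no rule falls back to a default distribution. All names here are hypothetical.

```python
def predict_proba_sketch(sample, rules, fallback, threshold=0.0):
    """Combine weighted rule votes into class probabilities for one sample.

    rules    : list of (covers, weight, class_scores), where covers is a
               predicate on the sample (hypothetical representation)
    fallback : distribution returned when no rule applies (assumption)
    Returns (probabilities, number_of_rules_applied).
    """
    totals = [0.0] * len(fallback)
    n_applied = 0
    for covers, weight, class_scores in rules:
        # Skip rules filtered out by the weight threshold or not covering
        # this sample.
        if weight <= threshold or not covers(sample):
            continue
        n_applied += 1
        for k, score in enumerate(class_scores):
            totals[k] += weight * score
    total = sum(totals)
    if n_applied == 0 or total <= 0.0:
        return list(fallback), n_applied
    return [t / total for t in totals], n_applied


rules = [
    (lambda s: s[0] < 0.5, 2.0, [1.0, 0.0]),  # votes for class 0
    (lambda s: s[1] > 0.3, 1.0, [0.0, 1.0]),  # votes for class 1
    (lambda s: True, 1e-9, [0.0, 1.0]),       # weight below the threshold
]
proba, n_applied = predict_proba_sketch(
    [0.2, 0.8], rules, fallback=[0.5, 0.5], threshold=1e-6
)
print(proba, n_applied)
```

Raising the threshold removes low-weight rules from the vote, which trades some coverage for a shorter and easier to read explanation of each prediction.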

property rule_columns

Returns indices of rules selected as part of the model.

Returns: An array of indices corresponding to the rules included in the model.
Return type: np.ndarray

property rule_info

Returns information about each rule.

Returns: A dictionary with rule indices as keys and tuples containing information about each rule as values. The tuple structure is (rule_id, feature_index, threshold, values_array).
Return type: Dict[int, Tuple[int, int, int, np.ndarray]]