class documentation

Base class for sparse linear models.

The features are selected using an efficient cutting-plane algorithm that scales to thousands of features and samples. As the parameters and fitting procedure are the same for both regression and classification models, this class implements the shared functionality.
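A minimal usage sketch, assuming a concrete subclass of this base class; the SparseLinearRegressor name and the my_sparse_package import are placeholders for whatever estimators this package actually exports:

    import numpy as np

    # Hypothetical import: substitute the concrete regressor or classifier
    # exposed by this package.
    from my_sparse_package import SparseLinearRegressor

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 50))             # 200 samples, 50 features
    true_coef = np.zeros(50)
    true_coef[:5] = 1.0                        # only the first 5 features matter
    y = X @ true_coef + 0.1 * rng.normal(size=200)

    model = SparseLinearRegressor(k=5, random_state=0)  # keep 5 features
    model.fit(X, y)
    y_pred = model.predict(X)
    print(np.flatnonzero(model.coef))          # indices of the selected features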

Method __init__ Model constructor.
Method fit Fit the model to the training data.
Method predict Predict using the fitted model.
Instance Variable feature_groups Groups of mutually exclusive features. For example, if feature_groups=[{0, 1}, {2, 3}], then at most one of features 0 and 1 will be selected, and at most one of features 2 and 3 will be selected. This can be used to encode prior knowledge about the problem.
Instance Variable gamma The regularization parameter. If None, then gamma is set to 1 / sqrt(n_samples).
Instance Variable k The sparsity parameter (i.e., the number of non-zero coefficients). If None, then k is set to the square root of the number of features, rounded to the nearest integer.
Instance Variable max_iters The maximum number of iterations.
Instance Variable normalize Whether to normalize the data before fitting the model.
Instance Variable random_state Controls the random seed for the initial guess if a user-defined initial guess is not provided.
Instance Variable solver The solver to use for the optimization problem. The available options are "CBC" and "GUROBI". Support for the "HiGHS" solver is also planned for a future release.
Instance Variable start The initial guess for the selected features. For example, if start={0, 1, 2}, then the first three features form the initial guess. If None, then the initial guess is randomly selected. Providing a good initial guess based on problem-specific knowledge can significantly speed up the search.
Instance Variable tol The tolerance for the stopping criterion.
Instance Variable verbose Whether to enable logging of the search progress.
Property coef Get the coefficients of the linear model.
Property intercept Get the intercept of the linear model.
Method _fit_coef_for_subset Undocumented
Method _get_coef Undocumented
Method _get_intercept Undocumented
Method _make_callback Undocumented
Method _pre_process_y Undocumented
Method _predict Undocumented
Method _validate_params Undocumented
Class Variable _parameter_constraints Undocumented
Instance Variable _coef Undocumented
Instance Variable _gamma Undocumented
Instance Variable _k Undocumented
Instance Variable _scaler_X Undocumented
def __init__(self, k: Optional[int] = None, gamma: Optional[float] = None, normalize: bool = True, max_iters: int = 500, tol: float = 0.0001, start: Optional[set[int]] = None, feature_groups: Optional[Sequence[set[int]]] = None, solver: str = 'CBC', random_state: Optional[int] = None, verbose: bool = False): (source)

Model constructor.

Parameters
k: Optional[int] - The value for the k attribute.
gamma: Optional[float] - The value for the gamma attribute.
normalize: bool - The value for the normalize attribute.
max_iters: int - The value for the max_iters attribute.
tol: float - The value for the tol attribute.
start: Optional[set[int]] - The value for the start attribute.
feature_groups: Optional[Sequence[set[int]]] - The value for the feature_groups attribute.
solver: str - The value for the solver attribute.
random_state: Optional[int] - The value for the random_state attribute.
verbose: bool - The value for the verbose attribute.
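For illustration, a hedged construction example; the SparseLinearClassifier name is a placeholder for a concrete subclass, and the keyword values are arbitrary:

    model = SparseLinearClassifier(       # hypothetical concrete subclass
        k=10,                             # keep at most 10 non-zero coefficients
        gamma=None,                       # defaults to 1 / sqrt(n_samples) at fit time
        normalize=True,
        max_iters=200,
        tol=1e-4,
        start={0, 3, 7},                  # warm-start the search from features 0, 3 and 7
        feature_groups=[{0, 1}, {2, 3}],  # pick at most one feature from each group
        solver="CBC",
        random_state=42,
        verbose=True,
    )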
def fit(self, X: np.ndarray, y: np.ndarray) -> BaseSparseEstimator: (source)

Fit the model to the training data.

Parameters
X: np.ndarray - The training data. The array should be of shape (n_samples, n_features).
y: np.ndarray - The training labels. The array should be of shape (n_samples,).
Returns
BaseSparseEstimator - The fitted model.
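Because fit returns the fitted estimator, calls can be chained; a one-line sketch, again with a placeholder subclass name and hypothetical X_train, y_train and X_test arrays:

    y_pred = SparseLinearRegressor(k=5).fit(X_train, y_train).predict(X_test)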
def predict(self, X: np.ndarray) -> np.ndarray: (source)

Predict using the fitted model.

Parameters
X: np.ndarray - The data to predict on. The array should be of shape (n_samples, n_features).
Returns
np.ndarray - The predicted values. The array will be of shape (n_samples,).
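Continuing the sketch from the class description above (a model fitted on data with 50 features), a quick shape check:

    X_new = rng.normal(size=(10, 50))   # must have the same n_features as the training data
    y_pred = model.predict(X_new)
    assert y_pred.shape == (10,)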
feature_groups = (source)

Groups of mutually exclusive features. For example, if feature_groups=[{0, 1}, {2, 3}], then at most one of features 0 and 1 will be selected, and at most one of features 2 and 3 will be selected. This can be used to encode prior knowledge about the problem.
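For example, if features 0 and 1 are two encodings of the same measurement, and likewise features 2 and 3, the groups below restrict the search to at most one feature from each pair (placeholder subclass name):

    model = SparseLinearRegressor(k=4, feature_groups=[{0, 1}, {2, 3}])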

gamma = (source)

The regularization parameter. If None, then gamma is set to 1 / sqrt(n_samples).

k = (source)

The sparsity parameter (i.e., the number of non-zero coefficients). If None, then k is set to the square root of the number of features, rounded to the nearest integer.
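As a worked example of the documented defaults for gamma and k (a sketch of the stated rules, not the library's internal code):

    import numpy as np

    n_samples, n_features = 200, 50
    gamma_default = 1.0 / np.sqrt(n_samples)      # 1 / sqrt(200) ≈ 0.0707
    k_default = int(round(np.sqrt(n_features)))   # round(sqrt(50)) = 7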

max_iters = (source)

The maximum number of iterations.

normalize = (source)

Whether to normalize the data before fitting the model.

random_state = (source)

Controls the random seed for the initial guess if a user-defined initial guess is not provided.

solver = (source)

The solver to use for the optimization problem. The available options are "CBC" and "GUROBI". Support for the "HiGHS" solver is also planned for a future release.

start = (source)

The initial guess for the selected features. For example, if start={0, 1, 2}, then the first three features form the initial guess. If None, then the initial guess is randomly selected. Providing a good initial guess based on problem-specific knowledge can significantly speed up the search.
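For example, if domain knowledge suggests that features 0, 1 and 2 are likely to be relevant (placeholder subclass name and arrays):

    model = SparseLinearRegressor(k=5, start={0, 1, 2})
    model.fit(X, y)   # the search starts from {0, 1, 2} instead of a random subset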

tol = (source)

The tolerance for the stopping criterion.

verbose = (source)

Whether to enable logging of the search progress.

@property
coef: np.ndarray = (source)

Get the coefficients of the linear model.

@property
intercept: np.ndarray = (source)

Get the intercept of the linear model.
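Continuing the earlier sketch, the two properties can be combined to apply the linear model by hand (an illustration, not taken from the library's code):

    # Expected to reproduce model.predict(X) for a regression subclass,
    # assuming coef and intercept are reported on the original feature scale.
    manual = X @ model.coef + model.intercept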

@abstractmethod
def _fit_coef_for_subset(self, X_subset: np.ndarray, y: np.ndarray) -> np.ndarray: (source)
@abstractmethod
def _make_callback(self, X: np.ndarray, y: np.ndarray) -> Callable[[np.ndarray], tuple[float, np.ndarray]]: (source)
@abstractmethod
def _pre_process_y(self, y: np.ndarray) -> np.ndarray: (source)
@abstractmethod
def _predict(self, X: np.ndarray, proba: bool = False) -> np.ndarray: (source)
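These four hooks are abstract, so each concrete model supplies its own implementations. The sketch below is only a guess at how a least-squares regression subclass might satisfy the signatures; none of it is taken from the library's source, and the semantics assumed for the cutting-plane callback are marked as such:

    import numpy as np
    from typing import Callable

    class IllustrativeRegressor(BaseSparseEstimator):
        """Hypothetical subclass, shown only to illustrate the abstract hooks."""

        def _pre_process_y(self, y: np.ndarray) -> np.ndarray:
            # Regression: cast targets to float and leave them otherwise unchanged.
            return np.asarray(y, dtype=float)

        def _fit_coef_for_subset(self, X_subset: np.ndarray, y: np.ndarray) -> np.ndarray:
            # Ordinary least squares restricted to the already-selected columns.
            coef, *_ = np.linalg.lstsq(X_subset, y, rcond=None)
            return coef

        def _make_callback(self, X: np.ndarray, y: np.ndarray) -> Callable[[np.ndarray], tuple[float, np.ndarray]]:
            def callback(s: np.ndarray) -> tuple[float, np.ndarray]:
                # Assumption: s encodes a candidate feature subset and the returned
                # pair is (objective value, cut) used by the cutting-plane search.
                raise NotImplementedError
            return callback

        def _predict(self, X: np.ndarray, proba: bool = False) -> np.ndarray:
            # Assumption: plain linear prediction from the fitted coefficients.
            return X @ self.coef + self.intercept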
def _validate_params(self): (source)

Undocumented

_parameter_constraints: ClassVar[dict[str, list]] = (source)

Undocumented

_coef = (source)

Undocumented

_gamma = (source)

Undocumented

_k = (source)

Undocumented

_scaler_X = (source)

Undocumented