class documentation

Base class for sparse linear models.

The features are selected using an efficient cutting-plane algorithm that scales to thousands of features and samples. As the parameters and fitting procedure are the same for both regression and classification models, this class implements the shared functionality.
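A minimal usage sketch, assuming a concrete subclass of this base class; the SparseLinearRegressor name and the my_sparse_package import are placeholders for whatever estimators this package actually exports:

    import numpy as np

    # Hypothetical import: substitute the concrete regressor or classifier
    # exposed by this package.
    from my_sparse_package import SparseLinearRegressor

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 50))             # 200 samples, 50 features
    true_coef = np.zeros(50)
    true_coef[:5] = 1.0                        # only the first 5 features matter
    y = X @ true_coef + 0.1 * rng.normal(size=200)

    model = SparseLinearRegressor(k=5, random_state=0)  # keep 5 features
    model.fit(X, y)
    y_pred = model.predict(X)
    print(np.flatnonzero(model.coef))          # indices of the selected features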

Method __init__ Model constructor.
Method fit Fit the model to the training data.
Method predict Predict using the fitted model.
Instance Variable feature_groups Groups of mutually exclusive features. For example, if feature_groups=[{0, 1}, {2, 3}], then at most one of features 0 and 1 will be selected, and at most one of features 2 and 3 will be selected. This can be used to encode prior knowledge about the problem.
Instance Variable gamma The regularization parameter. If None, then gamma is set to 1 / sqrt(n_samples).
Instance Variable k The sparsity parameter (i.e., the number of non-zero coefficients). If None, then k is set to the square root of the number of features, rounded to the nearest integer.
Instance Variable max_iters The maximum number of iterations.
Instance Variable normalize Whether to normalize the data before fitting the model.
Instance Variable random_state Controls the random seed for the initial guess if a user-defined initial guess is not provided.
Instance Variable solver The solver to use for the optimization problem. The available options are "CBC" and "GUROBI". Support for the "HiGHS" solver is also planned for a future release.
Instance Variable start The initial guess for the selected features. For example, if start={0, 1, 2}, then the first three features form the initial guess. If None, then the initial guess is randomly selected. Providing a good initial guess based on problem-specific knowledge can significantly speed up the search.
Instance Variable tol The tolerance for the stopping criterion.
Instance Variable verbose Whether to enable logging of the search progress.
Property coef Get the coefficients of the linear model.
Property intercept Get the intercept of the linear model.
Method _fit_coef_for_subset Undocumented
Method _get_coef Undocumented
Method _get_intercept Undocumented
Method _make_callback Undocumented
Method _pre_process_y Undocumented
Method _predict Undocumented
Method _validate_params Undocumented
Class Variable _parameter_constraints Undocumented
Instance Variable _coef Undocumented
Instance Variable _gamma Undocumented
Instance Variable _k Undocumented
Instance Variable _scaler_X Undocumented
def __init__(self, k: Optional[int] = None, gamma: Optional[float] = None, normalize: bool = True, max_iters: int = 500, tol: float = 0.0001, start: Optional[set[int]] = None, feature_groups: Optional[Sequence[set[int]]] = None, solver: str = 'CBC', random_state: Optional[int] = None, verbose: bool = False): (source)

Model constructor.

Parameters
k: Optional[int] - The value for the k attribute.
gamma: Optional[float] - The value for the gamma attribute.
normalize: bool - The value for the normalize attribute.
max_iters: int - The value for the max_iters attribute.
tol: float - The value for the tol attribute.
start: Optional[set[int]] - The value for the start attribute.
feature_groups: Optional[Sequence[set[int]]] - The value for the feature_groups attribute.
solver: str - The value for the solver attribute.
random_state: Optional[int] - The value for the random_state attribute.
verbose: bool - The value for the verbose attribute.
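For illustration, a hedged construction example; the SparseLinearClassifier name is a placeholder for a concrete subclass, and the keyword values are arbitrary:

    model = SparseLinearClassifier(       # hypothetical concrete subclass
        k=10,                             # keep at most 10 non-zero coefficients
        gamma=None,                       # defaults to 1 / sqrt(n_samples) at fit time
        normalize=True,
        max_iters=200,
        tol=1e-4,
        start={0, 3, 7},                  # warm-start the search from features 0, 3 and 7
        feature_groups=[{0, 1}, {2, 3}],  # pick at most one feature from each group
        solver="CBC",
        random_state=42,
        verbose=True,
    )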
def fit(self, X: np.ndarray, y: np.ndarray) -> BaseSparseEstimator: (source)

Fit the model to the training data.

Parameters
X: np.ndarray - The training data. The array should be of shape (n_samples, n_features).
y: np.ndarray - The training labels. The array should be of shape (n_samples,).
Returns
BaseSparseEstimator - The fitted model.
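Because fit returns the fitted estimator, calls can be chained; a one-line sketch, again with a placeholder subclass name and hypothetical X_train, y_train and X_test arrays:

    y_pred = SparseLinearRegressor(k=5).fit(X_train, y_train).predict(X_test)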
def predict(self, X: np.ndarray) -> np.ndarray: (source)

Predict using the fitted model.

Parameters
X: np.ndarray - The data to predict on. The array should be of shape (n_samples, n_features).
Returns
np.ndarray - The predicted values. The array will be of shape (n_samples,).
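Continuing the sketch from the class description above (a model fitted on data with 50 features), a quick shape check:

    X_new = rng.normal(size=(10, 50))   # must have the same n_features as the training data
    y_pred = model.predict(X_new)
    assert y_pred.shape == (10,)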
feature_groups = (source)

Groups of mutually exclusive features. For example, if feature_groups=[{0, 1}, {2, 3}], then at most one of features 0 and 1 will be selected, and at most one of features 2 and 3 will be selected. This can be used to encode prior knowledge about the problem.
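For example, if features 0 and 1 are two encodings of the same measurement, and likewise features 2 and 3, the groups below restrict the search to at most one feature from each pair (placeholder subclass name):

    model = SparseLinearRegressor(k=4, feature_groups=[{0, 1}, {2, 3}])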

gamma = (source)

The regularization parameter. If None, then gamma is set to 1 / sqrt(n_samples).

k = (source)

The sparsity parameter (i.e., the number of non-zero coefficients). If None, then k is set to the square root of the number of features, rounded to the nearest integer.
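As a worked example of the documented defaults for gamma and k (a sketch of the stated rules, not the library's internal code):

    import numpy as np

    n_samples, n_features = 200, 50
    gamma_default = 1.0 / np.sqrt(n_samples)      # 1 / sqrt(200) ≈ 0.0707
    k_default = int(round(np.sqrt(n_features)))   # round(sqrt(50)) = 7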

max_iters = (source)

The maximum number of iterations.

normalize = (source)

Whether to normalize the data before fitting the model.

random_state = (source)

Controls the random seed for the initial guess if a user-defined initial guess is not provided.

solver = (source)

The solver to use for the optimization problem. The available options are "CBC" and "GUROBI". Support for the "HiGHS" solver is also planned for a future release.

start = (source)

The initial guess for the selected features. For example, if start={0, 1, 2}, then the first three features form the initial guess. If None, then the initial guess is randomly selected. Providing a good initial guess based on problem-specific knowledge can significantly speed up the search.
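For example, if domain knowledge suggests that features 0, 1 and 2 are likely to be relevant (placeholder subclass name and arrays):

    model = SparseLinearRegressor(k=5, start={0, 1, 2})
    model.fit(X, y)   # the search starts from {0, 1, 2} instead of a random subset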

tol = (source)

The tolerance for the stopping criterion.

verbose = (source)

Whether to enable logging of the search progress.

@property
coef: np.ndarray = (source)

Get the coefficients of the linear model.

@property
intercept: np.ndarray = (source)

Get the intercept of the linear model.
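Continuing the earlier sketch, the two properties can be combined to apply the linear model by hand (an illustration, not taken from the library's code):

    # Expected to reproduce model.predict(X) for a regression subclass,
    # assuming coef and intercept are reported on the original feature scale.
    manual = X @ model.coef + model.intercept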

@abstractmethod
def _fit_coef_for_subset(self, X_subset: np.ndarray, y: np.ndarray) -> np.ndarray: (source)
@abstractmethod
def _make_callback(self, X: np.ndarray, y: np.ndarray) -> Callable[[np.ndarray], tuple[float, np.ndarray]]: (source)
@abstractmethod
def _pre_process_y(self, y: np.ndarray) -> np.ndarray: (source)
@abstractmethod
def _predict(self, X: np.ndarray, proba: bool = False) -> np.ndarray: (source)
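These four hooks are abstract, so each concrete model supplies its own implementations. The sketch below is only a guess at how a least-squares regression subclass might satisfy the signatures; none of it is taken from the library's source, and the semantics assumed for the cutting-plane callback are marked as such:

    import numpy as np
    from typing import Callable

    class IllustrativeRegressor(BaseSparseEstimator):
        """Hypothetical subclass, shown only to illustrate the abstract hooks."""

        def _pre_process_y(self, y: np.ndarray) -> np.ndarray:
            # Regression: cast targets to float and leave them otherwise unchanged.
            return np.asarray(y, dtype=float)

        def _fit_coef_for_subset(self, X_subset: np.ndarray, y: np.ndarray) -> np.ndarray:
            # Ordinary least squares restricted to the already-selected columns.
            coef, *_ = np.linalg.lstsq(X_subset, y, rcond=None)
            return coef

        def _make_callback(self, X: np.ndarray, y: np.ndarray) -> Callable[[np.ndarray], tuple[float, np.ndarray]]:
            def callback(s: np.ndarray) -> tuple[float, np.ndarray]:
                # Assumption: s encodes a candidate feature subset and the returned
                # pair is (objective value, cut) used by the cutting-plane search.
                raise NotImplementedError
            return callback

        def _predict(self, X: np.ndarray, proba: bool = False) -> np.ndarray:
            # Assumption: plain linear prediction from the fitted coefficients.
            return X @ self.coef + self.intercept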
def _validate_params(self): (source)

Undocumented

_parameter_constraints: ClassVar[dict[str, list]] = (source)

Undocumented

_coef = (source)

Undocumented

_gamma = (source)

Undocumented

_k = (source)

Undocumented

_scaler_X = (source)

Undocumented