Read the rendered documentation at https://tbjohns.github.io/BlitzML/.

BlitzML API reference

This is the complete API reference for the BlitzML Python package.

Training L1-regularized models

L1 regularization is a popular approach to training sparse models. BlitzML efficiently solves L1-regularized problems with a variety of loss functions.

Problem classes

class blitzml.LassoProblem(A, b)

Class for training sparse linear models with squared loss. The optimization objective is

$\tfrac{1}2 ||A w - b||^2 + \lambda ||w||_1 .$

Parameters:
A (numpy.ndarray or scipy.sparse matrix) – n x d design matrix.
b (numpy.ndarray) – Labels array of length n.

BlitzML avoids copying data where possible, but depending on the formats of A and b, it may copy these arrays. To avoid copying, follow these guidelines:

If A is dense (numpy.ndarray), define A as an F-contiguous array. The dtype for A should match ctypes.c_double, ctypes.c_float, ctypes.c_int, or ctypes.c_bool; for example, A.dtype == ctypes.c_double should evaluate to True.
If A is sparse, define A as a scipy.sparse.csc_matrix. The dtype for A.indices should match ctypes.c_int. The dtype for A.indptr should match ctypes.c_size_t. BlitzML can work with double, float, int, and bool dtypes for A.data without copying.
The dtype for b should match ctypes.c_double.

BlitzML will print a warning when objects with more than 1e7 elements are being copied. To suppress warnings, use blitzml.suppress_warnings().
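The dense-matrix guidelines above can be sketched with NumPy as follows. This is a minimal illustration, not part of BlitzML: the helper name `as_blitz_dense` is hypothetical, and it performs one up-front conversion to a column-major double-precision layout so that later calls have nothing left to copy.

```python
import ctypes
import numpy as np

def as_blitz_dense(A, b):
    """Return (A, b) in the dense layout described by the guidelines.

    Hypothetical helper: converts once, up front.
    """
    # F-contiguous (column-major) double-precision design matrix.
    A_f = np.asfortranarray(A, dtype=np.float64)
    # Double-precision labels vector.
    b_f = np.ascontiguousarray(b, dtype=np.float64)
    return A_f, b_f

A = [[1, 2, 3], [4, 5, 6]]   # row-major nested lists
b = [0.5, -1.0]
A_f, b_f = as_blitz_dense(A, b)

print(A_f.flags["F_CONTIGUOUS"])               # layout check
print(A_f.dtype == np.dtype(ctypes.c_double))  # dtype check from the guidelines
```

The converted arrays can then be passed to any of the problem classes, for example `blitzml.LassoProblem(A_f, b_f)`.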

compute_max_l1_penalty(include_bias_term=True)

Compute the smallest L1 penalty parameter for which all weights in the solution equal zero.

Parameters:	include_bias_term (bool, optional) – Whether to include an unregularized bias term in the model. Default is True.
Returns:	max_l1_penalty
Return type:	float
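For the squared-loss objective, this threshold has a standard closed form: without a bias term it is the largest absolute entry of A^T b, and with an unregularized bias term b is first centered by its mean. The plain-Python sketch below makes that definition concrete. It illustrates the textbook formula only and is not BlitzML's implementation; the function name `lasso_max_l1_penalty` is hypothetical.

```python
def lasso_max_l1_penalty(A, b, include_bias_term=True):
    """Smallest penalty at which the all-zero weight vector is optimal.

    A is a list of n rows (each of length d); b is a list of n labels.
    """
    n = len(b)
    # With an unregularized bias, the best constant fit at w = 0 is mean(b).
    offset = sum(b) / n if include_bias_term else 0.0
    residual = [b_i - offset for b_i in b]
    d = len(A[0])
    # Largest absolute inner product between a column of A and the residual.
    return max(
        abs(sum(A[i][j] * residual[i] for i in range(n)))
        for j in range(d)
    )

A = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
b = [1.0, 2.0, 3.0]
print(lasso_max_l1_penalty(A, b, include_bias_term=False))  # 7.0
print(lasso_max_l1_penalty(A, b, include_bias_term=True))   # 1.0
```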

solve(l1_penalty, include_bias_term=True, initial_weights=None, stopping_tolerance=0.001, max_time=31540000.0, min_time=0.0, max_iterations=100000, verbose=False, log_directory=None)

Minimizes the objective

$\sum_i L(a_i^T w, b_i) + \texttt{l1\_penalty} ||w||_1 ,$

where L is the problem’s loss function.

Parameters:
l1_penalty (float > 0) – Regularization parameter for L1 norm penalty on weights. When this value is larger, the solution generally contains more zero entries and Blitz generally completes optimization faster.
include_bias_term (bool, optional) – Whether to include an unregularized bias parameter in the model. Default is True.
initial_weights (numpy.ndarray of length d, optional) – Initial weights to warm-start optimization. The solver requires less time if initialized to a good approximate solution. Default is None.
stopping_tolerance (float, optional) – Stopping tolerance for solver. Optimization terminates if (duality_gap / objective_value) is less than stopping_tolerance. Default is 1e-3.
max_time (float, optional) – Time limit in seconds for solving. If stopping tolerance is not reached, optimization terminates after this number of seconds. Default is 1 year.
min_time (float, optional) – Minimum time in seconds for solving. Optimization continues until this amount of time passes, even after reaching stopping tolerance. Default is zero.
max_iterations (int, optional) – Iterations limit for algorithm. If stopping tolerance is not reached after this number of iterations, optimization terminates. Default is 100000.
verbose (bool, optional) – Whether to print information, such as objective value, to sys.stdout during optimization. Default is False.
log_directory (string, optional) – Path to existing directory for Blitz to log time and objective value information. This directory should be empty prior to solving. Default is None.
Returns:	solution
Return type:	BlitzMLSolution
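Because solve accepts initial_weights, a sequence of penalties can be solved with each solution warm-starting the next. The sketch below shows one way to do this using only the documented methods and the weights attribute of the returned solution. The helper `solve_path` and the stub classes are hypothetical: the stub stands in for a real problem object so that the example runs without BlitzML installed.

```python
def solve_path(problem, fractions, **solve_kwargs):
    """Solve for several penalties, warm-starting each solve from the last.

    `problem` is any object exposing the documented compute_max_l1_penalty()
    and solve() methods, such as a blitzml.LassoProblem.
    """
    max_penalty = problem.compute_max_l1_penalty()
    weights = None
    solutions = []
    # Move from strong to weak regularization so that each solution is a
    # reasonable starting point for the next solve.
    for fraction in sorted(fractions, reverse=True):
        solution = problem.solve(max_penalty * fraction,
                                 initial_weights=weights,
                                 **solve_kwargs)
        weights = solution.weights
        solutions.append(solution)
    return solutions


class _StubSolution:
    def __init__(self, weights):
        self.weights = weights


class _StubProblem:
    """Stand-in with the documented method names; records calls only."""

    def __init__(self):
        self.calls = []

    def compute_max_l1_penalty(self, include_bias_term=True):
        return 8.0

    def solve(self, l1_penalty, initial_weights=None, **kwargs):
        self.calls.append((l1_penalty, initial_weights))
        return _StubSolution(weights=[l1_penalty])


stub = _StubProblem()
path = solve_path(stub, [0.25, 0.5])
print(stub.calls)
```

With BlitzML installed, the same helper would be called as `solve_path(blitzml.LassoProblem(A, b), [0.5, 0.1, 0.01], stopping_tolerance=1e-4)`.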

class blitzml.SparseLogisticRegressionProblem(A, b)

Class for training sparse linear models with logistic loss. The optimization objective is

$\sum_i \log(1 + \exp(-b_i a_i^T w)) + \lambda ||w||_1 ,$

where i indexes the ith row in A and ith entry in b. Each label b_i should have value 1 or -1. BlitzML treats other label values as 1 if the value exceeds zero and -1 otherwise.

Calls to solve and compute_max_l1_penalty use interfaces identical to those in blitzml.LassoProblem.
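To make the loss and the label convention concrete, the following sketch evaluates the logistic loss for a single example. It is an illustration under the definitions above, not BlitzML's code: the function name `logistic_loss` is hypothetical, and the overflow-safe evaluation is an implementation choice of this sketch.

```python
import math

def logistic_loss(score, label):
    """Evaluate log(1 + exp(-y * score)), mapping the label to +1 or -1."""
    # Labels above zero are treated as +1, everything else as -1.
    y = 1.0 if label > 0 else -1.0
    t = -y * score
    # Stable evaluation of log(1 + exp(t)) that avoids overflow for large t.
    if t > 0:
        return t + math.log1p(math.exp(-t))
    return math.log1p(math.exp(t))

print(logistic_loss(0.0, 1))    # log(2)
print(logistic_loss(2.0, 5))    # same as label 1
print(logistic_loss(2.0, 0))    # label 0 is treated as -1
```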

class blitzml.SparseHuberProblem(A, b)

Class for training sparse linear models with Huber loss. The optimization objective is

$\sum_i L(a_i^T w, b_i) + \lambda ||w||_1 ,$

where

$L(a_i^T w, b_i) = \left\{ \begin{array}{lll} \tfrac{1}{2} (a_i^T w - b_i)^2 & & \text{if}\ |a_i^T w - b_i| < 1 \\[0.4em] a_i^T w - b_i - \tfrac{1}{2} & & \text{if}\ a_i^T w - b_i \geq 1 \\[0.4em] b_i - a_i^T w - \tfrac{1}{2} & & \text{otherwise.} \end{array} \right.$

Here i indexes the ith row in A and ith entry in b.

Calls to solve and compute_max_l1_penalty use interfaces identical to those in blitzml.LassoProblem.
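The piecewise definition above reduces to a quadratic for residuals smaller than 1 in magnitude and to |r| - 1/2 beyond that. A minimal per-example sketch (the function name `huber_loss` is hypothetical):

```python
def huber_loss(prediction, label):
    """Huber loss with threshold 1, following the piecewise definition."""
    r = prediction - label
    if abs(r) < 1.0:
        return 0.5 * r * r
    # Covers both outer branches: r - 1/2 for r >= 1, -r - 1/2 for r <= -1.
    return abs(r) - 0.5

print(huber_loss(0.5, 0.0))   # 0.125 (quadratic region)
print(huber_loss(3.0, 0.0))   # 2.5   (linear region)
print(huber_loss(-3.0, 0.0))  # 2.5   (linear region, negative residual)
```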

class blitzml.SparseSquaredHingeProblem(A, b)

Class for training sparse linear models with squared hinge loss. The optimization objective is

$\sum_i \tfrac{1}{2} (1 - b_i a_i^T w)_+^2 + \lambda ||w||_1 ,$

where the “+” subscript denotes the rectifier function, $(x)_+ = \max(x, 0)$. Each label should have value 1 or -1.

Calls to solve and compute_max_l1_penalty use interfaces identical to those in blitzml.LassoProblem.
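As a small illustration of the rectifier in this objective, the sketch below evaluates the squared hinge loss for one example. The function name `squared_hinge_loss` is hypothetical.

```python
def squared_hinge_loss(prediction, label):
    """Half the squared positive part of (1 - label * prediction)."""
    slack = max(0.0, 1.0 - label * prediction)  # rectifier
    return 0.5 * slack * slack

print(squared_hinge_loss(2.0, 1))    # 0.0: margin of at least 1, no loss
print(squared_hinge_loss(0.0, 1))    # 0.5
print(squared_hinge_loss(-1.0, 1))   # 2.0: misclassified example
```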

class blitzml.SparseSmoothedHingeProblem(A, b)

Class for training sparse linear models with smoothed hinge loss. The optimization objective is

$\sum_i L(a_i^T w, b_i) + \lambda ||w||_1 ,$

where

$L(a_i^T w, b_i) = \left\{ \begin{array}{lll} \tfrac{1}{2} - b_i a_i^T w && \text{if}\ b_i a_i^T w < 0 \\[0.4em] \tfrac{1}{2} (1 - b_i a_i^T w)^2 && \text{if}\ b_i a_i^T w \in [0,1) \\[0.4em] 0 && \text{otherwise.} \end{array} \right.$

Each label should have value 1 or -1.
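The three branches of the smoothed hinge loss can be checked with a short per-example sketch (the function name `smoothed_hinge_loss` is hypothetical). The linear and quadratic pieces meet at a margin of 0, where both equal 1/2.

```python
def smoothed_hinge_loss(prediction, label):
    """Smoothed hinge loss, following the piecewise definition above."""
    margin = label * prediction
    if margin < 0.0:
        return 0.5 - margin                  # linear region
    if margin < 1.0:
        return 0.5 * (1.0 - margin) ** 2     # quadratic region
    return 0.0                               # margin of at least 1

print(smoothed_hinge_loss(-1.0, 1))  # 1.5
print(smoothed_hinge_loss(0.5, 1))   # 0.125
print(smoothed_hinge_loss(2.0, 1))   # 0.0
```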

Solution classes

class blitzml._sparse_linear.LassoSolution

Solution class for LassoProblem.

bias: Value of model’s bias term.

compute_loss(A, b)

Compute the sum

$\sum_i L(a_i^T w, b_i) ,$

where L is the problem’s loss function.

Parameters:
A (numpy.ndarray or scipy.sparse matrix) – n x d design matrix for evaluating loss.
b (numpy.ndarray) – Corresponding labels array of length n.
Returns:	loss
Return type:	float

dual_solution: Dual solution to optimization problem.

duality_gap: Duality gap between primal and dual solutions (which upper bounds the suboptimality of these solutions).

objective_value: Objective value of solution.
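The duality_gap and objective_value attributes are the two quantities in the solver's documented stopping rule, duality_gap / objective_value < stopping_tolerance. The sketch below applies that rule to plain numbers; the helper name `reached_tolerance` is hypothetical, and it assumes a strictly positive objective value.

```python
def reached_tolerance(duality_gap, objective_value, stopping_tolerance=1e-3):
    """Apply the documented relative-gap stopping rule.

    Assumes objective_value > 0 (an assumption of this sketch).
    """
    return duality_gap / objective_value < stopping_tolerance

print(reached_tolerance(0.0005, 1.0))  # True: relative gap 5e-4 < 1e-3
print(reached_tolerance(0.01, 1.0))    # False: relative gap 1e-2
```

For a returned solution object, the check would be `reached_tolerance(solution.duality_gap, solution.objective_value)`.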

predict(A)

Predict label values from feature vectors.

Parameters:	A (numpy.ndarray or scipy.sparse matrix) – n x d design matrix to make predictions for.
Returns:	predictions
Return type:	numpy.ndarray of length n

save(filepath)

Save model to disk.

Parameters:	filepath (string) – Location to save solution.

solution_status: Status of Blitz algorithm upon returning model.

weights: Array of model’s weight values.

class blitzml._sparse_linear.SparseLogisticRegressionSolution

Solution class for SparseLogisticRegressionProblem.

The interface is identical to LassoSolution’s, with the following additional method.

predict_probabilities(A)

Predict probability values from feature vectors.

Parameters:	A (numpy.ndarray or scipy.sparse matrix) – n x d design matrix to predict probabilities for.
Returns:	predicted_probabilities
Return type:	numpy.ndarray (length n with values in [0, 1])
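Under the logistic model, the probability of the positive label is conventionally the sigmoid of the linear score a^T w plus the bias. The sketch below computes that quantity for one feature vector. That predict_probabilities follows exactly this formula is an assumption, not something the reference states, and the names `sigmoid` and `predict_probability` are hypothetical.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_probability(features, weights, bias=0.0):
    """Probability of label +1 under the standard logistic model (assumed)."""
    score = sum(a * w for a, w in zip(features, weights)) + bias
    return sigmoid(score)

print(predict_probability([0.0, 0.0], [1.0, 1.0]))        # 0.5
print(predict_probability([1.0, 2.0], [0.5, -0.25], 0.0)) # 0.5: score is zero
```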

class blitzml._sparse_linear.SparseHuberSolution

Solution class for SparseHuberProblem.

Interface is identical to LassoSolution’s interface.

class blitzml._sparse_linear.SparseSquaredHingeSolution

Solution class for SparseSquaredHingeProblem.

Interface is identical to LassoSolution’s interface.

class blitzml._sparse_linear.SparseSmoothedHingeSolution

Solution class for SparseSmoothedHingeProblem.

Interface is identical to LassoSolution’s interface.

Utility functions

blitzml.load_solution(filepath)

Load BlitzMLSolution from disk.

Parameters:	filepath (string) – Path to saved solution.
Returns:	solution
Return type:	BlitzMLSolution

blitzml.parse_log_directory(log_directory)

Parse files logged by BlitzML during a solve call.

Parameters:	log_directory (string) – Path to directory containing log files to parse.
Returns:	logs – Iterable over information logged by BlitzML. Each item is a dictionary of logged values.
Return type:	generator
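Since the return value is a generator of dictionaries, it can be consumed only once, and it is often convenient to pull a single field into a list. The sketch below shows such a helper on made-up entries. The helper name `collect_field` and the keys "time" and "objective" are hypothetical; the actual keys are whatever BlitzML writes to its log files.

```python
def collect_field(logs, field):
    """Gather one logged field across entries, skipping entries without it."""
    return [entry[field] for entry in logs if field in entry]

# Hypothetical log entries standing in for parse_log_directory output.
fake_logs = ({"time": t, "objective": 10.0 / (t + 1)} for t in range(3))
print(collect_field(fake_logs, "objective"))
```

With BlitzML installed, the helper would be applied as `collect_field(blitzml.parse_log_directory(log_directory), field)` for a field name present in the logs.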

blitzml.suppress_warnings(): Stop BlitzML from printing warning messages to sys.stdout. Warnings are printed by default.

blitzml.unsuppress_warnings(): Allow BlitzML to print warning messages to sys.stdout.