Read the rendered documentation at https://tbjohns.github.io/BlitzML/.

BlitzML API reference

This is the complete API reference for the BlitzML Python package.

Training L1-regularized models

L1 regularization is a popular approach to training sparse models. BlitzML efficiently solves L1-regularized problems with a variety of loss functions.

Problem classes

class blitzml.LassoProblem(A, b)[source]

Class for training sparse linear models with squared loss. The optimization objective is

\tfrac{1}2 ||A w - b||^2 + \lambda ||w||_1 .

Parameters:
  • A (numpy.ndarray or scipy.sparse matrix) – n x d design matrix.
  • b (numpy.ndarray) – Labels array of length n.

Blitz tries its best to avoid copying data, but depending on the formats of A and b, Blitz may make a copy of these arrays. To avoid copying, follow these guidelines:

  • If A is dense (numpy.ndarray), define A as an F-contiguous array. The dtype for A should match ctypes.c_double, ctypes.c_float, ctypes.c_int, or ctypes.c_bool—that is, A.dtype == ctypes.c_double evaluates to True, for example.
  • If A is sparse, define A as a scipy.sparse.csc_matrix. The dtype for A.indices should match type ctypes.c_int. The dtype for A.indptr should match type ctypes.c_size_t. BlitzML can work with double, float, int, and bool dtypes for A.data without copying.
  • The dtype for b should match type ctypes.c_double.

BlitzML will print a warning when objects with more than 1e7 elements are being copied. To suppress warnings, use blitzml.suppress_warnings().
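The dense-input guidelines above can be applied up front with NumPy. The following is a minimal sketch; the helper name as_no_copy_dense is hypothetical and not part of BlitzML. The conversion may itself copy the data once, but it happens at a point you control, and the resulting arrays already have the layout and dtype described above.

```python
import ctypes

import numpy as np


def as_no_copy_dense(A, b):
    """Convert (A, b) to the dense layout and dtypes recommended above."""
    # F-contiguous (column-major) storage with a double-precision dtype.
    A_f = np.asfortranarray(A, dtype=np.float64)
    # Labels as a one-dimensional double-precision array.
    b_d = np.ascontiguousarray(b, dtype=np.float64)
    return A_f, b_d


A, b = as_no_copy_dense([[1, 2], [3, 4], [5, 6]], [1, 0, 1])
# The same checks the guidelines describe.
print(A.flags["F_CONTIGUOUS"], A.dtype == ctypes.c_double, b.dtype == ctypes.c_double)
```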

compute_max_l1_penalty(include_bias_term=True)

Compute the smallest L1 penalty parameter for which all weights in the solution equal zero.

Parameters: include_bias_term (bool, optional) – Whether to include an unregularized bias term in the model. Default is True.
Returns: max_l1_penalty
Return type: float
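For the squared-loss objective of LassoProblem, this threshold has a well-known closed form: the largest absolute inner product between a column of A and the residual of the all-zero-weights model. The NumPy sketch below illustrates that formula only. It is not BlitzML's implementation, the function name lasso_max_l1_penalty is hypothetical, and it assumes a dense A.

```python
import numpy as np


def lasso_max_l1_penalty(A, b, include_bias_term=True):
    """Smallest penalty at which the Lasso solution has all-zero weights."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    # With an unregularized bias, the best zero-weight model predicts mean(b).
    residual = b - b.mean() if include_bias_term else b
    # Weights stay at zero once the penalty reaches max_j |a_j^T residual|,
    # where a_j is the j-th column of A.
    return float(np.max(np.abs(A.T @ residual)))
```

Other loss functions lead to a different residual term, so this sketch applies to the squared loss only.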
solve(l1_penalty, include_bias_term=True, initial_weights=None, stopping_tolerance=0.001, max_time=31540000.0, min_time=0.0, max_iterations=100000, verbose=False, log_directory=None)

Minimizes the objective

\sum_i L(a_i^T w, b_i) + \texttt{l1\_penalty} ||w||_1 ,

where L is the problem’s loss function.

Parameters:
  • l1_penalty (float > 0) – Regularization parameter for L1 norm penalty on weights. When this value is larger, the solution generally contains more zero entries and Blitz generally completes optimization faster.
  • include_bias_term (bool, optional) – Whether to include an unregularized bias parameter in the model. Default is True.
  • initial_weights (numpy.ndarray of length d, optional) – Initial weights to warm-start optimization. The solver requires less time if initialized to a good approximate solution. Default is None.
  • stopping_tolerance (float, optional) – Stopping tolerance for solver. Optimization terminates if (duality_gap / objective_value) is less than stopping_tolerance. Default is 1e-3.
  • max_time (float, optional) – Time limit in seconds for solving. If stopping tolerance is not reached, optimization terminates after this number of seconds. Default is 31540000.0 (about one year).
  • min_time (float, optional) – Minimum time in seconds for solving. Optimization continues until this amount of time passes, even after reaching stopping tolerance. Default is zero.
  • max_iterations (int, optional) – Iterations limit for algorithm. If stopping tolerance is not reached after this number of iterations, optimization terminates. Default is 100000.
  • verbose (bool, optional) – Whether to print information, such as objective value, to sys.stdout during optimization. Default is False.
  • log_directory (string, optional) – Path to existing directory for Blitz to log time and objective value information. This directory should be empty prior to solving. Default is None.
Returns:

solution

Return type:

BlitzMLSolution
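One common way to combine compute_max_l1_penalty, solve, and initial_weights is to solve for a decreasing sequence of penalties, warm-starting each solve from the previous weights. The sketch below assumes only the methods and attributes documented in this reference; the helper name solve_path and its fractions argument are hypothetical.

```python
def solve_path(problem, fractions=(0.5, 0.1, 0.01), include_bias_term=True,
               **solve_kwargs):
    """Solve for penalties fraction * max_l1_penalty, largest first."""
    max_penalty = problem.compute_max_l1_penalty(
        include_bias_term=include_bias_term
    )
    solutions = []
    weights = None  # the first solve has no warm start
    for fraction in sorted(fractions, reverse=True):
        solution = problem.solve(
            fraction * max_penalty,
            include_bias_term=include_bias_term,
            initial_weights=weights,
            **solve_kwargs,
        )
        weights = solution.weights  # starting point for the next penalty
        solutions.append(solution)
    return solutions
```

With BlitzML installed, a call might look like solve_path(blitzml.LassoProblem(A, b), stopping_tolerance=1e-4). Starting from large penalties is a design choice: those solves are sparse and, per the l1_penalty description above, generally finish faster.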

class blitzml.SparseLogisticRegressionProblem(A, b)[source]

Class for training sparse linear models with logistic loss. The optimization objective is

\sum_i \log(1 + \exp(-b_i a_i^T w)) + \lambda ||w||_1 ,

where i indexes the ith row in A and ith entry in b. Each label b_i should have value 1 or -1. BlitzML treats other label values as 1 if the value exceeds zero and -1 otherwise.

Calls to solve and compute_max_l1_penalty use interfaces identical to the same methods in blitzml.LassoProblem.
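As an illustration of the label convention and the loss term above, the following NumPy sketch evaluates the logistic loss for given scores a_i^T w. The function name logistic_loss is hypothetical, and the code is not taken from BlitzML.

```python
import numpy as np


def logistic_loss(scores, labels):
    """Sum of log(1 + exp(-b_i * score_i)) with labels mapped to {-1, +1}."""
    scores = np.asarray(scores, dtype=float)
    # Positive label values become 1; everything else becomes -1.
    signs = np.where(np.asarray(labels) > 0, 1.0, -1.0)
    # logaddexp(0, x) equals log(1 + exp(x)) but does not overflow.
    return float(np.sum(np.logaddexp(0.0, -signs * scores)))
```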

class blitzml.SparseHuberProblem(A, b)[source]

Class for training sparse linear models with Huber loss. The optimization objective is

\sum_i L(a_i^T w, b_i) + \lambda ||w||_1 ,

where

L(a_i^T w, b_i) = \left\{
\begin{array}{lll}
  \tfrac{1}{2} (a_i^T w - b_i)^2 & & \text{if}\ |a_i^T w - b_i| < 1  \\[0.4em]
  a_i^T w - b_i - \tfrac{1}{2}   & & \text{if}\ a_i^T w - b_i \geq 1 \\[0.4em]
  b_i - a_i^T w - \tfrac{1}{2}   & & \text{otherwise.}
\end{array} \right.

Here i indexes the ith row in A and ith entry in b.

Calls to solve and compute_max_l1_penalty use interfaces identical to those in blitzml.LassoProblem.
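The piecewise definition above is quadratic for residuals smaller than 1 in absolute value and linear beyond that. A minimal NumPy sketch of the same formula follows; the name huber_loss is hypothetical.

```python
import numpy as np


def huber_loss(scores, targets):
    """Sum of the Huber loss (threshold 1) over residuals score_i - b_i."""
    r = np.asarray(scores, dtype=float) - np.asarray(targets, dtype=float)
    abs_r = np.abs(r)
    # Quadratic branch for |r| < 1, linear branch |r| - 1/2 otherwise.
    return float(np.sum(np.where(abs_r < 1.0, 0.5 * r ** 2, abs_r - 0.5)))
```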

class blitzml.SparseSquaredHingeProblem(A, b)[source]

Class for training sparse linear models with squared hinge loss. The optimization objective is

\sum_i \tfrac{1}{2} (1 - b_i a_i^T w)_+^2 + \lambda ||w||_1 ,

where the “+” subscript denotes the rectifier function. Each label should have value 1 or -1.

Calls to solve and compute_max_l1_penalty use interfaces identical to those in blitzml.LassoProblem.
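Here the rectifier is max(0, x), so examples with margin b_i a_i^T w of at least 1 contribute nothing. A short NumPy sketch of the loss term, under the hypothetical name squared_hinge_loss:

```python
import numpy as np


def squared_hinge_loss(scores, labels):
    """Sum of 0.5 * max(0, 1 - b_i * score_i)^2 for labels in {-1, +1}."""
    margins = np.asarray(labels, dtype=float) * np.asarray(scores, dtype=float)
    return float(np.sum(0.5 * np.maximum(0.0, 1.0 - margins) ** 2))
```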

class blitzml.SparseSmoothedHingeProblem(A, b)[source]

Class for training sparse linear models with smoothed hinge loss. The optimization objective is

\sum_i L(a_i^T w, b_i) + \lambda ||w||_1 ,

where

L(a_i^T w, b_i) = \left\{
\begin{array}{lll}
  \tfrac{1}{2} - b_i a_i^T w       && \text{if}\ b_i a_i^T w < 0       \\[0.4em]
  \tfrac{1}{2} (1 - b_i a_i^T w)^2 && \text{if}\ b_i a_i^T w \in [0,1) \\[0.4em]
  0                                && \text{otherwise.}
\end{array} \right.

Each label should have value 1 or -1.
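The three branches meet continuously at margins 0 and 1, which the sketch below makes easy to check numerically. It is a direct NumPy transcription of the definition above; the name smoothed_hinge_loss is hypothetical.

```python
import numpy as np


def smoothed_hinge_loss(scores, labels):
    """Sum of the smoothed hinge loss over margins m_i = b_i * score_i."""
    m = np.asarray(labels, dtype=float) * np.asarray(scores, dtype=float)
    per_example = np.where(
        m < 0.0,
        0.5 - m,                                      # linear branch
        np.where(m < 1.0, 0.5 * (1.0 - m) ** 2, 0.0)  # quadratic, then zero
    )
    return float(np.sum(per_example))
```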

Solution classes

class blitzml._sparse_linear.LassoSolution[source]

Solution class for LassoProblem.

bias

Value of model’s bias term.

compute_loss(A, b)

Compute the sum

\sum_i L(a_i^T w, b_i) ,

where L is the problem’s loss function.

Parameters:
  • A (numpy.ndarray or scipy.sparse matrix) – n x d design matrix for evaluating loss.
  • b (numpy.ndarray) – Corresponding labels array of length n.
Returns:

loss

Return type:

float

dual_solution

Dual solution to optimization problem.

duality_gap

Duality gap between primal and dual solutions (which upper bounds the suboptimality of these solutions).

objective_value

Objective value of solution.

predict(A)

Predict label values from feature vectors.

Parameters: A (numpy.ndarray or scipy.sparse matrix) – n x d design matrix to make predictions for.
Returns: predictions
Return type: numpy.ndarray of length n

save(filepath)

Save model to disk.

Parameters: filepath (string) – Location to save solution.

solution_status

Status of Blitz algorithm upon returning model.

weights

Array of model’s weight values.
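The attributes above can be gathered into a short report after a solve. The sketch below reads only the documented attributes; the helper name summarize_solution and the dictionary keys it returns are hypothetical. The relative gap it computes is the same ratio that stopping_tolerance is compared against.

```python
def summarize_solution(solution):
    """Collect sparsity and convergence information from a solution object."""
    weights = list(solution.weights)
    objective = solution.objective_value
    return {
        "status": solution.solution_status,
        "objective_value": objective,
        # duality_gap / objective_value; undefined when the objective is zero.
        "relative_gap": solution.duality_gap / objective if objective else None,
        "nonzero_weights": sum(1 for w in weights if w != 0),
        "total_weights": len(weights),
        "bias": solution.bias,
    }
```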

class blitzml._sparse_linear.SparseLogisticRegressionSolution[source]

Solution class for SparseLogisticRegressionProblem.

Except for the following additional method, interface is identical to LassoSolution’s interface.

predict_probabilities(A)[source]

Predict probability values from feature vectors.

Parameters: A (numpy.ndarray or scipy.sparse matrix) – n x d design matrix to predict probabilities for.
Returns: predicted_probabilities
Return type: numpy.ndarray (length n with values in [0, 1])
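For a model trained with logistic loss, the usual probability of the label 1 is the logistic function of a_i^T w plus the bias. The sketch below assumes that convention, which this reference does not state explicitly, so treat it as an illustration and not as a description of BlitzML's internals. The name logistic_probabilities is hypothetical, and A is assumed to be dense.

```python
import numpy as np


def logistic_probabilities(A, weights, bias=0.0):
    """Estimate P(label = 1) for each row of A under a logistic model."""
    A = np.asarray(A, dtype=float)
    scores = A @ np.asarray(weights, dtype=float) + bias
    # sigmoid(z) = exp(-log(1 + exp(-z))), evaluated without overflow.
    return np.exp(-np.logaddexp(0.0, -scores))
```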
class blitzml._sparse_linear.SparseHuberSolution[source]

Solution class for SparseHuberProblem.

Interface is identical to LassoSolution’s interface.

class blitzml._sparse_linear.SparseSquaredHingeSolution[source]

Solution class for SparseSquaredHingeProblem.

Interface is identical to LassoSolution’s interface.

class blitzml._sparse_linear.SparseSmoothedHingeSolution[source]

Solution class for SparseSmoothedHingeProblem.

Interface is identical to LassoSolution’s interface.

Utility functions

blitzml.load_solution(filepath)[source]

Load BlitzMLSolution from disk.

Parameters: filepath (string) – Path to saved solution.
Returns: solution
Return type: BlitzMLSolution

blitzml.parse_log_directory(log_directory)[source]

Parse files logged by BlitzML during a solve call.

Parameters: log_directory (string) – Path to directory containing log files to parse.
Returns: logs – Iterable over information logged by BlitzML. Each item is a dictionary of logged values.
Return type: generator
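Because the dictionary keys are not listed in this reference, a practical first step is to inspect which fields were logged. The helper below is a hypothetical sketch (the name logged_fields is not part of BlitzML). It accepts any iterable of dictionaries, so the generator returned by parse_log_directory can be passed in directly.

```python
def logged_fields(logs):
    """Return the sorted list of keys seen across all logged records."""
    fields = set()
    for record in logs:  # each record is a dictionary of logged values
        fields.update(record.keys())
    return sorted(fields)
```

With BlitzML installed, a call might look like logged_fields(blitzml.parse_log_directory(log_directory)). The generator is consumed by this call, so parse the directory again before iterating over the records themselves.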
blitzml.suppress_warnings()[source]

Stop BlitzML from printing warning messages to sys.stdout. By default, warning messages are unsuppressed.

blitzml.unsuppress_warnings()[source]

Allow BlitzML to print warning messages to sys.stdout.
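To silence warnings for only part of a program, the two functions can be paired in a context manager. This is a minimal sketch: the name warnings_suppressed is hypothetical, and the two callables are passed in as arguments so that the helper does not import BlitzML itself.

```python
from contextlib import contextmanager


@contextmanager
def warnings_suppressed(suppress, unsuppress):
    """Call suppress() on entry and unsuppress() on exit, even on errors."""
    suppress()
    try:
        yield
    finally:
        unsuppress()
```

With BlitzML installed, usage might look like "with warnings_suppressed(blitzml.suppress_warnings, blitzml.unsuppress_warnings): ...". This reference documents no way to query the current setting, so the helper always re-enables warnings on exit, even if they were already suppressed beforehand.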