pybaselines._banded_utils
Module Contents
Classes
An object for setting up and solving banded penalized least squares linear systems. 
Functions
Creates the diagonals of the finite difference penalty matrix. 

Creates the finite difference penalty matrix. 

Creates an norder finitedifference matrix. 
 class pybaselines._banded_utils.PenalizedSystem(data_size, lam=1, diff_order=2, allow_lower=True, reverse_diags=None, allow_pentapy=True, padding=0, pentapy_solver=2)[source]
An object for setting up and solving banded penalized least squares linear systems.
 Attributes:
 diff_orderint
The difference order of the penalty.
 lowerbool
If True, the penalty uses only the lower bands of the symmetric banded penalty. Will use
scipy.linalg.solveh_banded()
for solving. If False, contains both the upper and lower bands of the penalty and will use eitherscipy.linalg.solve_banded()
(if using_pentapy is False) or_pentapy_solver()
when solving. main_diagonalnumpy.ndarray
The main diagonal of the penalty matrix. Is updated when adding additional matrices to the penalty through
add_penalty()
, and takes into account whether the penalty is only the lower bands or the total bands. main_diagonal_indexint
The index of the main diagonal for penalty. Is updated when adding additional matrices to the penalty through
add_penalty()
, and takes into account whether the penalty is only the lower bands or the total bands. num_bandsint
The number of bands in the penalty. The number of bands is assumbed to be symmetric, so the number of upper and lower bands should both be equal to num_bands.
 original_diagonalsnumpy.ndarray
The original penalty diagonals before multiplying by lam or adding any padding. Maintained so that repeated computations with different lam values can be quickly set up. original_diagonals can be either the full or lower bands of the penalty, and may be reveresed, it depends on the set up. Reset by calling
reset_diagonals()
. penaltynumpy.ndarray
The current penalty. Originally is original_diagonals after multiplying by lam and applying padding, but can also be changed by calling
add_penalty()
. Reset by callingreset_diagonals()
. pentapy_solverint or str
The integer or string designating which solver to use if using pentapy. See
pentapy.solve()
for available options, although 1 or 2 are the most relevant options. Default is 2. reversedbool
If True, the penalty is reversed of the typical LAPACK banded format. Useful if multiplying the penalty with an array since the rows get shifted, or if using pentapy's solver.
 using_pentapybool
If True, will use pentapy's solver when solving.
 add_diagonal(value)[source]
Adds a diagonal array or float to the original penalty matrix.
 Parameters:
 valuefloat or numpy.ndarray
The number or array to add to the main diagonal of the penalty.
 Returns:
 numpy.ndarray
The penalty with the main diagonal updated.
 add_penalty(penalty)[source]
Updates self.penalty with an additional penalty and updates the bands.
 Parameters:
 penaltyarraylike
The additional penalty to add to self.penalty.
 Returns:
 numpy.ndarray
The updated self.penalty.
 reset_diagonals(lam=1, diff_order=2, allow_lower=True, reverse_diags=None, allow_pentapy=True, padding=0)[source]
Resets the diagonals of the system and all of the attributes.
Useful for reusing the penalized system for a different lam value.
 Parameters:
 lamfloat, optional
The penalty factor applied to the difference matrix. Larger values produce smoother results. Must be greater than 0. Default is 1.
 diff_orderint, optional
The difference order of the penalty. Default is 2 (second order difference).
 allow_lowerbool, optional
If True (default), will allow only using the lower bands of the penalty matrix, which allows using
scipy.linalg.solveh_banded()
instead of the slightly slowerscipy.linalg.solve_banded()
. reverse_diags{None, False, True}, optional
If True, will reverse the order of the diagonals of the squared difference matrix. If False, will never reverse the diagonals. If None (default), will only reverse the diagonals if using pentapy's solver.
 allow_pentapybool, optional
If True (default), will allow using pentapy's solver if diff_order is 2 and pentapy is installed. pentapy's solver is faster than scipy's banded solvers.
 paddingint, optional
The number of extra layers of zeros to add to the bottom and potentially the top if the full bands are used. Default is 0, which adds no extra layers. Negative padding is treated as equivalent to 0.
 reverse_penalty()[source]
Reverses the penalty and original diagonals for the system.
 Raises:
 ValueError
Raised if self.lower is True, since reversing the half diagonals does not make physical sense.
 solve(lhs, rhs, overwrite_ab=False, overwrite_b=False, check_finite=False, l_and_u=None, check_output=False)[source]
Solves the equation
A @ x = rhs
, given A in banded format as lhs. Parameters:
 lhsarraylike, shape (M, N)
The lefthand side of the equation, in banded format. lhs is assumed to be some slight modification of self.penalty in the same format (reversed, lower, number of bands, etc. are all the same).
 rhsarraylike, shape (N,)
The righthand side of the equation.
 overwrite_abbool, optional
Whether to overwrite lhs when using
scipy.linalg.solveh_banded()
orscipy.linalg.solve_banded()
. Default is False. overwrite_bbool, optional
Whether to overwrite rhs when using
scipy.linalg.solveh_banded()
orscipy.linalg.solve_banded()
. Default is False. check_finitebool, optional
Whether to check if the inputs are finite when using
scipy.linalg.solveh_banded()
orscipy.linalg.solve_banded()
. Default is False. l_and_uContainer(int, int), optional
The number of lower and upper bands in lhs when using
scipy.linalg.solve_banded()
. Default is None, which uses (len(lhs) // 2
,len(lhs) // 2
). check_outputbool, optional
If True, will check the output for nonfinite values when using
_pentapy_solver()
. Default is False.
 Returns:
 outputnumpy.ndarray, shape (N,)
The solution to the linear system, x.
 pybaselines._banded_utils.diff_penalty_diagonals(data_size, diff_order=2, lower_only=True, padding=0)[source]
Creates the diagonals of the finite difference penalty matrix.
If D is the finite difference matrix, then the finite difference penalty matrix is defined as
D.T @ D
. The penalty matrix is banded and symmetric, so the nonzero diagonal bands can be computed efficiently. Parameters:
 data_sizeint
The number of data points.
 diff_orderint, optional
The integer differential order; must be >= 0. Default is 2.
 lower_onlybool, optional
If True (default), will return only the lower diagonals of the matrix. If False, will include all diagonals of the matrix.
 paddingint, optional
The number of extra layers of zeros to add to the bottom and top, if lower_only is True. Useful if working with other diagonal arrays with a different number of rows. Default is 0, which adds no extra layers. Negative padding is treated as equivalent to 0.
 Returns:
 diagonalsnumpy.ndarray
The diagonals of the finite difference penalty matrix.
 Raises:
 ValueError
Raised if diff_order is negative or if data_size less than 1.
Notes
Equivalent to calling:
from pybaselines.utils import difference_matrix diff_matrix = difference_matrix(data_size, diff_order) output = (diff_matrix.T @ diff_matrix).todia().data[::1] if lower_only: output = output[diff_order:]
but is several orders of magnitude times faster.
The data is output in the banded format required by SciPy's solve_banded and solveh_banded functions.
 pybaselines._banded_utils.diff_penalty_matrix(data_size, diff_order=2, diff_format='csr')[source]
Creates the finite difference penalty matrix.
If D is the finite difference matrix, then the finite difference penalty matrix is defined as
D.T @ D
. Parameters:
 data_sizeint
The number of data points.
 diff_orderint, optional
The integer differential order; must be >= 0. Default is 2.
 diff_formatstr or None, optional
The sparse format to use for the difference matrix. Default is 'csr'.
 Returns:
 penalty_matrixscipy.sparse.spmatrix or scipy.sparse._sparray
The sparse difference penalty matrix.
 Raises:
 ValueError
Raised if diff_order is greater or equal to data_size.
Notes
Equivalent to calling:
from pybaselines.utils import difference_matrix diff_matrix = difference_matrix(data_size, diff_order) penalty_matrix = diff_matrix.T @ diff_matrix
but should be faster since the bands within the penalty matrix can be gotten without the matrix multiplication.
 pybaselines._banded_utils.difference_matrix(data_size, diff_order=2, diff_format=None)[source]
Creates an norder finitedifference matrix.
 Parameters:
 data_sizeint
The number of data points.
 diff_orderint, optional
The integer differential order; must be >= 0. Default is 2.
 diff_formatstr or None, optional
The sparse format to use for the difference matrix. Default is None, which will use the default specified in
scipy.sparse.diags()
.
 Returns:
 diff_matrixscipy.sparse.spmatrix or scipy.sparse._sparray
The sparse difference matrix.
 Raises:
 ValueError
Raised if diff_order or data_size is negative.
Notes
The resulting matrices are sparse versions of:
import numpy as np np.diff(np.eye(data_size), diff_order, axis=0)
This implementation allows using the differential matrices are they are written in various publications, ie.
D.T @ D
.Most baseline algorithms use 2nd order differential matrices when doing penalized least squared fitting or Whittakersmoothingbased fitting.