pybaselines._spline_utils
Module Contents
Classes
A Penalized Spline, which penalizes the difference of the spline coefficients.
- class pybaselines._spline_utils.PSpline(x, num_knots=100, spline_degree=3, check_finite=False, lam=1, diff_order=2, allow_lower=True, reverse_diags=False)
A Penalized Spline, which penalizes the difference of the spline coefficients.
Penalized splines (P-Splines) are solved with the following equation
(B.T @ W @ B + P) c = B.T @ W @ y
where c is the vector of spline coefficients, B is the spline basis, the weights are the diagonal of W, the penalty is P, and y is the fit data. The penalty P is usually in the form lam * D.T @ D, where lam is a penalty factor and D is the matrix version of the finite difference operator.
References
Eilers, P., et al. Twenty years of P-splines. SORT: Statistics and Operations Research Transactions, 2015, 39(2), 149-186.
Eilers, P., et al. Splines, knots, and penalties. Wiley Interdisciplinary Reviews: Computational Statistics, 2010, 2(6), 637-653.
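The penalized system described above can be sketched with dense NumPy arrays. This is only an illustration of the equation, not the code PSpline runs: the uniform knot layout, the helper name dense_pspline_fit, and the use of SciPy's BSpline to build the basis are assumptions made for the example.

```python
import numpy as np
from scipy.interpolate import BSpline


def dense_pspline_fit(x, y, weights, num_knots=20, spline_degree=3,
                      lam=1.0, diff_order=2):
    """Solve (B.T @ W @ B + P) c = B.T @ W @ y with dense matrices."""
    # Inner knots span the data (endpoints included); spline_degree extra
    # knots on each side give K = num_knots + 2 * spline_degree in total.
    # The uniform spacing is an assumption of this sketch.
    inner = np.linspace(x.min(), x.max(), num_knots)
    dx = inner[1] - inner[0]
    left = inner[0] - dx * np.arange(spline_degree, 0, -1)
    right = inner[-1] + dx * np.arange(1, spline_degree + 1)
    knots = np.concatenate((left, inner, right))

    # M = K - spline_degree - 1 basis functions. Evaluating a BSpline with
    # identity coefficients returns every basis function at every x.
    num_basis = len(knots) - spline_degree - 1
    basis = BSpline(knots, np.eye(num_basis), spline_degree)(x)  # (N, M)

    # D is the finite difference operator, shape (M - diff_order, M).
    diff_matrix = np.diff(np.eye(num_basis), n=diff_order, axis=0)
    penalty = lam * diff_matrix.T @ diff_matrix

    lhs = basis.T @ (weights[:, None] * basis) + penalty
    rhs = basis.T @ (weights * y)
    coef = np.linalg.solve(lhs, rhs)
    return basis, coef, lhs, rhs


x = np.linspace(0, 10, 200)
y = np.sin(x) + 0.1 * np.cos(25 * x)  # smooth signal plus a fast ripple
weights = np.ones_like(x)
basis, coef, lhs, rhs = dense_pspline_fit(x, y, weights)
fit = basis @ coef  # the fitted spline evaluated at x
```

A dense solve like this scales poorly with the number of basis functions, which is presumably why the class works with banded storage instead.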
- Attributes:
- basis : scipy.sparse.csr.csr_matrix, shape (N, M)
The spline basis. Has a shape of (N, M), where N is the number of points in x, and M is the number of basis functions (equal to K - spline_degree - 1 or equivalently num_knots + spline_degree - 1).
- coef : None or numpy.ndarray, shape (M,)
The spline coefficients. Is None if solve_pspline() has not been called at least once.
- knots : numpy.ndarray, shape (K,)
The knots for the spline. Has a shape of K, which is equal to num_knots + 2 * spline_degree.
- num_knots : int
The number of internal knots (including the endpoints). The total number of knots for the spline, K, is equal to num_knots + 2 * spline_degree.
- spline_degree : int
The degree of the spline (e.g., a cubic spline would have a spline_degree of 3).
- x : numpy.ndarray, shape (N,)
The x-values for the spline.
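The size relations quoted in the attribute descriptions can be checked with a few lines of arithmetic. The helper name pspline_sizes is hypothetical and exists only for this sketch.

```python
def pspline_sizes(num_knots=100, spline_degree=3):
    """Return (K, M): total knot count and number of basis functions."""
    total_knots = num_knots + 2 * spline_degree   # K
    num_basis = total_knots - spline_degree - 1   # M = K - spline_degree - 1
    # The two expressions given for M must agree.
    assert num_basis == num_knots + spline_degree - 1
    return total_knots, num_basis


print(pspline_sizes())       # (106, 102)
print(pspline_sizes(20, 1))  # (22, 20)
```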
- property tck
The knots, spline coefficients, and spline degree to reconstruct the spline.
Convenience property for easily reconstructing the last solved spline with outside modules, such as with SciPy's BSpline, to allow for other usages such as evaluating with different x-values.
- Raises:
- ValueError
Raised if solve_pspline has not been called yet, meaning that the spline has not yet been constructed.
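As a rough sketch of that use case, a (knots, coefficients, degree) tuple can be passed straight to SciPy's BSpline. The tuple below is a made-up stand-in for the value of tck, so the knots and coefficients are illustrative only and do not come from a real fit.

```python
import numpy as np
from scipy.interpolate import BSpline

# Stand-in for the tck property: (knots, coefficients, degree).
# These values are invented for illustration.
spline_degree = 3
num_knots = 10
inner = np.linspace(0.0, 1.0, num_knots)
dx = inner[1] - inner[0]
knots = np.concatenate((
    inner[0] - dx * np.arange(spline_degree, 0, -1),
    inner,
    inner[-1] + dx * np.arange(1, spline_degree + 1),
))
coef = np.linspace(-1.0, 1.0, len(knots) - spline_degree - 1)
tck = (knots, coef, spline_degree)

# Rebuild the spline and evaluate it on x-values that were never fitted.
spline = BSpline(*tck)
new_x = np.linspace(0.0, 1.0, 37)
new_y = spline(new_x)
# Other BSpline functionality, such as derivatives, is also available.
derivative = spline.derivative()(new_x)
```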
- add_diagonal(value)
Adds a diagonal array or float to the original penalty matrix.
- Parameters:
- value : float or numpy.ndarray
The number or array to add to the main diagonal of the penalty.
- Returns:
- numpy.ndarray
The penalty with the main diagonal updated.
- add_penalty(penalty)
Updates self.penalty with an additional penalty and updates the bands.
- Parameters:
- penalty : array-like
The additional penalty to add to self.penalty.
- Returns:
- numpy.ndarray
The updated self.penalty.
- reset_diagonals(lam=1, diff_order=2, allow_lower=True, reverse_diags=None, allow_pentapy=True, padding=0)
Resets the diagonals of the system and all of the attributes.
Useful for reusing the penalized system for a different lam value.
- Parameters:
- lam : float, optional
The penalty factor applied to the difference matrix. Larger values produce smoother results. Must be greater than 0. Default is 1.
- diff_order : int, optional
The difference order of the penalty. Default is 2 (second order difference).
- allow_lower : bool, optional
If True (default), will allow only using the lower bands of the penalty matrix, which allows using scipy.linalg.solveh_banded() instead of the slightly slower scipy.linalg.solve_banded().
- reverse_diags : {None, False, True}, optional
If True, will reverse the order of the diagonals of the squared difference matrix. If False, will never reverse the diagonals. If None (default), will only reverse the diagonals if using pentapy's solver.
- allow_pentapy : bool, optional
If True (default), will allow using pentapy's solver if diff_order is 2 and pentapy is installed. pentapy's solver is faster than scipy's banded solvers.
- padding : int, optional
The number of extra layers of zeros to add to the bottom and potentially the top if the full bands are used. Default is 0, which adds no extra layers. Negative padding is treated as equivalent to 0.
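To make the difference between lower-only and full bands concrete, the sketch below packs lam * D.T @ D into the two banded layouts that SciPy's solvers expect. The function penalty_bands is a hypothetical helper written for this example and goes through a dense matrix for clarity; it is not how the library builds its diagonals.

```python
import numpy as np


def penalty_bands(size, lam=1.0, diff_order=2, lower_only=True):
    """Return lam * D.T @ D in LAPACK banded storage."""
    diff_matrix = np.diff(np.eye(size), n=diff_order, axis=0)
    dense = lam * diff_matrix.T @ diff_matrix
    # Full storage (scipy.linalg.solve_banded): row diff_order - k holds the
    # k-th diagonal; superdiagonals are zero-padded on the left and
    # subdiagonals on the right.
    full = np.zeros((2 * diff_order + 1, size))
    for k in range(-diff_order, diff_order + 1):
        diagonal = np.diagonal(dense, k)
        if k >= 0:
            full[diff_order - k, k:] = diagonal
        else:
            full[diff_order - k, :size + k] = diagonal
    # Lower storage (scipy.linalg.solveh_banded with lower=True) keeps only
    # the main diagonal and subdiagonals, since the penalty is symmetric.
    return full[diff_order:] if lower_only else full


lower = penalty_bands(8, lam=10.0, lower_only=True)
full = penalty_bands(8, lam=10.0, lower_only=False)
print(lower.shape, full.shape)  # (3, 8) (5, 8)
# An interior column of the full bands is lam * [1, -4, 6, -4, 1].
print(full[:, 4])
```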
- reset_penalty_diagonals(lam=1, diff_order=2, allow_lower=True, reverse_diags=False)
Resets the penalty diagonals of the system and all of the attributes.
Useful for reusing the penalty diagonals without having to recalculate the spline basis.
- Parameters:
- lam : float, optional
The penalty factor applied to the difference matrix. Larger values produce smoother results. Must be greater than 0. Default is 1.
- diff_order : int, optional
The difference order of the penalty. Default is 2 (second order difference).
- allow_lower : bool, optional
If True (default), will allow only using the lower bands of the penalty matrix, which allows using scipy.linalg.solveh_banded() instead of the slightly slower scipy.linalg.solve_banded().
- reverse_diags : bool, optional
If True, will reverse the order of the diagonals of the squared difference matrix. If False (default), will never reverse the diagonals.
Notes
allow_pentapy is always set to False since the time needed to go from a lower to full banded matrix and shifting the rows removes any speedup from using pentapy's solver. It also reduces the complexity of setting up the equations.
Adds padding to the penalty diagonals to accommodate the different shapes of the spline basis and the penalty to speed up calculations when the two are added.
- reverse_penalty()
Reverses the penalty and original diagonals for the system.
- Raises:
- ValueError
Raised if self.lower is True, since reversing the half diagonals does not make physical sense.
- same_basis(num_knots=100, spline_degree=3)
Checks whether the current basis is equivalent to one built from the input number of knots and spline degree.
- Parameters:
- num_knots : int, optional
The number of knots for the new spline. Default is 100.
- spline_degree : int, optional
The degree of the new spline. Default is 3.
- Returns:
- bool
True if the input number of knots and spline degree are equivalent to the current spline basis of the object.
- solve(lhs, rhs, overwrite_ab=False, overwrite_b=False, check_finite=False, l_and_u=None, check_output=False)
Solves the equation A @ x = rhs, given A in banded format as lhs.
- Parameters:
- lhs : array-like, shape (M, N)
The left-hand side of the equation, in banded format. lhs is assumed to be some slight modification of self.penalty in the same format (reversed, lower, number of bands, etc. are all the same).
- rhs : array-like, shape (N,)
The right-hand side of the equation.
- overwrite_ab : bool, optional
Whether to overwrite lhs when using scipy.linalg.solveh_banded() or scipy.linalg.solve_banded(). Default is False.
- overwrite_b : bool, optional
Whether to overwrite rhs when using scipy.linalg.solveh_banded() or scipy.linalg.solve_banded(). Default is False.
- check_finite : bool, optional
Whether to check if the inputs are finite when using scipy.linalg.solveh_banded() or scipy.linalg.solve_banded(). Default is False.
- l_and_u : Container(int, int), optional
The number of lower and upper bands in lhs when using scipy.linalg.solve_banded(). Default is None, which uses (len(lhs) // 2, len(lhs) // 2).
- check_output : bool, optional
If True, will check the output for non-finite values when using _pentapy_solver(). Default is False.
- Returns:
- output : numpy.ndarray, shape (N,)
The solution to the linear system, x.
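The two SciPy solvers named in the parameters can be compared directly on a small system. The sketch below calls SciPy itself rather than this method, and the test matrix (identity plus a second-order difference penalty) is an assumption chosen because it is symmetric, positive definite, and pentadiagonal.

```python
import numpy as np
from scipy.linalg import solve_banded, solveh_banded

size, lam, num_bands = 50, 5.0, 2
diff_matrix = np.diff(np.eye(size), n=2, axis=0)
dense_lhs = np.eye(size) + lam * diff_matrix.T @ diff_matrix

# Lower banded layout: main diagonal first, subdiagonals padded on the right.
lower_lhs = np.array([
    np.pad(np.diagonal(dense_lhs, -k), (0, k)) for k in range(num_bands + 1)
])
# Full banded layout: superdiagonals (padded on the left) stacked on top.
upper_rows = np.array([
    np.pad(np.diagonal(dense_lhs, k), (k, 0)) for k in range(num_bands, 0, -1)
])
full_lhs = np.vstack((upper_rows, lower_lhs))

rng = np.random.default_rng(0)
rhs = rng.normal(size=size)

# With full bands, the default band counts are an even split of the rows.
l_and_u = (len(full_lhs) // 2, len(full_lhs) // 2)
x_full = solve_banded(l_and_u, full_lhs, rhs, check_finite=False)
# With lower bands only, the symmetric positive definite solver applies.
x_lower = solveh_banded(lower_lhs, rhs, lower=True, check_finite=False)
```

Both calls should return the same solution; the lower-band route simply stores and factorizes about half as many diagonals.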
- solve_pspline(y, weights, penalty=None, rhs_extra=None)
Solves the coefficients for a weighted penalized spline.
Solves the linear equation (B.T @ W @ B + P) c = B.T @ W @ y for the spline coefficients, c, given the spline basis, B, the weights (diagonal of W), the penalty P, and y, and returns the resulting spline, B @ c. Attempts to calculate B.T @ W @ B and B.T @ W @ y as a banded system to speed up the calculation.
- Parameters:
- y : numpy.ndarray, shape (N,)
The y-values for fitting the spline.
- weights : numpy.ndarray, shape (N,)
The weights for each y-value.
- penalty : numpy.ndarray, shape (D, N)
The finite difference penalty matrix, in LAPACK's lower banded format (see scipy.linalg.solveh_banded()) if lower_only is True or the full banded format (see scipy.linalg.solve_banded()) if lower_only is False.
- rhs_extra : float or numpy.ndarray, shape (N,), optional
If supplied, rhs_extra will be added to the right hand side (B.T @ W @ y) of the equation before solving. Default is None, which adds nothing.
- Returns:
- numpy.ndarray, shape (N,)
The spline, corresponding to B @ c, where c are the solved spline coefficients and B is the spline basis.
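The reason a banded solve is possible can be seen in a small experiment: each B-spline basis function overlaps only its spline_degree nearest neighbors, so B.T @ W @ B has no entries beyond that many diagonals. The sketch below is an illustration under assumed inputs (uniform knots, a basis built with SciPy's BSpline, dense intermediate matrices) and is not the method's actual implementation.

```python
import numpy as np
from scipy.interpolate import BSpline
from scipy.linalg import solveh_banded

spline_degree, num_knots, lam = 3, 15, 1.0
x = np.linspace(0.0, 1.0, 120)
inner = np.linspace(x.min(), x.max(), num_knots)
dx = inner[1] - inner[0]
knots = np.concatenate((
    inner[0] - dx * np.arange(spline_degree, 0, -1),
    inner,
    inner[-1] + dx * np.arange(1, spline_degree + 1),
))
num_basis = len(knots) - spline_degree - 1
basis = BSpline(knots, np.eye(num_basis), spline_degree)(x)

weights = np.linspace(0.5, 1.5, x.size)
y = np.exp(-x)
btwb = basis.T @ (weights[:, None] * basis)  # B.T @ W @ B
btwy = basis.T @ (weights * y)               # B.T @ W @ y

# Everything above the spline_degree-th superdiagonal is zero.
bandwidth_ok = np.allclose(np.triu(btwb, spline_degree + 1), 0.0)

# A second-order penalty has fewer bands (2) than B.T @ W @ B (3), so the
# sum still fits in spline_degree + 1 lower bands.
diff_matrix = np.diff(np.eye(num_basis), n=2, axis=0)
lhs = btwb + lam * diff_matrix.T @ diff_matrix
lower_lhs = np.array([
    np.pad(np.diagonal(lhs, -k), (0, k)) for k in range(spline_degree + 1)
])

coef_banded = solveh_banded(lower_lhs, btwy, lower=True)
coef_dense = np.linalg.solve(lhs, btwy)
fit = basis @ coef_banded  # the spline, B @ c
```

The mismatch in band counts between the basis term and the penalty is the same shape difference that the padding note for reset_penalty_diagonals refers to.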