pybaselines.two_d._spline_utils
Module Contents
Classes
PSpline2D | A Penalized Spline, which penalizes the difference of the spline coefficients.
- class pybaselines.two_d._spline_utils.PSpline2D(x, z, num_knots=100, spline_degree=3, check_finite=False, lam=1, diff_order=2)
A Penalized Spline, which penalizes the difference of the spline coefficients.
Penalized splines (P-Splines) are solved with the following equation
(B.T @ W @ B + P) c = B.T @ W @ y
where c is the spline coefficients, B is the spline basis, the weights are the diagonal of W, the penalty is P, and y is the fit data. The penalty P is usually in the form lam * D.T @ D, where lam is a penalty factor and D is the matrix version of the finite difference operator.
Notes
If the penalty is symmetric, the sparse system could be solved much faster using CHOLMOD from SuiteSparse (https://github.com/DrTimothyAldenDavis/SuiteSparse) through the python bindings provided by scikit-sparse (https://github.com/scikit-sparse/scikit-sparse), but it is not worth implementing here since this code will rarely be used.
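As an illustration of the penalized system described above, the following is a minimal one-dimensional sketch that solves it with SciPy's general sparse solver. It is not the implementation used by PSpline2D; the function name pspline_1d, the uniform knot placement, and the way the basis is built are all assumptions made for the example.

```python
import numpy as np
from scipy.interpolate import BSpline
from scipy.sparse import csc_matrix, diags
from scipy.sparse.linalg import spsolve


def pspline_1d(x, y, weights, num_knots=20, spline_degree=3, lam=1.0, diff_order=2):
    """Hypothetical helper: solve (B.T @ W @ B + P) c = B.T @ W @ y in 1-D."""
    # Uniform inner knots (endpoints included), padded with spline_degree
    # extra knots on each side.
    inner = np.linspace(x.min(), x.max(), num_knots)
    pad = (inner[1] - inner[0]) * np.arange(1, spline_degree + 1)
    knots = np.concatenate((inner[0] - pad[::-1], inner, inner[-1] + pad))

    # Basis B: evaluate every basis function at x by passing an identity
    # matrix as the coefficients. Shape is (N, num_knots + spline_degree - 1).
    num_bases = len(knots) - spline_degree - 1
    B = csc_matrix(BSpline(knots, np.eye(num_bases), spline_degree)(x))

    # Penalty P = lam * D.T @ D, with D the finite difference matrix.
    D = np.diff(np.eye(num_bases), diff_order, axis=0)
    P = csc_matrix(lam * (D.T @ D))

    lhs = (B.T @ diags(weights) @ B + P).tocsc()
    rhs = B.T @ (weights * y)
    coef = spsolve(lhs, rhs)
    return B @ coef, coef


x = np.linspace(0, 10, 200)
y_line = 2 * x + 1
fit, coef = pspline_1d(x, y_line, np.ones_like(x), lam=1e3)
```

With a second order difference penalty and uniform knots, a straight line has zero penalty, so the fit reproduces y_line regardless of lam.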
References
Eilers, P., et al. Fast and compact smoothing on large multidimensional grids. Computational Statistics and Data Analysis, 2006, 50(1), 61-76.
- Attributes:
- basis_r : scipy.sparse.csr.csr_matrix, shape (N, P)
The spline basis for the rows. Has a shape of (N, P), where N is the number of points in x, and P is the number of basis functions (equal to K - spline_degree[0] - 1 or equivalently num_knots[0] + spline_degree[0] - 1).
- basis_c : scipy.sparse.csr.csr_matrix, shape (M, Q)
The spline basis for the columns. Has a shape of (M, Q), where M is the number of points in z, and Q is the number of basis functions (equal to L - spline_degree[1] - 1 or equivalently num_knots[1] + spline_degree[1] - 1).
- coef : None or numpy.ndarray, shape (M,)
The spline coefficients. Is None if solve() has not been called at least once.
- knots_r : numpy.ndarray, shape (K,)
The knots for the spline along the rows. Has a shape of K, which is equal to num_knots[0] + 2 * spline_degree[0].
- knots_c : numpy.ndarray, shape (L,)
The knots for the spline along the columns. Has a shape of L, which is equal to num_knots[1] + 2 * spline_degree[1].
- num_knots : numpy.ndarray([int, int])
The number of internal knots (including the endpoints) for x and z. The total number of knots for the spline along each dimension is equal to num_knots + 2 * spline_degree.
- spline_degree : numpy.ndarray([int, int])
The degree of the spline (e.g., a cubic spline would have a spline_degree of 3) for x and z.
- x : numpy.ndarray, shape (N,)
The x-values for the spline.
- z : numpy.ndarray, shape (M,)
The z-values for the spline.
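The size relationships listed in the attributes can be checked with a few lines of arithmetic. The values of num_knots and spline_degree below are arbitrary, chosen only for illustration.

```python
import numpy as np

num_knots = np.array([25, 30])     # inner knots (endpoints included) for x and z
spline_degree = np.array([3, 2])   # cubic along x, quadratic along z

# Total knots along each dimension (K and L in the attribute descriptions).
total_knots = num_knots + 2 * spline_degree     # [31 34]

# Number of basis functions along each dimension (P and Q).
num_bases = total_knots - spline_degree - 1     # [27 31]

# The equivalent expression given in the attribute descriptions.
num_bases_alt = num_knots + spline_degree - 1   # [27 31]
```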
- property basis
The full spline basis matrix.
This is a lazy implementation since the full basis is typically not needed for computations.
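A minimal sketch of why the full basis is rarely needed. It assumes the full basis is the Kronecker product of the row and column bases, which is the usual construction for tensor-product splines but is not stated on this page. The random matrices are stand-ins for real spline bases.

```python
import numpy as np
from scipy.sparse import csr_matrix, kron

rng = np.random.default_rng(0)
basis_r = csr_matrix(rng.random((50, 6)))   # stand-in for the (N, P) row basis
basis_c = csr_matrix(rng.random((40, 5)))   # stand-in for the (M, Q) column basis
coef = rng.random((6, 5))                   # coefficients arranged as (P, Q)

# Full basis (assumed Kronecker structure): shape (N * M, P * Q).
full_basis = kron(basis_r, basis_c, format='csr')
fit_full = (full_basis @ coef.ravel()).reshape(50, 40)

# The same values computed from the two small bases only,
# i.e. basis_r @ coef @ basis_c.T, without forming the full basis.
fit_small = (basis_c @ (basis_r @ coef).T).T
```

The full basis has N * M rows and P * Q columns, so its size grows with the product of the grid dimensions, while the two separate bases grow only with their sum.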
- property tck
The knots, spline coefficients, and spline degree to reconstruct the spline.
Convenience function for easily reconstructing the last solved spline with outside modules, such as with SciPy's NdBSpline, to allow for other usages such as evaluating with different x- and z-values.
- Raises:
- ValueError
Raised if solve() has not been called yet, meaning that the spline has not yet been constructed.
Notes
To use with scipy.interpolate.NdBSpline, the setup would look like:

    import numpy as np
    from scipy.interpolate import NdBSpline

    pspline = PSpline2D(x, z, ...)
    pspline_fit = pspline.solve(...)
    XZ = np.array(np.meshgrid(x, z)).T  # same as zipping the meshgrid and rearranging
    fit = NdBSpline(*pspline.tck)(XZ)  # unpack (knots, coefficients, degree); fit == pspline_fit
- add_diagonal(value)
Adds a diagonal array to the original penalty matrix.
- Parameters:
- value : float or numpy.ndarray
The diagonal array to add to the penalty matrix.
- Returns:
- scipy.sparse.base.spmatrix
The penalty matrix with the main diagonal updated.
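The idea of adding a diagonal to a sparse penalty can be sketched as follows. The helper add_to_diagonal is hypothetical and is not the method's actual implementation; it only shows one way to do this with scipy.sparse.

```python
import numpy as np
from scipy.sparse import csr_matrix, diags


def add_to_diagonal(penalty, value):
    """Hypothetical helper: return penalty with value added to its main diagonal."""
    n = penalty.shape[0]
    # value may be a scalar or an array of length n.
    extra = diags(value * np.ones(n), format='csr')
    return penalty + extra


# Second order difference penalty for 8 coefficients.
D = np.diff(np.eye(8), 2, axis=0)
penalty = csr_matrix(D.T @ D)
updated = add_to_diagonal(penalty, 0.5)
```

Only the main diagonal changes; the off-diagonal entries of the penalty are left as they were.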
- add_penalty(penalty)
Updates self.penalty with an additional penalty and updates the bands.
- Parameters:
- penalty : array-like
The additional penalty to add to self.penalty.
- Returns:
- numpy.ndarray
The updated self.penalty.
- reset_diagonal()
Sets the main diagonal of the penalty matrix back to its original value.
- reset_diagonals(lam=1, diff_order=2)
Resets the diagonals of the system and all of the attributes.
Useful for reusing the penalized system for a different lam value.
- Parameters:
- lam : float or Sequence[float, float], optional
The penalty factor applied to the difference matrix for the rows and columns, respectively. If a single value is given, both will use the same value. Larger values produce smoother results. Must be greater than 0. Default is 1.
- diff_order : int or Sequence[int, int], optional
The difference order of the penalty for the rows and columns, respectively. If a single value is given, both will use the same value. Default is 2 (second order difference).
- reset_penalty(lam=1, diff_order=2)
Resets the penalty of the system and all of the attributes.
Useful for reusing the penalty diagonals without having to recalculate the spline basis.
- Parameters:
- lam : float or Sequence[float, float], optional
The penalty factor applied to the difference matrix. Larger values produce smoother results. Must be greater than 0. Default is 1.
- diff_order : int or Sequence[int, int], optional
The difference order of the penalty. Default is 2 (second order difference).
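For orientation, here is a sketch of how a two-dimensional difference penalty is commonly assembled from lam and diff_order. It assumes the standard tensor-product form, lam_r * kron(D_r.T @ D_r, I) + lam_c * kron(I, D_c.T @ D_c), which this page does not confirm is the form used internally. The helper names are hypothetical.

```python
import numpy as np
from scipy.sparse import csr_matrix, identity, kron


def difference_penalty(num_bases, diff_order):
    """D.T @ D for the finite difference matrix D of the given order."""
    D = np.diff(np.eye(num_bases), int(diff_order), axis=0)
    return csr_matrix(D.T @ D)


def penalty_2d(num_bases, lam=1, diff_order=2):
    """Hypothetical helper: row and column penalties combined for a (P, Q) grid."""
    # A single value is used for both dimensions.
    lam_r, lam_c = (float(v) for v in np.broadcast_to(lam, (2,)))
    ord_r, ord_c = (int(v) for v in np.broadcast_to(diff_order, (2,)))
    n_r, n_c = num_bases
    P_r = kron(difference_penalty(n_r, ord_r), identity(n_c))
    P_c = kron(identity(n_r), difference_penalty(n_c, ord_c))
    return (lam_r * P_r + lam_c * P_c).tocsr()


P = penalty_2d((6, 5), lam=(2, 3), diff_order=2)

# Coefficients lying on a plane have zero second differences in both
# directions, so they are not penalized.
i, j = np.meshgrid(np.arange(6), np.arange(5), indexing='ij')
plane_coef = (1 + 2 * i + 3 * j).astype(float)
```

Changing lam or diff_order only changes this penalty, which is why it can be rebuilt without recalculating the spline basis.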
- same_basis(num_knots=100, spline_degree=3)
Checks whether the current basis is equivalent to one with the input number of knots and spline degree.
- Parameters:
- num_knots : int or Sequence[int, int], optional
The number of knots for the new spline. Default is 100.
- spline_degree : int or Sequence[int, int], optional
The degree of the new spline. Default is 3.
- Returns:
- bool
True if the input number of knots and spline degree are equivalent to the current spline basis of the object.
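The comparison can be sketched as a standalone function. This is an assumption about the logic based on the description above, not the method's source; the function takes the current values as explicit arguments, whereas the method reads them from the object.

```python
import numpy as np


def same_basis(current_num_knots, current_degree, num_knots=100, spline_degree=3):
    """Hypothetical standalone version: compare inputs to the current settings."""
    # A single int applies to both dimensions.
    new_knots = np.broadcast_to(num_knots, (2,))
    new_degree = np.broadcast_to(spline_degree, (2,))
    return bool(
        np.array_equal(new_knots, current_num_knots)
        and np.array_equal(new_degree, current_degree)
    )


current_knots = np.array([100, 100])
current_degree = np.array([3, 3])
unchanged = same_basis(current_knots, current_degree)                     # True
changed = same_basis(current_knots, current_degree, num_knots=(100, 50))  # False
```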
- solve(y, weights, penalty=None, rhs_extra=None)
Solves the coefficients for a weighted penalized spline.
Solves the linear equation (B.T @ W @ B + P) c = B.T @ W @ y for the spline coefficients, c, given the spline basis, B, the weights (diagonal of W), the penalty P, and y, and returns the resulting spline, B @ c. Attempts to calculate B.T @ W @ B and B.T @ W @ y as a banded system to speed up the calculation.
- Parameters:
- y : numpy.ndarray, shape (M, N)
The y-values for fitting the spline.
- weights : numpy.ndarray, shape (M, N)
The weights for each y-value.
- penalty : numpy.ndarray, shape (M * N, M * N)
The finite difference penalty matrix, in LAPACK's lower banded format (see scipy.linalg.solveh_banded()) if lower_only is True or the full banded format (see scipy.linalg.solve_banded()) if lower_only is False.
- rhs_extra : float or numpy.ndarray, shape (M * N,), optional
If supplied, rhs_extra will be added to the right hand side (B.T @ W @ y) of the equation before solving. Default is None, which adds nothing.
- Returns:
- numpy.ndarray, shape (M, N)
The spline, corresponding to B @ c, where c are the solved spline coefficients and B is the spline basis.
Notes
Uses the more efficient algorithm from Eilers's paper. Its memory usage is higher than that of the straightforward method when the number of knots is high, but it is significantly faster and more memory efficient when the number of knots is lower, which is the more typical use case.
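For comparison, the straightforward method mentioned above can be sketched as follows. This is not the array algorithm from Eilers's paper and not the code used by solve(). It builds the full Kronecker basis explicitly and assumes uniform knots, the same settings along both dimensions, and data arranged with shape (len(x), len(z)). All function names are hypothetical.

```python
import numpy as np
from scipy.interpolate import BSpline
from scipy.sparse import csr_matrix, diags, identity, kron
from scipy.sparse.linalg import spsolve


def spline_basis(x, num_knots, degree):
    """Sparse B-spline basis on uniform knots, shape (len(x), num_knots + degree - 1)."""
    inner = np.linspace(x.min(), x.max(), num_knots)
    pad = (inner[1] - inner[0]) * np.arange(1, degree + 1)
    knots = np.concatenate((inner[0] - pad[::-1], inner, inner[-1] + pad))
    num_bases = len(knots) - degree - 1
    return csr_matrix(BSpline(knots, np.eye(num_bases), degree)(x))


def difference_penalty(num_bases, diff_order):
    """D.T @ D for the finite difference matrix D of the given order."""
    D = np.diff(np.eye(num_bases), diff_order, axis=0)
    return csr_matrix(D.T @ D)


def fit_pspline_2d(x, z, y, weights, num_knots=10, degree=3, lam=1.0, diff_order=2):
    """Solve (B.T @ W @ B + P) c = B.T @ W @ y with the full basis B."""
    basis_r = spline_basis(x, num_knots, degree)
    basis_c = spline_basis(z, num_knots, degree)
    n_r, n_c = basis_r.shape[1], basis_c.shape[1]

    B = kron(basis_r, basis_c, format='csr')   # shape (N * M, P * Q)
    W = diags(weights.ravel())
    P = lam * (
        kron(difference_penalty(n_r, diff_order), identity(n_c))
        + kron(identity(n_r), difference_penalty(n_c, diff_order))
    )

    lhs = (B.T @ W @ B + P).tocsc()
    rhs = B.T @ (weights.ravel() * y.ravel())
    coef = spsolve(lhs, rhs)
    return (B @ coef).reshape(y.shape)


x = np.linspace(0, 1, 40)
z = np.linspace(-1, 1, 30)
X, Z = np.meshgrid(x, z, indexing='ij')
plane = 1 + 2 * X + 3 * Z
fit = fit_pspline_2d(x, z, plane, np.ones_like(plane), lam=10.0)
```

A plane is unpenalized by second order differences, so the fit should match the input data here. On real data, lam controls the trade-off between smoothness and fidelity.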