pybaselines.two_d._spline_utils

Module Contents

Classes

PSpline2D

A Penalized Spline, which penalizes the difference of the spline coefficients.

class pybaselines.two_d._spline_utils.PSpline2D(x, z, num_knots=100, spline_degree=3, check_finite=False, lam=1, diff_order=2)

A Penalized Spline, which penalizes the difference of the spline coefficients.

Penalized splines (P-Splines) are solved with the following equation (B.T @ W @ B + P) c = B.T @ W @ y where c is the spline coefficients, B is the spline basis, the weights are the diagonal of W, the penalty is P, and y is the fit data. The penalty P is usually in the form lam * D.T @ D, where lam is a penalty factor and D is the matrix version of the finite difference operator.
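As a rough illustration of the equation above, the following one-dimensional sketch builds B, W, and P densely and solves for c. It is not the code used by PSpline2D: the helper name pspline_fit_1d, the knot padding scheme, and the dense solver are all assumptions made for this example, and it requires SciPy 1.8 or later for BSpline.design_matrix.

```python
import numpy as np
from scipy.interpolate import BSpline


def pspline_fit_1d(x, y, weights, num_knots=20, spline_degree=3, lam=1.0, diff_order=2):
    """Dense 1D P-spline fit: solve (B.T @ W @ B + P) c = B.T @ W @ y."""
    # Internal knots (including the endpoints), padded on each side with
    # spline_degree equally spaced knots; this padding is an assumption.
    inner = np.linspace(x.min(), x.max(), num_knots)
    pad = (inner[1] - inner[0]) * np.arange(1, spline_degree + 1)
    knots = np.concatenate((inner[0] - pad[::-1], inner, inner[-1] + pad))

    # B has shape (len(x), num_knots + spline_degree - 1).
    B = BSpline.design_matrix(x, knots, spline_degree).toarray()
    num_bases = B.shape[1]

    # D is the matrix version of the finite difference operator.
    D = np.diff(np.eye(num_bases), diff_order, axis=0)
    P = lam * D.T @ D

    BtW = B.T * weights  # equivalent to B.T @ np.diag(weights)
    coef = np.linalg.solve(BtW @ B + P, BtW @ y)
    return B @ coef, coef


x = np.linspace(0, 10, 200)
y = np.sin(x) + 0.1 * x
fit, coef = pspline_fit_1d(x, y, np.ones_like(x))
```

A dense solve keeps the sketch short; for real data, a banded or sparse solver would be the more natural choice because B.T @ W @ B + P is banded in one dimension.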

Notes

If the penalty is symmetric, the sparse system could be solved much faster using CHOLMOD from SuiteSparse (https://github.com/DrTimothyAldenDavis/SuiteSparse) through the Python bindings provided by scikit-sparse (https://github.com/scikit-sparse/scikit-sparse), but it is not worth implementing here since this code will rarely be used.

References

Eilers, P., et al. Fast and compact smoothing on large multidimensional grids. Computational Statistics and Data Analysis, 2006, 50(1), 61-76.

Attributes:
basis_r : scipy.sparse.csr.csr_matrix, shape (N, P)

The spline basis for the rows. Has a shape of (N, P), where N is the number of points in x, and P is the number of basis functions (equal to K - spline_degree - 1 or equivalently num_knots[0] + spline_degree[0] - 1).

basis_c : scipy.sparse.csr.csr_matrix, shape (M, Q)

The spline basis for the columns. Has a shape of (M, Q), where M is the number of points in z, and Q is the number of basis functions (equal to K - spline_degree - 1 or equivalently num_knots[1] + spline_degree[1] - 1).

coef : None or numpy.ndarray, shape (M,)

The spline coefficients. Is None if solve() has not been called at least once.

knots_r : numpy.ndarray, shape (K,)

The knots for the spline along the rows. Has a shape of K, which is equal to num_knots[0] + 2 * spline_degree[0].

knots_c : numpy.ndarray, shape (L,)

The knots for the spline along the columns. Has a shape of L, which is equal to num_knots[1] + 2 * spline_degree[1].

num_knots : numpy.ndarray([int, int])

The number of internal knots (including the endpoints) for x and z. The total number of knots for the spline, K, is equal to num_knots + 2 * spline_degree.

spline_degree : numpy.ndarray([int, int])

The degree of the spline (e.g., a cubic spline would have a spline_degree of 3) for x and z.

x : numpy.ndarray, shape (N,)

The x-values for the spline.

z : numpy.ndarray, shape (M,)

The z-values for the spline.
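The size relations stated in the attribute descriptions above can be checked with a few lines of arithmetic. The helper name spline_sizes is hypothetical and only restates the formulas from this page.

```python
def spline_sizes(num_knots, spline_degree):
    """Return (total_knots, num_bases) for one dimension of the spline."""
    # Total knots: internal knots plus spline_degree padding knots per side.
    total_knots = num_knots + 2 * spline_degree
    # Number of basis functions: K - spline_degree - 1.
    num_bases = total_knots - spline_degree - 1
    return total_knots, num_bases


K, P = spline_sizes(100, 3)  # rows
L, Q = spline_sizes(25, 2)   # columns
```

Both expressions for the number of basis functions agree, since (num_knots + 2 * spline_degree) - spline_degree - 1 simplifies to num_knots + spline_degree - 1.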

property basis

The full spline basis matrix.

This is a lazy implementation since the full basis is typically not needed for computations.
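One way to see why the full basis is rarely needed: if the full basis is taken to be the Kronecker product of the row and column bases (an assumption about ordering made for this sketch, not a statement about the library's internals), then evaluating the spline only requires the two small bases. Random matrices stand in for basis_r and basis_c here.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 6, 4  # points and basis functions along the rows
M, Q = 5, 3  # points and basis functions along the columns
basis_r = rng.random((N, P))  # stand-in for the row basis
basis_c = rng.random((M, Q))  # stand-in for the column basis
coef = rng.random((P, Q))     # coefficients arranged as a (P, Q) grid

# Full basis: shape (N * M, P * Q), which grows quickly with the data size.
full_basis = np.kron(basis_r, basis_c)
fit_full = (full_basis @ coef.ravel()).reshape(N, M)

# Separable evaluation: never forms the full basis.
fit_separable = basis_r @ coef @ basis_c.T
```

The two results match because, with row-major flattening, kron(A, B) @ vec(C) equals vec(A @ C @ B.T).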

property tck

The knots, spline coefficients, and spline degree to reconstruct the spline.

Convenience function for easily reconstructing the last solved spline with outside modules, such as SciPy's NdBSpline, to allow for other usages such as evaluating with different x- and z-values.

Raises:
ValueError

Raised if solve() has not been called yet, meaning that the spline has not yet been constructed.

Notes

To use with scipy.interpolate.NdBSpline, the setup would look like:

    import numpy as np
    from scipy.interpolate import NdBSpline

    pspline = PSpline2D(x, z, ...)
    pspline_fit = pspline.solve(...)
    XZ = np.array(np.meshgrid(x, z)).T  # same as zipping the meshgrid and rearranging
    fit = NdBSpline(*pspline.tck)(XZ)  # fit == pspline_fit

add_diagonal(value)

Adds a diagonal array to the original penalty matrix.

Parameters:
value : float or numpy.ndarray

The diagonal array to add to the penalty matrix.

Returns:
scipy.sparse.base.spmatrix

The penalty matrix with the main diagonal updated.

add_penalty(penalty)

Updates self.penalty with an additional penalty and updates the bands.

Parameters:
penalty : array-like

The additional penalty to add to self.penalty.

Returns:
numpy.ndarray

The updated self.penalty.

reset_diagonal()

Sets the main diagonal of the penalty matrix back to its original value.
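The idea behind add_diagonal and reset_diagonal can be sketched with plain scipy.sparse operations. This is only an illustration under assumed names (penalty, original_diagonal, extra); how the class stores and updates its penalty internally is not shown on this page.

```python
import numpy as np
from scipy import sparse

n = 5
# Second-order difference operator and a stand-in penalty D.T @ D.
D = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n))
penalty = (D.T @ D).tocsr()
original_diagonal = penalty.diagonal().copy()

# Analogue of add_diagonal: add extra values to the main diagonal only.
extra = np.full(n, 0.5)
updated = penalty + sparse.diags(extra)

# Analogue of reset_diagonal: put the original main diagonal back.
restored = updated - sparse.diags(updated.diagonal() - original_diagonal)
```

Keeping a copy of the original main diagonal makes the reset cheap, since the off-diagonal entries never change.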

reset_diagonals(lam=1, diff_order=2)

Resets the diagonals of the system and all of the attributes.

Useful for reusing the penalized system for a different lam value.

Parameters:
lam : float or Sequence[float, float], optional

The penalty factor applied to the difference matrix for the rows and columns, respectively. If a single value is given, both will use the same value. Larger values produce smoother results. Must be greater than 0. Default is 1.

diff_order : int or Sequence[int, int], optional

The difference order of the penalty for the rows and columns, respectively. If a single value is given, both will use the same value. Default is 2 (second order difference).

reset_penalty(lam=1, diff_order=2)

Resets the penalty of the system and all of the attributes.

Useful for reusing the penalty diagonals without having to recalculate the spline basis.

Parameters:
lam : float or Sequence[float, float], optional

The penalty factor applied to the difference matrix. Larger values produce smoother results. Must be greater than 0. Default is 1.

diff_order : int or Sequence[int, int], optional

The difference order of the penalty. Default is 2 (second order difference).
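For a two-dimensional P-spline, a common way to combine the row and column penalties is a sum of Kronecker products, following the general approach in the Eilers reference. The sketch below assumes that form and a row-major ordering of the coefficients; the function names difference_penalty and penalty_2d are hypothetical and the dense arrays are for illustration only.

```python
import numpy as np


def difference_penalty(num_bases, diff_order):
    """Return D.T @ D for the finite difference operator D of a given order."""
    D = np.diff(np.eye(num_bases), diff_order, axis=0)
    return D.T @ D


def penalty_2d(num_bases, lam=1, diff_order=2):
    """Dense 2D penalty; lam and diff_order may be scalars or pairs."""
    # A single value is reused for both the rows and the columns.
    lam_r, lam_c = np.broadcast_to(lam, (2,))
    order_r, order_c = np.broadcast_to(diff_order, (2,))
    P, Q = num_bases
    penalty_rows = np.kron(difference_penalty(P, int(order_r)), np.eye(Q))
    penalty_cols = np.kron(np.eye(P), difference_penalty(Q, int(order_c)))
    return lam_r * penalty_rows + lam_c * penalty_cols


penalty = penalty_2d((6, 5), lam=(10, 0.1), diff_order=(2, 1))
```

Because only lam and diff_order enter this construction, the penalty can be rebuilt for new values without recomputing the spline basis.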

same_basis(num_knots=100, spline_degree=3)

Checks whether the current basis is equivalent to the input number of knots and spline degree.

Parameters:
num_knots : int or Sequence[int, int], optional

The number of knots for the new spline. Default is 100.

spline_degree : int or Sequence[int, int], optional

The degree of the new spline. Default is 3.

Returns:
bool

True if the input number of knots and spline degree are equivalent to the current spline basis of the object.
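A comparison of this kind might look like the following sketch. The standalone function is_same_basis and its helper as_pair are hypothetical; they only illustrate expanding a single value to both dimensions before comparing.

```python
from numbers import Integral


def as_pair(value):
    """Expand a single integer to both dimensions, or validate a pair."""
    if isinstance(value, Integral):
        return (int(value), int(value))
    pair = tuple(int(v) for v in value)
    if len(pair) != 2:
        raise ValueError('expected an int or a sequence of two ints')
    return pair


def is_same_basis(current_num_knots, current_degree, num_knots=100, spline_degree=3):
    """Return True if the requested basis matches the current one."""
    return (
        as_pair(current_num_knots) == as_pair(num_knots)
        and as_pair(current_degree) == as_pair(spline_degree)
    )


unchanged = is_same_basis((100, 100), (3, 3))
changed = is_same_basis((100, 50), 3, num_knots=100)
```

A check like this lets a caller skip rebuilding the basis and reset only the penalty when the knots and degree are unchanged.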

solve(y, weights, penalty=None, rhs_extra=None)

Solves the coefficients for a weighted penalized spline.

Solves the linear equation (B.T @ W @ B + P) c = B.T @ W @ y for the spline coefficients, c, given the spline basis, B, the weights (diagonal of W), the penalty P, and y, and returns the resulting spline, B @ c. Attempts to calculate B.T @ W @ B and B.T @ W @ y as a banded system to speed up the calculation.

Parameters:
y : numpy.ndarray, shape (M, N)

The y-values for fitting the spline.

weights : numpy.ndarray, shape (M, N)

The weights for each y-value.

penalty : numpy.ndarray, shape (M * N, M * N)

The finite difference penalty matrix, in LAPACK's lower banded format (see scipy.linalg.solveh_banded()) if lower_only is True or the full banded format (see scipy.linalg.solve_banded()) if lower_only is False.

rhs_extra : float or numpy.ndarray, shape (M * N,), optional

If supplied, rhs_extra will be added to the right hand side (B.T @ W @ y) of the equation before solving. Default is None, which adds nothing.

Returns:
numpy.ndarray, shape (M, N)

The spline, corresponding to B @ c, where c are the solved spline coefficients and B is the spline basis.

Notes

Uses the more efficient algorithm from Eilers's paper. Its memory usage is higher than that of the straightforward method when the number of knots is high, but it is significantly faster and more memory efficient when the number of knots is lower, which is the more typical use case.
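For comparison, a straightforward dense solve of (B.T @ W @ B + P) c = B.T @ W @ y in two dimensions might look like the sketch below. It is not the algorithm used by solve(): the function names, the knot padding, the Kronecker-product ordering, and the convention that y has shape (len(x), len(z)) are all assumptions made for this example, and it requires SciPy 1.8 or later.

```python
import numpy as np
from scipy.interpolate import BSpline


def bspline_basis(x, num_knots, degree):
    """Dense B-spline basis with equally spaced, padded knots (assumed scheme)."""
    inner = np.linspace(x.min(), x.max(), num_knots)
    pad = (inner[1] - inner[0]) * np.arange(1, degree + 1)
    knots = np.concatenate((inner[0] - pad[::-1], inner, inner[-1] + pad))
    return BSpline.design_matrix(x, knots, degree).toarray()


def diff_penalty(num_bases, order):
    D = np.diff(np.eye(num_bases), order, axis=0)
    return D.T @ D


def solve_pspline_2d_dense(x, z, y, weights, num_knots=10, degree=3, lam=1.0, diff_order=2):
    """Straightforward 2D P-spline solve using the full Kronecker basis."""
    B_r = bspline_basis(x, num_knots, degree)  # shape (len(x), P)
    B_c = bspline_basis(z, num_knots, degree)  # shape (len(z), Q)
    P, Q = B_r.shape[1], B_c.shape[1]
    B = np.kron(B_r, B_c)                      # shape (len(x) * len(z), P * Q)
    penalty = lam * (
        np.kron(diff_penalty(P, diff_order), np.eye(Q))
        + np.kron(np.eye(P), diff_penalty(Q, diff_order))
    )
    w = weights.ravel()
    lhs = (B.T * w) @ B + penalty              # B.T @ W @ B + P
    rhs = B.T @ (w * y.ravel())                # B.T @ W @ y
    coef = np.linalg.solve(lhs, rhs)
    return (B @ coef).reshape(y.shape), coef.reshape(P, Q)


x = np.linspace(-1, 1, 30)
z = np.linspace(0, 2, 25)
y = np.add.outer(x**2, np.sin(z))
fit, coef = solve_pspline_2d_dense(x, z, y, np.ones_like(y), lam=1e-3)
```

The full basis B here has len(x) * len(z) rows and P * Q columns, so its size grows with the product of the two dimensions; avoiding that growth is the motivation for array-based algorithms such as the one in the Eilers reference.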