pybaselines.utils.pspline_smooth
- pybaselines.utils.pspline_smooth(data, x_data=None, lam=10.0, num_knots=100, spline_degree=3, diff_order=2, weights=None, check_finite=True)
Smooths the input data using Penalized Spline smoothing.
The input is smoothed by solving the linear system
(B.T @ W @ B + lam * D.T @ D) @ c = B.T @ W @ y for the spline coefficients c and then evaluating y_smooth = B @ c, where W is a diagonal matrix with weights on the diagonal, D is the finite difference matrix, and B is the spline basis matrix.
- Parameters:
- data : array_like, shape (N,)
The y-values of the measured data, with N data points.
- x_data : array_like, shape (N,), optional
The x-values of the measured data. Default is None, which will create an array from -1 to 1 with N points.
- lam : float, optional
The smoothing parameter. Larger values give a smoother result. Default is 10.0.
- num_knots : int, optional
The number of knots for the spline. Default is 100.
- spline_degree : int, optional
The degree of the spline. Default is 3, which is a cubic spline.
- diff_order : int, optional
The order of the finite difference matrix. Must be greater than or equal to 0. Default is 2 (second order differential matrix). Typical values are 2 or 1.
- weights : array_like, shape (N,), optional
The weighting array, whose values form the diagonal of W. It is used to override the function's baseline identification to designate peak points. Only elements with 0 or False values will have an effect; all non-zero values are considered baseline points. If None (default), an array of size N with all values set to 1 is used.
- check_finite : bool, optional
If True, will raise an error if any values in data or weights are not finite. Default is True. Set to False to skip the check.
- Returns:
- y_smooth : numpy.ndarray, shape (N,)
The smoothed data.
- tuple(numpy.ndarray, numpy.ndarray, int)
A tuple of the spline knots, spline coefficients, and spline degree, which can be used to reconstruct the fit spline, for example to evaluate it at different x-values.
References
Eilers, P., et al. Splines, knots, and penalties. Wiley Interdisciplinary Reviews: Computational Statistics, 2010, 2(6), 637-653.
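Examples
The penalized least-squares system described above can be sketched with NumPy and SciPy alone. This is an illustration under stated assumptions, not the library's implementation: pspline_sketch is a hypothetical helper, the knots are assumed to be uniformly spaced over the data range with spline_degree extra knots on each side, and dense matrices are used for readability where a real implementation would more likely use sparse or banded storage.

```python
import numpy as np
from scipy.interpolate import BSpline


def pspline_sketch(y, x, lam=10.0, num_knots=100, spline_degree=3,
                   diff_order=2, weights=None):
    """Hypothetical helper: solve (B.T W B + lam D.T D) c = B.T W y."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    w = np.ones_like(y) if weights is None else np.asarray(weights, dtype=float)

    # Assumed knot layout: uniform knots over the data range, padded with
    # `spline_degree` equally spaced knots on each side
    inner = np.linspace(x.min(), x.max(), num_knots)
    step = inner[1] - inner[0]
    pad = step * np.arange(1, spline_degree + 1)
    knots = np.concatenate((inner[0] - pad[::-1], inner, inner[-1] + pad))

    # Basis matrix B, shape (N, n_basis): column j is the j-th B-spline at x
    n_basis = len(knots) - spline_degree - 1
    B = BSpline(knots, np.eye(n_basis), spline_degree)(x)

    # Finite difference matrix D acting on the spline coefficients
    D = np.diff(np.eye(n_basis), n=diff_order, axis=0)

    # Penalized normal equations, with W = diag(w)
    lhs = B.T @ (w[:, None] * B) + lam * (D.T @ D)
    rhs = B.T @ (w * y)
    coef = np.linalg.solve(lhs, rhs)

    return B @ coef, (knots, coef, spline_degree)


# Illustrative data: a noisy sine wave
x = np.linspace(0, 2 * np.pi, 200)
rng = np.random.default_rng(0)
noisy = np.sin(x) + rng.normal(0, 0.2, x.size)
smoothed, sketch_tck = pspline_sketch(noisy, x, lam=10.0, num_knots=20)
```

With diff_order=2 the penalty is zero for any straight line, so increasing lam pulls the result toward a linear fit rather than toward zero.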