pybaselines.Baseline.mpspline
- Baseline.mpspline(data, half_window=None, lam=10000.0, lam_smooth=0.01, p=0.0, num_knots=100, spline_degree=3, diff_order=2, weights=None, pad_kwargs=None, window_kwargs=None, **kwargs)
Morphology-based penalized spline baseline.
Identifies baseline points using morphological operations, and then uses weighted least-squares to fit a penalized spline to the baseline.
- Parameters:
- data
array_like, shape (N,) The y-values of the measured data, with N data points.
- half_window
int, optional The half-window used for the morphology functions. If a value is input, then that value will be used. Default is None, which will optimize the half-window size using optimize_window() and window_kwargs.
- lam
float, optional The smoothing parameter for the penalized spline when fitting the baseline. Larger values will create smoother baselines. Default is 1e4. Larger values are needed for larger num_knots.
- lam_smooth
float, optional The smoothing parameter for the penalized spline when smoothing the input data. Default is 1e-2. Larger values are needed for noisy data or for larger num_knots.
- p
float, optional The penalizing weighting factor. Must be between 0 and 1. Anchor points identified by the procedure in the reference are given a weight of 1 - p, and all other points have a weight of p. Default is 0.0.
- num_knots
int, optional The number of knots for the spline. Default is 100.
- spline_degree
int, optional The degree of the spline. Default is 3, which is a cubic spline.
- diff_order
int, optional The order of the differential matrix. Must be greater than 0. Default is 2 (second order differential matrix). Typical values are 2 or 3.
- weights
array_like, shape (N,), optional The weighting array. If None (default), then the weights will be calculated following the procedure in the reference.
- window_kwargs
dict, optional A dictionary of keyword arguments to pass to optimize_window() for estimating the half window if half_window is None. Default is None.
- **kwargs
Deprecated since version 1.2.0: Passing additional keyword arguments is deprecated and will be removed in version 1.4.0. Pass keyword arguments using window_kwargs.
- Returns:
- baseline
numpy.ndarray, shape (N,) The calculated baseline.
- params
dict A dictionary with the following items:
- 'weights': numpy.ndarray, shape (N,)
The weight array used for fitting the data.
- 'half_window': int
The half window used for the morphological calculations.
- Raises:
- ValueError
Raised if half_window is < 1, if lam or lam_smooth is <= 0, or if p is not between 0 and 1.
Notes
The optimal opening is calculated as the element-wise minimum of the opening and the average of the erosion and dilation of the opening. The reference instead used the erosion and dilation of the smoothed data rather than of the opening, an approach that tends to overestimate the baseline.
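The optimal opening described above can be sketched with NumPy alone. This is an illustrative sketch, not the library's implementation: the helper names erode, dilate, and optimal_opening are hypothetical, a flat structuring element of width 2 * half_window + 1 with edge padding is assumed, and the input is assumed to be already smoothed.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def erode(y, half_window):
    """Sliding-window minimum (hypothetical helper, edge-padded)."""
    padded = np.pad(y, half_window, mode="edge")
    return sliding_window_view(padded, 2 * half_window + 1).min(axis=1)


def dilate(y, half_window):
    """Sliding-window maximum (hypothetical helper, edge-padded)."""
    padded = np.pad(y, half_window, mode="edge")
    return sliding_window_view(padded, 2 * half_window + 1).max(axis=1)


def optimal_opening(y, half_window):
    """Element-wise minimum of the opening and the average of its erosion and dilation."""
    opening = dilate(erode(y, half_window), half_window)
    average = 0.5 * (erode(opening, half_window) + dilate(opening, half_window))
    return np.minimum(opening, average)


# Assumed test signal: a constant baseline of 1 with one narrow Gaussian peak.
x = np.linspace(0.0, 100.0, 201)
y = 1.0 + 5.0 * np.exp(-0.5 * (x - 50.0) ** 2)
rough_baseline = optimal_opening(y, half_window=10)
```

Because the window (21 points) is wider than the peak, the opening removes the peak and the result stays close to the constant baseline while never exceeding the input.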
Rather than setting knots at the intersection points of the optimal opening and the smoothed data as described in the reference, weights are set to 1 - p at the intersection points and to p elsewhere. This simplifies the penalized spline calculation by allowing the use of equally spaced knots, but should otherwise give results similar to those of the reference algorithm.
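The weighting scheme can be illustrated with a short sketch. The function name anchor_weights is hypothetical, and treating an intersection as any point where the smoothed data equals the optimal opening within a small tolerance is an assumption made for this example; the library's actual intersection test may differ.

```python
import numpy as np


def anchor_weights(smoothed, opening, p=0.0, atol=1e-12):
    """Assign 1 - p where the two curves coincide and p elsewhere (illustrative only)."""
    # Assumption: "intersection" means equality within an absolute tolerance.
    is_anchor = np.isclose(smoothed, opening, rtol=0.0, atol=atol)
    return np.where(is_anchor, 1.0 - p, p)


# Assumed toy inputs: a flat optimal opening under data containing one peak.
smoothed = np.array([1.0, 1.0, 3.0, 5.0, 3.0, 1.0, 1.0])
opening = np.ones(7)
weights = anchor_weights(smoothed, opening, p=0.1)
# weights is [0.9, 0.9, 0.1, 0.1, 0.1, 0.9, 0.9]
```

With the default p of 0.0, points under peaks receive zero weight, so only the anchor points influence the spline fit.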
References
Gonzalez-Vidal, J., et al. Automatic morphology-based cubic p-spline fitting methodology for smoothing and baseline-removal of Raman spectra. Journal of Raman Spectroscopy. 2017, 48(6), 878-883.