pybaselines.Baseline.mpspline

Baseline.mpspline(data, half_window=None, lam=10000.0, lam_smooth=0.01, p=0.0, num_knots=100, spline_degree=3, diff_order=2, weights=None, pad_kwargs=None, window_kwargs=None, **kwargs)

Morphology-based penalized spline baseline.

Identifies baseline points using morphological operations, and then uses weighted least-squares to fit a penalized spline to the baseline.

Parameters:
data : array_like, shape (N,)

The y-values of the measured data, with N data points.

half_window : int, optional

The half-window used for the morphology functions. If a value is input, then that value will be used. Default is None, which will optimize the half-window size using optimize_window() and window_kwargs.

lam : float, optional

The smoothing parameter for the penalized spline when fitting the baseline. Larger values will create smoother baselines. Default is 1e4. Larger values are needed for larger num_knots.

lam_smooth : float, optional

The smoothing parameter for the penalized spline when smoothing the input data. Default is 1e-2. Larger values are needed for noisy data or for larger num_knots.

p : float, optional

The penalizing weighting factor. Must be between 0 and 1. Anchor points identified by the procedure in the reference are given a weight of 1 - p, and all other points have a weight of p. Default is 0.0.

num_knots : int, optional

The number of knots for the spline. Default is 100.

spline_degree : int, optional

The degree of the spline. Default is 3, which is a cubic spline.

diff_order : int, optional

The order of the differential matrix. Must be greater than 0. Default is 2 (second order differential matrix). Typical values are 2 or 3.

weights : array_like, shape (N,), optional

The weighting array. If None (default), then the weights will be calculated following the procedure in the reference.

window_kwargs : dict, optional

A dictionary of keyword arguments to pass to optimize_window() for estimating the half window if half_window is None. Default is None.

**kwargs

Deprecated since version 1.2.0: Passing additional keyword arguments is deprecated and will be removed in version 1.4.0. Pass keyword arguments using window_kwargs.

Returns:
baseline : numpy.ndarray, shape (N,)

The calculated baseline.

params : dict

A dictionary with the following items:

  • 'weights': numpy.ndarray, shape (N,)

    The weight array used for fitting the data.

  • 'half_window': int

    The half window used for the morphological calculations.

Raises:
ValueError

Raised if half_window is < 1, if lam or lam_smooth is <= 0, or if p is not between 0 and 1.

Notes

The optimal opening is calculated as the element-wise minimum of the opening and the average of the erosion and dilation of the opening. The reference instead averages the erosion and dilation of the smoothed data rather than of the opening, an approach that tends to overestimate the baseline.
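
The calculation described in the paragraph above can be sketched with the grey-scale morphology functions in scipy.ndimage. This is an illustrative sketch only, not the library's internal code: the function name optimal_opening, the synthetic input, and the default boundary handling are assumptions made for the example.

```python
import numpy as np
from scipy.ndimage import grey_dilation, grey_erosion, grey_opening


def optimal_opening(smoothed, half_window):
    """Element-wise minimum of the opening and the averaged erosion/dilation of the opening.

    Hypothetical helper for illustration; not part of the pybaselines API.
    """
    smoothed = np.asarray(smoothed, dtype=float)
    size = (2 * half_window + 1,)  # full window width for 1D data
    opening = grey_opening(smoothed, size=size)
    # Average the erosion and dilation of the opening (not of the smoothed data).
    averaged = 0.5 * (grey_erosion(opening, size=size) + grey_dilation(opening, size=size))
    return np.minimum(opening, averaged)


# Synthetic, already-smooth data: one Gaussian peak on a sloped baseline.
x = np.linspace(0, 100, 501)
smoothed = 0.02 * x + 3 * np.exp(-0.5 * ((x - 50) / 2) ** 2)
estimate = optimal_opening(smoothed, half_window=40)
```

Because the result is bounded above by the opening, it never exceeds the input data, which is the property that makes it usable as a baseline estimate.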

Rather than placing knots at the intersection points of the optimal opening and the smoothed data as described in the reference, this implementation sets the weights to 1 - p at the intersection points and to p elsewhere. This simplifies the penalized spline calculation by allowing equally spaced knots, and should otherwise give results similar to those of the reference algorithm.
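
The weighting scheme in the paragraph above can be illustrated as follows. This is a minimal sketch under stated assumptions: the helper name anchor_weights is hypothetical, and treating "intersection" as exact equality between the two arrays is an assumption of the example rather than a statement about the library's implementation.

```python
import numpy as np


def anchor_weights(smoothed, optimal_opening, p=0.0):
    """Weight of 1 - p where the two curves coincide, p elsewhere.

    Hypothetical helper for illustration; not part of the pybaselines API.
    """
    smoothed = np.asarray(smoothed, dtype=float)
    optimal_opening = np.asarray(optimal_opening, dtype=float)
    # Assumption: an intersection point is where the values are exactly equal.
    is_anchor = smoothed == optimal_opening
    return np.where(is_anchor, 1 - p, p)


smoothed = np.array([1.0, 2.0, 3.0, 2.0, 1.0])
opening = np.array([1.0, 1.0, 1.0, 1.0, 1.0])
weights = anchor_weights(smoothed, opening, p=0.1)
```

With the default p of 0.0, anchor points receive a weight of 1 and all other points a weight of 0, so only the anchor points influence the weighted least-squares spline fit.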

References

Gonzalez-Vidal, J., et al. Automatic morphology-based cubic p-spline fitting methodology for smoothing and baseline-removal of Raman spectra. Journal of Raman Spectroscopy. 2017, 48(6), 878-883.