pybaselines.Baseline.fabc

Baseline.fabc(data, lam=1000000.0, scale=None, num_std=3.0, diff_order=2, min_length=2, weights=None, weights_as_mask=False, pad_kwargs=None, **kwargs)[source]

Fully automatic baseline correction (fabc).

Similar to Dietrich's method, except that the derivative is estimated using a continuous wavelet transform and the baseline is calculated using Whittaker smoothing through the identified baseline points.

Parameters:
dataarray_like, shape (N,)

The y-values of the measured data, with N data points.

lamfloat, optional

The smoothing parameter. Larger values will create smoother baselines. Default is 1e6.

scaleint, optional

The scale at which to calculate the continuous wavelet transform. Should be approximately equal to the index-based full-width-at-half-maximum of the peaks or features in the data. Default is None, which will use half of the value from optimize_window(), which is not always a good value, but at least scales with the number of data points and gives a starting point for tuning the parameter.

num_stdfloat, optional

The number of standard deviations to include when thresholding. Higher values will assign more points as baseline. Default is 3.0.

diff_orderint, optional

The order of the differential matrix. Must be greater than 0. Default is 2 (second order differential matrix). Typical values are 2 or 1.

min_lengthint, optional

Any region of consecutive baseline points less than min_length is considered to be a false positive and all points in the region are converted to peak points. A higher min_length ensures less points are falsely assigned as baseline points. Default is 2, which only removes lone baseline points.

weightsarray_like, shape (N,), optional

The weighting array, used to override the function's baseline identification to designate peak points. Only elements with 0 or False values will have an effect; all non-zero values are considered baseline points. If None (default), then will be an array with size equal to N and all values set to 1.

weights_as_maskbool, optional

If True, signifies that the input weights is the mask to use for fitting, which skips the continuous wavelet calculation and just smooths the input data. Default is False.

pad_kwargsdict, optional

A dictionary of keyword arguments to pass to pad_edges() for padding the edges of the data to prevent edge effects from convolution for the continuous wavelet transform. Default is None.

**kwargs

Deprecated since version 1.2.0: Passing additional keyword arguments is deprecated and will be removed in version 1.4.0. Pass keyword arguments using pad_kwargs.

Returns:
baselinenumpy.ndarray, shape (N,)

The calculated baseline.

paramsdict

A dictionary with the following items:

  • 'mask': numpy.ndarray, shape (N,)

    The boolean array designating baseline points as True and peak points as False.

  • 'weights': numpy.ndarray, shape (N,)

    The weight array used for fitting the data.

Notes

The classification of baseline points is similar to dietrich(), except that this method approximates the first derivative using a continous wavelet transform with the Haar wavelet, which is more robust than the numerical derivative in Dietrich's method.

References

Cobas, J., et al. A new general-purpose fully automatic baseline-correction procedure for 1D and 2D NMR data. Journal of Magnetic Resonance, 2006, 183(1), 145-151.