pybaselines.Baseline.peak_filling

Baseline.peak_filling(data, half_window=None, sections=None, max_iter=5, lam_smooth=None)

The 4S (Smooth, Subsample, Suppress, Stretch) Peak Filling algorithm.

Smooths the input and truncates it to one representative value per section. Each value is then replaced in-place by the minimum of the value and the average of its moving window, with the half-window size decreasing exponentially from the input half_window to 1. The result is then interpolated back to the original data size.
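The suppression step described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the function name `suppress_pass` is hypothetical, and shrinking the window at the array edges is an assumption about boundary handling.

```python
def suppress_pass(values, half_window):
    """Replace each value in place by min(value, mean of its moving window).

    Assumption: near the edges the window is simply shortened to the
    available points; the library may treat boundaries differently.
    """
    n = len(values)
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        window_mean = sum(values[lo:hi]) / (hi - lo)
        # In-place update: later windows already see the suppressed values.
        values[i] = min(values[i], window_mean)
    return values


signal = [1.0, 1.0, 9.0, 1.0, 1.0]
suppress_pass(signal, half_window=1)
print(signal)  # the spike at index 2 drops from 9.0 to about 3.67
```

Because a value can only be lowered, never raised, repeated passes fill in peaks from above while leaving points that already lie on the baseline untouched.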

Parameters:
data : array_like, shape (N,)

The y-values of the measured data, with N data points.

half_window : int, optional

The index-based size to use for the moving average window. The window spans the indices [-half_window, ..., half_window] around each point, for a total size of 2 * half_window + 1. Default is None, which will use two or three times the output from optimize_window(), which is an okay starting value.

sections : int or sequence[int, ...], optional

If the input is an integer, it sets the number of equally sized segments the data will be split into. If the input is a sequence, each integer in the sequence will be the index that splits two segments, which allows constructing unequally sized segments. The minimum of each section will be used to represent the input data for determining the baseline. Higher sections values are needed for baselines with higher curvature. Default is None, which will use N // 10.
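The two forms of sections can be illustrated with a short plain-Python sketch. The helper name `section_minima` is hypothetical, it performs no input validation, and the way an integer count is turned into segment boundaries is an assumption made for illustration.

```python
def section_minima(data, sections):
    """Return the minimum of each segment of data.

    sections as an int: number of (nearly) equally sized segments.
    sections as a sequence: indices at which one segment ends and the
    next begins.
    """
    n = len(data)
    if isinstance(sections, int):
        # Assumed boundary rule: integer division spreads any remainder.
        edges = [k * n // sections for k in range(sections + 1)]
    else:
        edges = [0, *sorted(sections), n]
    # Skip empty segments (for example, a split index of 0).
    return [min(data[a:b]) for a, b in zip(edges[:-1], edges[1:]) if b > a]


data = [5, 3, 4, 2, 6, 1, 7, 8]
print(section_minima(data, 4))       # [3, 2, 1, 7]
print(section_minima(data, [2, 5]))  # [3, 2, 1]
```

More sections keep more points, which lets the truncated signal follow a strongly curved baseline at the cost of retaining more of each peak.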

max_iter : int, optional

The number of smoothing iterations to perform. With each iteration, the moving average window shrinks exponentially, starting at a total size of 2 * half_window + 1 and ending at 3. Default is 5.
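One way to picture the shrinking window is a geometrically spaced sequence of half-windows running from half_window down to 1. The sketch below is an assumption about the spacing and rounding, and `half_window_schedule` is a hypothetical name; the library may round differently.

```python
def half_window_schedule(half_window, max_iter):
    """Geometrically spaced half-windows from half_window down to 1."""
    if max_iter == 1:
        return [half_window]
    schedule = []
    for i in range(max_iter):
        # The exponent runs from 1 down to 0 in equal steps.
        exponent = 1 - i / (max_iter - 1)
        schedule.append(max(1, round(half_window ** exponent)))
    return schedule


half_windows = half_window_schedule(16, 5)
print(half_windows)                       # [16, 8, 4, 2, 1]
print([2 * hw + 1 for hw in half_windows])  # [33, 17, 9, 5, 3]
```

The second line shows the total window sizes, which start at 2 * half_window + 1 and end at 3, matching the description above.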

lam_smooth : float or None, optional

The parameter for smoothing the input using Whittaker smoothing. Set to 0 or None (default) to skip smoothing.

Returns:
baseline : numpy.ndarray, shape (N,)

The calculated baseline.

params : dict

A dictionary with the following items:

  • 'x_fit': numpy.ndarray, shape (P,)

    The truncated x-values used for fitting and interpolating the baseline.

  • 'baseline_fit': numpy.ndarray, shape (P,)

    The truncated baseline values used to interpolate the final baseline.
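The relationship between the P truncated points and the N-point baseline can be sketched as an interpolation. The example assumes piecewise-linear interpolation with the end values held constant outside the fitted range; the interpolation scheme actually used is not stated here, and `interpolate_baseline` is a hypothetical helper.

```python
from bisect import bisect_right


def interpolate_baseline(x, x_fit, baseline_fit):
    """Linearly interpolate (x_fit, baseline_fit) onto x.

    x_fit must be strictly increasing. Points outside the fitted range
    take the nearest end value (an assumption).
    """
    result = []
    for xi in x:
        if xi <= x_fit[0]:
            result.append(baseline_fit[0])
        elif xi >= x_fit[-1]:
            result.append(baseline_fit[-1])
        else:
            k = bisect_right(x_fit, xi) - 1  # x_fit[k] <= xi < x_fit[k + 1]
            t = (xi - x_fit[k]) / (x_fit[k + 1] - x_fit[k])
            result.append(baseline_fit[k] + t * (baseline_fit[k + 1] - baseline_fit[k]))
    return result


x = [0, 1, 2, 3, 4, 5, 6]
x_fit = [1, 3, 5]
baseline_fit = [2, 4, 3]
print(interpolate_baseline(x, x_fit, baseline_fit))
# [2, 2, 3.0, 4.0, 3.5, 3, 3]
```

Inspecting 'x_fit' and 'baseline_fit' alongside the returned baseline is a quick way to judge whether sections retained enough points to follow the curvature of the data.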

Raises:
ValueError

Raised if sections is an integer not between 1 and N, or if sections is a sequence with any value not between 0 and N - 1.

Notes

The input parameter sections will determine the necessary half_window and max_iter values required to correctly fit the baseline. Likewise, max_iter is highly correlated with half_window.

References

Liland, K. 4S Peak Filling - baseline estimation by iterative mean suppression. MethodsX. 2015, 2, 135-140.