Smoothing Baselines

The pybaselines.smooth module contains algorithms that use smoothing to remove peaks, leaving only the baseline.


The window size used by smoothing-based algorithms is given as a number of data points (indices), not in the units of the data, so the user must convert a width in data units to the corresponding number of points to get the desired window size.
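The conversion from data units to an index-based window can be sketched as follows. This is a minimal illustration that assumes sorted, evenly spaced x-values; the helper name half_window_from_width is hypothetical and is not part of pybaselines.

```python
def half_window_from_width(x, width):
    """Convert a feature width in the units of x to an index-based half-window.

    Hypothetical helper (not part of pybaselines); assumes x is sorted and
    evenly spaced.
    """
    spacing = (x[-1] - x[0]) / (len(x) - 1)   # data units per index
    points = width / spacing                  # feature width measured in points
    return max(1, round(points / 2))          # half of that width, at least 1


# 1001 points spanning 0 to 500 (for example cm^-1), so 0.5 units per point.
x = [0.5 * i for i in range(1001)]
# A feature that is 20 units wide covers 40 points, giving a half-window of 20.
half_window = half_window_from_width(x, width=20.0)
```

For unevenly spaced data a single conversion factor does not exist, so a representative spacing (such as the median spacing) would have to be chosen instead.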


noise_median (Noise Median method)

noise_median() estimates the baseline as the median value within a moving window. The resulting baseline is then smoothed by convolving with a Gaussian kernel. This method does not perform well for tightly grouped peaks, since the median within a window is then dominated by peak values rather than baseline values.
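The two steps described above can be sketched in plain Python. This is an illustrative sketch, not the pybaselines implementation: the function name, the truncation of windows at the edges, and the renormalization of the Gaussian weights near the edges are all assumptions made to keep the example self-contained.

```python
import math
from statistics import median


def noise_median_sketch(y, half_window, smooth_half_window, sigma):
    """Moving-window median followed by Gaussian smoothing (illustrative only)."""
    n = len(y)
    # Step 1: median of each window; windows are simply truncated at the edges
    # (an assumption of this sketch, other edge treatments are possible).
    medians = [
        median(y[max(0, i - half_window): i + half_window + 1]) for i in range(n)
    ]
    # Step 2: smooth the medians with a Gaussian kernel, renormalizing the
    # weights wherever the kernel hangs over an edge.
    offsets = range(-smooth_half_window, smooth_half_window + 1)
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    baseline = []
    for i in range(n):
        total = weight = 0.0
        for k, w in zip(offsets, kernel):
            if 0 <= i + k < n:
                total += w * medians[i + k]
                weight += w
        baseline.append(total / weight)
    return baseline


# Flat baseline of 1.0 with one narrow, three-point peak.
y = [1.0] * 50
y[24:27] = [5.0, 10.0, 5.0]
baseline = noise_median_sketch(y, half_window=5, smooth_half_window=3, sigma=1.5)
```

Because the peak occupies only 3 of the 11 points in each window, the median ignores it and the flat baseline is recovered. If the peak covered more than half of the window, the median would follow the peak instead.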



snip (Statistics-sensitive Non-linear Iterative Peak-clipping)

snip() iteratively takes the element-wise minimum of each value and the average of the values at the left and right edges of a window centered at that value. The size of the half-window is incrementally increased from 1 to the specified maximum size, which should be set to approximately half of the index-based width of the largest peak or feature.
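The clipping step described above can be sketched as follows. This is a minimal illustration rather than the library's code: the function name is hypothetical, and points closer to an edge than the current half-window are simply left unchanged, which is an assumption of this sketch.

```python
def snip_sketch(y, max_half_window):
    """Peak clipping with half-windows increasing from 1 (illustrative only)."""
    baseline = list(y)
    n = len(baseline)
    for half_window in range(1, max_half_window + 1):
        previous = baseline[:]  # values from the previous pass
        # Points closer than half_window to an edge are left unchanged here,
        # which is an assumption of this sketch.
        for i in range(half_window, n - half_window):
            edge_mean = 0.5 * (previous[i - half_window] + previous[i + half_window])
            baseline[i] = min(previous[i], edge_mean)
    return baseline


# Flat baseline of 2.0 with a triangular peak centered at index 20.
y = [2.0 + max(0.0, 5.0 - abs(i - 20)) for i in range(41)]
baseline = snip_sketch(y, max_half_window=5)
```

Since each pass only ever replaces a value with something smaller, the result never exceeds the input data. The peak is fully clipped at its center only once the half-window reaches beyond both of its sides, which is why the maximum half-window is tied to the width of the largest feature.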



A smoother baseline can be obtained from the snip function by setting decreasing to True, which reverses the half-window size range to start at the maximum size and end at 1. In addition, smoothing can optionally be applied so that the baseline better fits noisy data. The baselines obtained when using a decreasing window size and smoothing are shown below.
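The effect of reversing the window order can be explored with a small, self-contained sketch. The function name is hypothetical, the decreasing argument only mirrors the option named above, the optional smoothing is omitted, and points near the edges are left unchanged; none of this is taken from the pybaselines source.

```python
def snip_ordered_sketch(y, max_half_window, decreasing=False):
    """Peak clipping with a selectable half-window order (illustrative only)."""
    half_windows = range(1, max_half_window + 1)
    if decreasing:
        # Start at the maximum half-window and end at 1.
        half_windows = reversed(half_windows)
    baseline = list(y)
    n = len(baseline)
    for hw in half_windows:
        previous = baseline[:]
        # Points within hw of an edge are left unchanged (sketch assumption).
        for i in range(hw, n - hw):
            baseline[i] = min(previous[i], 0.5 * (previous[i - hw] + previous[i + hw]))
    return baseline


# Flat baseline of 2.0 with a triangular peak centered at index 20.
y = [2.0 + max(0.0, 5.0 - abs(i - 20)) for i in range(41)]
baseline_increasing = snip_ordered_sketch(y, max_half_window=5)
baseline_decreasing = snip_ordered_sketch(y, max_half_window=5, decreasing=True)
```

With the decreasing order, the widest window clips the peak first and the narrower windows that follow act on the already-clipped result.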

(Figure: snip baselines using a decreasing window size and smoothing.)


swima (Small-Window Moving Average)

swima() iteratively takes the element-wise minimum of the current baseline and the current baseline smoothed with a moving average, where the current baseline is the data on the first iteration and the previous iteration's result afterwards. The window used for the moving average is incrementally increased to smooth out peaks until convergence is reached.
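A rough sketch of this loop is given below. The function names are hypothetical, windows are truncated at the edges, and the stopping rule (the total reduction in one pass falling below a fraction tol of the baseline sum) is a stand-in chosen for this illustration; the convergence test used by pybaselines is not reproduced here.

```python
def moving_average(values, half_window):
    """Moving average with windows truncated at the edges (sketch assumption)."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half_window): i + half_window + 1]
        out.append(sum(window) / len(window))
    return out


def swima_sketch(y, min_half_window=1, max_half_window=None, tol=1e-3):
    """Minimum of the baseline and its moving average, with a growing window."""
    if max_half_window is None:
        max_half_window = (len(y) - 1) // 2
    baseline = list(y)
    for half_window in range(min_half_window, max_half_window + 1):
        smoothed = moving_average(baseline, half_window)
        new_baseline = [min(b, s) for b, s in zip(baseline, smoothed)]
        # Total amount removed in this pass (never negative).
        change = sum(b - nb for b, nb in zip(baseline, new_baseline))
        baseline = new_baseline
        # Assumed stopping rule: stop once a pass removes almost nothing.
        if change <= tol * abs(sum(baseline)):
            break
    return baseline


# Flat baseline of 1.0 with a single sharp peak at index 20.
y = [1.0] * 41
y[20] = 10.0
baseline = swima_sketch(y)
```

Taking the minimum means the estimate can only move downward, so the peak is eroded a little more with each wider window while the flat regions stay where they are.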



ipsa (Iterative Polynomial Smoothing Algorithm)

ipsa() iteratively smooths the input data using a second-order Savitzky–Golay filter until the exit criterion is met.
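Repeated second-order Savitzky–Golay smoothing can be sketched with the standard closed-form weights for quadratic smoothing. Everything else here is an assumption of the illustration rather than the library's behavior: the function names, the padding by repeated edge values, and the exit test based on the relative change between successive iterations.

```python
import math


def savgol_quadratic_weights(half_window):
    """Smoothing weights of a second-order Savitzky-Golay filter (closed form)."""
    m = half_window
    denominator = (2 * m + 3) * (2 * m + 1) * (2 * m - 1)
    return [
        3 * (3 * m * m + 3 * m - 1 - 5 * k * k) / denominator
        for k in range(-m, m + 1)
    ]


def ipsa_sketch(y, half_window, max_iter=500, tol=1e-3):
    """Smooth repeatedly until successive results barely differ (sketch only)."""
    weights = savgol_quadratic_weights(half_window)
    n = len(y)
    baseline = list(y)
    for _ in range(max_iter):
        # Pad by repeating the edge values (an assumption of this sketch).
        padded = [baseline[0]] * half_window + baseline + [baseline[-1]] * half_window
        smoothed = [
            sum(w * padded[i + j] for j, w in enumerate(weights)) for i in range(n)
        ]
        # Assumed exit test: relative L2 change between successive iterations.
        change = math.sqrt(sum((s - b) ** 2 for s, b in zip(smoothed, baseline)))
        scale = math.sqrt(sum(b * b for b in baseline))
        baseline = smoothed
        if change <= tol * scale:
            break
    return baseline


# Flat baseline of 1.0 with a triangular peak centered at index 30.
y = [1.0 + max(0.0, 5.0 - abs(i - 30)) for i in range(61)]
baseline = ipsa_sketch(y, half_window=5)
```

For a half-window of 2 the weights reduce to the familiar (-3, 12, 17, 12, -3)/35 pattern, which is a quick way to check the closed form.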



ria (Range Independent Algorithm)

ria() first extrapolates a linear baseline from the left and/or right edge of the data and adds Gaussian peaks to these extended regions, similar to the optimize_extended_range function, recording the initial areas of the added peaks. The data is then iteratively smoothed using a zero-order Savitzky–Golay filter (a moving average) until, after subtracting the smoothed data from the initial data, the areas of the extended regions are close to their starting areas.
