Optimizer Baselines

Introduction

Optimizer algorithms build upon other baseline algorithms to improve their results.

Algorithms

optimize_extended_range

The optimize_extended_range() function is based on the Extended Range Penalized Least Squares (erPLS) method, but extends its usage to all Whittaker-smoothing-based, polynomial, and spline algorithms.

In this algorithm, a linear baseline is extrapolated from the left and/or right edges, Gaussian peaks are added to these baselines, and then the original data plus the extensions are input into the indicated Whittaker or polynomial function. An example of data with added baseline and Gaussian peaks is shown below.

(Source code)

A range of lam or poly_order values are tested, and the value that best fits the added linear regions is selected as the optimal parameter.

(Source code)

It should be noted that the optimal lam value obtained from optimize_extended_range() cannot be directly used for fitting other data using the same method since the optimal lam value corresponds to the padded data; since lam has a dependance on data size, the optimal lam value for fitting non-padded data will be slightly lower than the optimal value obtained from optimize_extended_range().

collab_pls (Collaborative Penalized Least Squares)

collab_pls() is intended for fitting multiple datasets of related data, and can use any Whittaker-smoothing-based or spline method. The general idea is that using multiple sets of data should be better able to estimate the overall baseline rather than individually fitting each set of data.

There are two ways the collab_pls function can fit datasets. The dataset can be averaged and then fit once with the selected method, and then the output weights are used to individually fit each set of data. The other method individually fits each set of data, averages the weighting, and then uses the averaged weights to individually fit each set of data. The figure below shows the comparison of the baselines fit by the collab_pls algorithm versus the individual baselines from the mpls method.

(Source code)

There is no figure showing the fits for various baseline types for this method since it requires multiple sets of data for each baseline type.

adaptive_minmax (Adaptive MinMax)

adaptive_minmax() uses two different polynomial orders and two different weighting schemes to create a total of four fits. The polynomial order(s) can be specified by the user, or else they will be estimated by the signal-to-noise ratio of the data. The first weighting scheme is either all points weighted equally or using user-specified weights. The second weighting scheme places a much higher weight on points near the two ends of the data to provide better fits in certain circumstances.

Each of the four fits uses thresholding (the "min" part of the name) to estimate the baseline. The final baseline is then computed as the element-wise maximum of the four fits (the "max" part of the name).

(Source code)

custom_bc (Customized Baseline Correction)

custom_bc() allows fine tuning the stiffness of the baseline within different regions of the fit data, which is helpful when experimental data has drastically different baselines within it. This is done by reducing the number of data points in regions where higher stiffness is required. There is no figure showing the fits for various baseline types for this method since it is more suited for hard-to-fit data; however, an example showcases its use.