pybaselines.Baseline2D.aspls

Baseline2D.aspls(data, lam=100000.0, diff_order=2, max_iter=100, tol=0.001, weights=None, alpha=None, asymmetric_coef=0.5)

Adaptive smoothness penalized least squares smoothing (asPLS).

Parameters:
data : array_like, shape (M, N)

The y-values of the measured data. Must not contain missing data (NaN) or Inf.

lam : float or sequence[float, float], optional

The smoothing parameter for the rows and columns, respectively. If a single value is given, both will use the same value. Larger values will create smoother baselines. Default is 1e5.

diff_order : int or sequence[int, int], optional

The order of the differential matrix for the rows and columns, respectively. If a single value is given, both will use the same value. Must be greater than 0. Default is 2 (second order differential matrix). Typical values are 2 or 1.

max_iter : int, optional

The maximum number of fit iterations. Default is 100.

tol : float, optional

The exit criterion: iteration stops once the calculated tolerance falls below this value. Default is 1e-3.

weights : array_like, shape (M, N), optional

The weighting array. If None (default), then the initial weights will be an array with shape equal to (M, N) and all values set to 1.

alpha : array_like, shape (M, N), optional

An array of values that control the local value of lam to better fit peak and non-peak regions. If None (default), then the initial values will be an array with shape equal to (M, N) and all values set to 1.
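As a rough illustration of how alpha can localize the smoothing, the sketch below scales the absolute residual to the range [0, 1], which is the update described in the asPLS paper. The helper name `update_alpha` and the sample arrays are hypothetical, and this is not claimed to be the exact pybaselines implementation.

```python
import numpy as np


def update_alpha(data, baseline):
    """Hypothetical helper: alpha as the absolute residual scaled to [0, 1]."""
    abs_residual = np.abs(data - baseline)
    max_residual = abs_residual.max()
    if max_residual == 0:
        # Nothing to scale for a perfect fit; fall back to the documented
        # initial value of all ones.
        return np.ones_like(abs_residual)
    return abs_residual / max_residual


# Made-up 2D data with one large peak at position [0, 2].
data = np.array([[1.0, 1.2, 5.0], [0.9, 1.1, 1.0]])
baseline = np.ones_like(data)
alpha = update_alpha(data, baseline)
print(alpha)
```

Under this assumption, points far from the current baseline (peaks) get alpha near 1 and so a larger local lam, while points close to the baseline get a smaller one.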

asymmetric_coef : float, optional

The asymmetric coefficient for the weighting. Higher values lead to a steeper weighting curve (i.e. more step-like). Default is 0.5.
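To make the effect of asymmetric_coef concrete, here is a minimal sketch that assumes the logistic weighting form from the asPLS paper, w = 1 / (1 + exp(k (r - sigma) / sigma)), where r is the residual (data minus baseline) and sigma is a noise estimate. The function name `logistic_weights` and the sample values are hypothetical; the library's internal weighting may differ in detail.

```python
import numpy as np


def logistic_weights(residual, sigma, asymmetric_coef=0.5):
    """Assumed asPLS-style weights: 1 / (1 + exp(k * (r - sigma) / sigma))."""
    if asymmetric_coef <= 0:
        raise ValueError("asymmetric_coef must be greater than 0")
    return 1.0 / (1.0 + np.exp(asymmetric_coef * (residual - sigma) / sigma))


sigma = 1.0
residual = np.linspace(-3.0, 5.0, 9)  # residuals from -3*sigma to 5*sigma

gentle = logistic_weights(residual, sigma, asymmetric_coef=0.5)  # library default
steep = logistic_weights(residual, sigma, asymmetric_coef=2.0)   # value listed in the paper

for r, w_gentle, w_steep in zip(residual, gentle, steep):
    print(f"r = {r:+.0f}: k=0.5 -> {w_gentle:.4f}, k=2 -> {w_steep:.4f}")
```

Both curves cross 0.5 where the residual equals sigma; the larger coefficient pushes the weights toward 0 or 1 much faster on either side of that point.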

Returns:
baseline : numpy.ndarray, shape (M, N)

The calculated baseline.

params : dict

A dictionary with the following items:

  • 'weights': numpy.ndarray, shape (M, N)

    The weight array used for fitting the data.

  • 'alpha': numpy.ndarray, shape (M, N)

    The array of alpha values used for fitting the data in the final iteration.

  • 'tol_history': numpy.ndarray

    An array containing the calculated tolerance values for each iteration. The length of the array is the number of iterations completed. If the last value in the array is greater than the input tol value, then the function did not converge.
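The convergence rule stated for 'tol_history' can be checked with a small helper such as the one below. The function name `converged` and the two hand-built dictionaries are hypothetical stand-ins for a real params result.

```python
import numpy as np


def converged(params, tol):
    """Return True unless the last recorded tolerance is greater than tol."""
    tol_history = np.asarray(params['tol_history'])
    if tol_history.size == 0:
        # No iterations were recorded, so nothing indicates convergence.
        return False
    return bool(tol_history[-1] <= tol)


# Made-up histories: one that drops below tol and one that does not.
params_ok = {'tol_history': np.array([0.8, 0.05, 0.0004])}
params_stalled = {'tol_history': np.array([0.8, 0.3, 0.02])}

print(converged(params_ok, tol=1e-3))
print(converged(params_stalled, tol=1e-3))
print(len(params_ok['tol_history']))  # number of iterations completed
```

If the check fails, raising max_iter or loosening tol are the two documented knobs to try.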

Raises:
ValueError

Raised if alpha and data do not have the same shape. Also raised if asymmetric_coef is not greater than 0.

Notes

The default asymmetric coefficient (k in the asPLS paper) is 0.5 instead of the 2 listed in the asPLS paper. pybaselines uses 0.5 because it matches the results in Table 2 and Figure 5 of the asPLS paper more closely than a factor of 2 does, and it fits noisy data much better.

References

Zhang, F., et al. Baseline correction for infrared spectra using adaptive smoothness parameter penalized least squares method. Spectroscopy Letters, 2020, 53(3), 222-233.