pybaselines.Baseline2D.irsqr

Baseline2D.irsqr(data, lam=1000.0, quantile=0.05, num_knots=25, spline_degree=3, diff_order=3, max_iter=100, tol=1e-06, weights=None, eps=None)

Iterative Reweighted Spline Quantile Regression (IRSQR).

Fits the baseline using quantile regression with penalized splines.
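As background, quantile regression minimizes an asymmetric ("pinball") loss rather than a squared error. The sketch below is a minimal, standard-library illustration of that loss for a single residual. The function name `pinball_loss` is hypothetical and is not part of pybaselines.

```python
def pinball_loss(residual, quantile):
    """Quantile (pinball) loss for one residual, where residual = data - baseline."""
    if residual >= 0:
        # Data lies on or above the baseline: cost scales with quantile.
        return quantile * residual
    # Data lies below the baseline: cost scales with (1 - quantile).
    return (quantile - 1.0) * residual


# With quantile=0.05, data above the baseline (for example, a peak) is cheap to
# ignore, while data the same distance below the baseline costs 19 times more.
above = pinball_loss(2.0, 0.05)   # data is 2 units above the baseline
below = pinball_loss(-2.0, 0.05)  # data is 2 units below the baseline
print(above, below)
```

Because points below the fit are penalized far more heavily than points above it, a small quantile pulls the fitted curve toward the lower envelope of the data.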

Parameters:
data : array_like, shape (M, N)

The y-values of the measured data. Must not contain missing data (NaN) or Inf.

lam : float or sequence[float, float], optional

The smoothing parameter for the rows and columns, respectively. If a single value is given, both will use the same value. Larger values will create smoother baselines. Default is 1e3.

quantile : float, optional

The quantile at which to fit the baseline. Default is 0.05.

num_knots : int or sequence[int, int], optional

The number of knots for the splines along the rows and columns, respectively. If a single value is given, both will use the same value. Default is 25.

spline_degree : int or sequence[int, int], optional

The degree of the splines along the rows and columns, respectively. If a single value is given, both will use the same value. Default is 3, which is a cubic spline.

diff_order : int or sequence[int, int], optional

The order of the differential matrix for the rows and columns, respectively. If a single value is given, both will use the same value. Must be greater than 0. Default is 3 (third order differential matrix). Typical values are 2 or 3.

max_iter : int, optional

The maximum number of fit iterations. Default is 100.

tol : float, optional

The exit criterion. Iteration stops once the calculated tolerance falls below this value. Default is 1e-6.

weights : array_like, shape (M, N), optional

The weighting array. If None (default), then the initial weights will be an array with shape equal to (M, N) and all values set to 1.

eps : float, optional

A small value added to the square of the residual to prevent dividing by 0. Default is None, which uses the square of the maximum-absolute-value of the fit each iteration multiplied by 1e-6.
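To illustrate why eps is needed, the sketch below assumes a weight of the form commonly used in iteratively reweighted quantile regression: an asymmetric constant divided by the square root of the squared residual plus eps. This form is an assumption made for illustration and may differ from the library's exact expression. `quantile_weight` is a hypothetical helper, and eps is passed explicitly rather than derived from the fit.

```python
import math


def quantile_weight(residual, quantile, eps):
    """Hypothetical per-point weight, where residual = data - fit.

    Assumed form: numerator / sqrt(residual**2 + eps).
    """
    # Asymmetric numerator: points above the fit get the smaller constant
    # when quantile is small, so peaks are down-weighted.
    numerator = quantile if residual > 0 else 1.0 - quantile
    # Without eps, a residual of exactly 0 would divide by zero here.
    return numerator / math.sqrt(residual ** 2 + eps)


w_zero = quantile_weight(0.0, 0.05, eps=1e-12)  # finite because eps > 0
w_peak = quantile_weight(5.0, 0.05, eps=1e-12)  # point well above the fit
print(w_zero, w_peak)
```

Under this assumed form, a point that the fit passes through exactly gets a large but finite weight, and a point far above the fit gets a small one.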

Returns:
baseline : numpy.ndarray, shape (M, N)

The calculated baseline.

params : dict

A dictionary with the following items:

  • 'weights': numpy.ndarray, shape (M, N)

    The weight array used for fitting the data.

  • 'tol_history': numpy.ndarray

    An array containing the calculated tolerance values for each iteration. The length of the array is the number of iterations completed. If the last value in the array is greater than the input tol value, then the function did not converge.
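The convergence rule described for 'tol_history' can be checked after a fit. The sketch below is a minimal illustration using a plain list with made-up values in place of the returned array. `did_converge` is a hypothetical helper, not a pybaselines function, and treating an empty history as not converged is an assumption.

```python
def did_converge(tol_history, tol):
    """Return True if the last recorded tolerance is not greater than tol."""
    if len(tol_history) == 0:
        # Assumption: no completed iterations means no evidence of convergence.
        return False
    return bool(tol_history[-1] <= tol)


# Made-up values standing in for params['tol_history'] from a real fit.
params = {'tol_history': [0.5, 0.01, 3e-7]}
print(did_converge(params['tol_history'], tol=1e-6))
```

If this check fails, raising max_iter or loosening tol are the two inputs documented above that affect when iteration stops.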

Raises:
ValueError

Raised if quantile is not between 0 and 1.
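Callers that take quantile from user input may prefer to validate it before fitting. The sketch below is a minimal example that assumes both endpoints are excluded, since the description above does not say whether 0 and 1 themselves are accepted. `check_quantile` and its error message are hypothetical.

```python
def check_quantile(quantile):
    """Return quantile unchanged if it lies strictly between 0 and 1."""
    # Assumption: 0 and 1 themselves are rejected.
    if not 0 < quantile < 1:
        raise ValueError('quantile must be between 0 and 1')
    return quantile


try:
    check_quantile(1.5)
    error_message = None
except ValueError as exc:
    # Reached for 1.5, which is outside the allowed range.
    error_message = str(exc)
print(error_message)
```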

References

Han, Q., et al. Iterative Reweighted Quantile Regression Using Augmented Lagrangian Optimization for Baseline Correction. 2018 5th International Conference on Information Science and Control Engineering (ICISCE), 2018, 280-284.