pybaselines.Baseline2D.pspline_brpls
- Baseline2D.pspline_brpls(data, lam=1000.0, num_knots=25, spline_degree=3, diff_order=2, max_iter=50, tol=0.001, max_iter_2=50, tol_2=0.001, weights=None)
A penalized spline version of the brPLS algorithm.
- Parameters:
- data
array_like, shape (M, N) The y-values of the measured data. Must not contain missing data (NaN) or Inf.
- lam
float or sequence[float, float], optional The smoothing parameter for the rows and columns, respectively. If a single value is given, both will use the same value. Larger values will create smoother baselines. Default is 1e3.
- num_knots
int or sequence[int, int], optional The number of knots for the splines along the rows and columns, respectively. If a single value is given, both will use the same value. Default is 25.
- spline_degree
int or sequence[int, int], optional The degree of the splines along the rows and columns, respectively. If a single value is given, both will use the same value. Default is 3, which is a cubic spline.
- diff_order
int or sequence[int, int], optional The order of the differential matrix for the rows and columns, respectively. If a single value is given, both will use the same value. Must be greater than 0. Default is 2 (second order differential matrix). Typical values are 1 or 2.
- max_iter
int, optional The max number of fit iterations. Default is 50.
- tol
float, optional The exit criterion. Default is 1e-3.
- max_iter_2
int, optional The maximum number of iterations for updating the proportion of data occupied by peaks. Default is 50.
- tol_2
float, optional The exit criterion for the change between successive estimates of the proportion of data occupied by peaks. Default is 1e-3.
- weights
array_like, shape (M, N), optional The weighting array. If None (default), then the initial weights will be an array with shape (M, N) and all values set to 1.
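Several of the parameters above (lam, num_knots, spline_degree, diff_order) accept either a single value or a (rows, columns) pair. The following is a minimal sketch of that convention. The helper name as_row_col_pair is hypothetical and is not part of pybaselines; it only illustrates how a single value is shared between both dimensions.

```python
from collections.abc import Sequence


def as_row_col_pair(value):
    """Expand a scalar or a 2-item sequence into a (rows, columns) tuple.

    Hypothetical helper for illustration only; not a pybaselines function.
    """
    if isinstance(value, Sequence) and not isinstance(value, (str, bytes)):
        if len(value) != 2:
            raise ValueError("expected a single value or a sequence of two values")
        return (value[0], value[1])
    # A single value applies to both the rows and the columns.
    return (value, value)


# Same smoothing parameter for rows and columns.
lam_rows, lam_cols = as_row_col_pair(1e3)
# Different numbers of knots along the rows and the columns.
knots_rows, knots_cols = as_row_col_pair((25, 40))
```

Passing a pair is useful when the two dimensions of the data have different resolutions or different baseline curvature.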
- Returns:
- baseline
numpy.ndarray, shape (M, N) The calculated baseline.
- params
dict A dictionary with the following items:
- 'weights': numpy.ndarray, shape (M, N)
The weight array used for fitting the data.
- 'tol_history': numpy.ndarray, shape (J, K)
An array containing the calculated tolerance values for each iteration of both the threshold values and the fit values. Index 0 holds the tolerance values for the difference in the peak proportion, and indices >= 1 hold the tolerance values for each fit. Entries that were not used in fitting are 0. J is 2 plus the number of iterations for the threshold to converge (related to max_iter_2 and tol_2), and K is the maximum of the number of iterations for the threshold and the maximum number of iterations for all of the fits of the various threshold values (related to max_iter and tol).
See also
References
Wang, Q., et al. Spectral baseline estimation using penalized least squares with weights derived from the Bayesian method. Nuclear Science and Techniques, 2022, 140, 250-257.
Eilers, P., et al. Splines, knots, and penalties. Wiley Interdisciplinary Reviews: Computational Statistics, 2010, 2(6), 637-653.