pybaselines.Baseline.optimize_extended_range
- Baseline.optimize_extended_range(data, method='asls', side='both', width_scale=0.1, height_scale=1.0, sigma_scale=0.08333333333333333, min_value=2, max_value=8, step=1, pad_kwargs=None, method_kwargs=None)
Extends data and finds the best parameter value for the given baseline method.
Adds additional data to the left and/or right of the input data, and then iterates through parameter values to find the best fit. Useful for calculating the optimum lam or poly_order value required to optimize other algorithms.
- Parameters:
- data : array_like, shape (N,)
The y-values of the measured data, with N data points.
- method : str, optional
A string indicating the Whittaker-smoothing-based, polynomial, or spline method to use for fitting the baseline. Default is 'asls'.
- side : {'both', 'left', 'right'}, optional
The side of the measured data to extend. Default is 'both'.
- width_scale : float, optional
The number of data points added to each side is width_scale * N. Default is 0.1.
- height_scale : float, optional
The height of the added Gaussian peak(s) is calculated as height_scale * max(data). Default is 1.0.
- sigma_scale : float, optional
The sigma value for the added Gaussian peak(s) is calculated as sigma_scale * width_scale * N. Default is 1/12, which will make the Gaussian span ± 6 sigma, making its total width about half of the added length.
- min_value : int or float, optional
The minimum value for the lam or poly_order value to use with the indicated method. If using a polynomial method, min_value must be an integer. If using a Whittaker-smoothing-based method, min_value should be the exponent to raise to the power of 10 (e.g. a min_value of 2 designates a lam value of 10**2). Default is 2.
- max_value : int or float, optional
The maximum value for the lam or poly_order value to use with the indicated method. If using a polynomial method, max_value must be an integer. If using a Whittaker-smoothing-based method, max_value should be the exponent to raise to the power of 10 (e.g. a max_value of 3 designates a lam value of 10**3). Default is 8.
- step : int or float, optional
The step size for iterating the parameter value from min_value to max_value. If using a polynomial method, step must be an integer. If using a Whittaker-smoothing-based method, step is applied to the exponent of 10 (e.g. a step of 1 multiplies lam by 10**1 between successive fits). Default is 1.
- pad_kwargs : dict, optional
A dictionary of options to pass to pad_edges() for padding the edges of the data when adding the extended left and/or right sections. Default is None, which will use an empty dictionary.
- method_kwargs : dict, optional
A dictionary of keyword arguments to pass to the selected method function. Default is None, which will use an empty dictionary.
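To make the meaning of min_value, max_value, and step concrete, here is a small sketch of how the candidate parameter values could be enumerated. The helper `candidate_values` is hypothetical and not part of pybaselines; it assumes a positive step and that max_value itself is included in the search.

```python
def candidate_values(min_value=2, max_value=8, step=1, polynomial=False):
    """Hypothetical helper: list the parameter values that would be tried.

    Assumes step > 0 and that max_value is included in the search.
    """
    if polynomial:
        # Polynomial methods iterate poly_order directly, so integers are required.
        for name, value in (('min_value', min_value),
                            ('max_value', max_value),
                            ('step', step)):
            if not isinstance(value, int):
                raise TypeError(f'{name} must be an integer for polynomial methods')
        return list(range(min_value, max_value + 1, step))

    # Whittaker-smoothing-based methods treat the inputs as exponents of 10.
    # The small tolerance guards against floating-point round-off.
    n_steps = int((max_value - min_value) / step + 1e-9)
    return [10.0 ** (min_value + i * step) for i in range(n_steps + 1)]


lam_values = candidate_values()                              # defaults: 10**2 ... 10**8
poly_orders = candidate_values(1, 5, 2, polynomial=True)     # orders 1, 3, 5
```

With the defaults this gives seven lam values, each ten times the previous one.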
- Returns:
- baseline : numpy.ndarray, shape (N,)
The baseline calculated with the optimum parameter.
- method_params : dict
A dictionary with the following items:
- 'optimal_parameter': int or float
The lam or poly_order value that produced the lowest root-mean-squared-error.
- 'min_rmse': float
Deprecated since version 1.2.0: The 'min_rmse' key will be removed from the method_params dictionary in pybaselines version 1.4.0 in favor of the new 'rmse' key, which returns all root-mean-squared-error values.
- 'rmse': numpy.ndarray
The array of the calculated root-mean-squared-error for each of the fits.
- 'method_params': dict
A dictionary containing the output parameters for the optimal fit. Items will depend on the selected method.
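Because 'min_rmse' is deprecated in favor of 'rmse', code that reads the lowest error may want to handle both keys. The helper below is a hypothetical sketch, not part of pybaselines, and the dictionaries passed to it are invented stand-ins for the returned method_params.

```python
def lowest_rmse(params):
    """Hypothetical helper: lowest RMSE from a method_params-style dict."""
    if 'rmse' in params:
        # Newer key: one RMSE value per fitted parameter value.
        return float(min(params['rmse']))
    # Older key: only the single lowest value is stored.
    return float(params['min_rmse'])


# Invented example dictionaries, standing in for real output.
new_style = {'optimal_parameter': 1000.0, 'rmse': [0.3, 0.1, 0.2]}
old_style = {'optimal_parameter': 1000.0, 'min_rmse': 0.25}

best_new = lowest_rmse(new_style)
best_old = lowest_rmse(old_style)
```

Preferring 'rmse' when it is present means the same code keeps working after the deprecated key is removed.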
- Raises:
ValueError
Raised if side is not 'left', 'right', or 'both'.
TypeError
Raised if using a polynomial method and min_value, max_value, or step is not an integer.
ValueError
Raised if using a Whittaker-smoothing-based method and min_value, max_value, or step is greater than 100.
Notes
Based on the extended range penalized least squares (erPLS) method from [1]. The method proposed by [1] was for optimizing lambda only for the aspls method by extending only the right side of the spectrum. The method was modified by allowing extending either side following [2], and for optimizing lambda or the polynomial degree for all of the affected algorithms in pybaselines.
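The idea can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the pybaselines implementation: it extends only the right side, pads with a straight line fitted to the last points, uses a small dense asymmetric least squares routine (`asls`, written here for the sketch), and scores each lam by the RMSE between the fit and the known padded background in the added region. The function names and the synthetic data are hypothetical.

```python
import numpy as np


def asls(y, lam, p=0.01, n_iter=10):
    """Dense asymmetric least squares baseline; suitable for small arrays only."""
    n = y.size
    diff = np.diff(np.eye(n), 2, axis=0)  # second-order difference matrix
    penalty = lam * diff.T @ diff
    weights = np.ones(n)
    for _ in range(n_iter):
        z = np.linalg.solve(np.diag(weights) + penalty, weights * y)
        # Points above the current fit get the small weight p.
        weights = np.where(y > z, p, 1 - p)
    return z


def extended_range_sketch(y, exponents, width_scale=0.1,
                          height_scale=1.0, sigma_scale=1 / 12):
    n = y.size
    added = int(width_scale * n)

    # Assumed padding: continue a line fitted to the last `added` points.
    slope, intercept = np.polyfit(np.arange(n - added, n), y[-added:], 1)
    known_background = slope * np.arange(n, n + added) + intercept

    # Gaussian peak centered in the added region.
    t = np.linspace(-added / 2, added / 2, added)
    peak = height_scale * y.max() * np.exp(-0.5 * (t / (sigma_scale * added)) ** 2)
    y_ext = np.concatenate([y, known_background + peak])

    # Score each lam by how well the fit recovers the known background.
    rmse = np.array([
        np.sqrt(np.mean((asls(y_ext, 10.0 ** e)[n:] - known_background) ** 2))
        for e in exponents
    ])
    best_lam = 10.0 ** exponents[int(np.argmin(rmse))]
    return asls(y_ext, best_lam)[:n], best_lam, rmse


# Synthetic data, assumed for illustration: one peak on a sloped baseline.
x = np.linspace(0, 200, 200)
y = 2 + 0.01 * x + 5 * np.exp(-0.5 * ((x - 100) / 5) ** 2)
exponents = [2, 3, 4, 5, 6]
baseline, best_lam, rmse = extended_range_sketch(y, exponents)
```

The added region works as a test case with a known answer: a lam that lets the fit follow the artificial peak, or that is too stiff to follow the background, gives a larger error there.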
References
[1] Zhang, F., et al. An Automatic Baseline Correction Method Based on the Penalized Least Squares Method. Sensors, 2020, 20(7), 2015.
[2] Krishna, H., et al. Range-independent background subtraction algorithm for recovery of Raman spectra of biological tissue. Journal of Raman Spectroscopy, 2012, 43(12), 1884-1894.