pybaselines.morphological
Module Contents
Functions
- amormol: Iteratively averaging morphological and mollified (aMorMol) baseline.
- imor: An Improved Morphological based (IMor) baseline algorithm.
- jbcd: Joint Baseline Correction and Denoising (jbcd) Algorithm.
- mor: A Morphological based (Mor) baseline algorithm.
- mormol: Iterative morphological and mollified (MorMol) baseline.
- mpls: The Morphological penalized least squares (MPLS) baseline algorithm.
- mpspline: Morphology-based penalized spline baseline.
- mwmv: Moving window minimum value (MWMV) baseline.
- rolling_ball: The rolling ball baseline algorithm.
- tophat: Estimates the baseline using a top-hat transformation (morphological opening).
- pybaselines.morphological.amormol(data, half_window=None, tol=0.001, max_iter=200, pad_kwargs=None, x_data=None, **window_kwargs)
Iteratively averaging morphological and mollified (aMorMol) baseline.
- Parameters:
- data : array-like, shape (N,)
The y-values of the measured data, with N data points.
- half_window : int, optional
The half-window used for the morphology functions. If a value is input, then that value will be used. Default is None, which will optimize the half-window size using optimize_window() and window_kwargs.
- tol : float, optional
The exit criterion. Default is 1e-3.
- max_iter : int, optional
The maximum number of iterations. Default is 200.
- pad_kwargs : dict, optional
A dictionary of keyword arguments to pass to pad_edges() for padding the edges of the data to prevent edge effects from convolution.
- x_data : array-like, optional
The x-values. Not used by this function, but input is allowed for consistency with other functions.
- **window_kwargs
Values for setting the half window used for the morphology operations. Items include:
- 'increment': int
The step size for iterating half windows. Default is 1.
- 'max_hits': int
The number of consecutive half windows that must produce the same morphological opening before accepting the half window as the optimum value. Default is 1.
- 'window_tol': float
The tolerance value for considering two morphological openings as equivalent. Default is 1e-6.
- 'max_half_window': int
The maximum allowable window size. If None (default), will be set to (len(data) - 1) / 2.
- 'min_half_window': int
The minimum half-window size. If None (default), will be set to 1.
- Returns:
- baseline : numpy.ndarray, shape (N,)
The calculated baseline.
- params : dict
A dictionary with the following items:
- 'half_window': int
The half window used for the morphological calculations.
- 'tol_history': numpy.ndarray
An array containing the calculated tolerance values for each iteration. The length of the array is the number of iterations completed. If the last value in the array is greater than the input tol value, then the function did not converge.
References
Chen, H., et al. An Adaptive and Fully Automated Baseline Correction Method for Raman Spectroscopy Based on Morphological Operations and Mollifications. Applied Spectroscopy, 2019, 73(3), 284-293.
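For amormol, the 'tol_history' item described under Returns can be used to check whether the iterations converged. The following is a minimal sketch of that check; the helper name did_converge and the sample values are hypothetical and are not part of pybaselines.

```python
def did_converge(tol_history, tol):
    """Return True if the last recorded tolerance is at or below `tol`.

    `tol_history` is assumed to be a 1D sequence with one value per
    completed iteration, as described for the 'tol_history' item.
    """
    if len(tol_history) == 0:
        # No iterations were recorded, so nothing can be confirmed.
        return False
    return tol_history[-1] <= tol


# Hypothetical tolerance values from four iterations.
history = [0.5, 0.08, 0.004, 0.0007]
print(did_converge(history, tol=1e-3))
print("iterations completed:", len(history))
```

The same check applies to the other iterative functions on this page that return a one-dimensional 'tol_history'.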
- pybaselines.morphological.imor(data, half_window=None, tol=0.001, max_iter=200, x_data=None, **window_kwargs)
An Improved Morphological based (IMor) baseline algorithm.
- Parameters:
- data : array-like, shape (N,)
The y-values of the measured data, with N data points.
- half_window : int, optional
The half-window used for the morphology functions. If a value is input, then that value will be used. Default is None, which will optimize the half-window size using optimize_window() and window_kwargs.
- tol : float, optional
The exit criterion. Default is 1e-3.
- max_iter : int, optional
The maximum number of iterations. Default is 200.
- x_data : array-like, optional
The x-values. Not used by this function, but input is allowed for consistency with other functions.
- **window_kwargs
Values for setting the half window used for the morphology operations. Items include:
- 'increment': int
The step size for iterating half windows. Default is 1.
- 'max_hits': int
The number of consecutive half windows that must produce the same morphological opening before accepting the half window as the optimum value. Default is 1.
- 'window_tol': float
The tolerance value for considering two morphological openings as equivalent. Default is 1e-6.
- 'max_half_window': int
The maximum allowable window size. If None (default), will be set to (len(data) - 1) / 2.
- 'min_half_window': int
The minimum half-window size. If None (default), will be set to 1.
- Returns:
- baseline : numpy.ndarray, shape (N,)
The calculated baseline.
- params : dict
A dictionary with the following items:
- 'half_window': int
The half window used for the morphological calculations.
- 'tol_history': numpy.ndarray
An array containing the calculated tolerance values for each iteration. The length of the array is the number of iterations completed. If the last value in the array is greater than the input tol value, then the function did not converge.
References
Dai, L., et al. An Automated Baseline Correction Method Based on Iterative Morphological Operations. Applied Spectroscopy, 2018, 72(5), 731-739.
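The window_kwargs items ('increment', 'max_hits', 'window_tol', 'max_half_window', 'min_half_window') describe a search over half windows that stops once consecutive morphological openings stop changing. The pure-Python sketch below illustrates that idea only. It is not the library's optimize_window(): the function names are hypothetical, windows are simply truncated at the edges, and the comparison uses the maximum absolute difference, all of which are assumptions made for the example.

```python
def moving_min(values, half_window):
    # Erosion: minimum over a window truncated at the edges (assumption).
    return [min(values[max(0, i - half_window):i + half_window + 1])
            for i in range(len(values))]


def moving_max(values, half_window):
    # Dilation: maximum over a window truncated at the edges (assumption).
    return [max(values[max(0, i - half_window):i + half_window + 1])
            for i in range(len(values))]


def opening(values, half_window):
    # Morphological opening is an erosion followed by a dilation.
    return moving_max(moving_min(values, half_window), half_window)


def optimize_half_window(data, increment=1, max_hits=1, window_tol=1e-6,
                         max_half_window=None, min_half_window=None):
    if max_half_window is None:
        max_half_window = (len(data) - 1) // 2
    if min_half_window is None:
        min_half_window = 1

    previous = opening(data, min_half_window)
    best = min_half_window
    hits = 0
    for half_window in range(min_half_window + increment,
                             max_half_window + 1, increment):
        current = opening(data, half_window)
        change = max(abs(a - b) for a, b in zip(previous, current))
        if change < window_tol:
            if hits == 0:
                # Remember the first half window of the matching run.
                best = half_window - increment
            hits += 1
            if hits >= max_hits:
                return best
        else:
            hits = 0
        previous = current
    # No stable opening was found; fall back to the largest allowed window.
    return max_half_window


# A flat baseline with one peak that is three points wide.
data = [0, 0, 0, 0, 5, 5, 5, 0, 0, 0, 0]
print(optimize_half_window(data))
```

In this example a half window of 1 (window of 3 points) still preserves the peak, while half windows of 2 and 3 both remove it, so the search settles on 2.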
- pybaselines.morphological.jbcd(data, half_window=None, alpha=0.1, beta=10.0, gamma=1.0, beta_mult=1.1, gamma_mult=0.909, diff_order=1, max_iter=20, tol=0.01, tol_2=0.001, robust_opening=True, x_data=None, **window_kwargs)
Joint Baseline Correction and Denoising (jbcd) Algorithm.
- Parameters:
- data : array-like, shape (N,)
The y-values of the measured data, with N data points.
- half_window : int, optional
The half-window used for the morphology functions. If a value is input, then that value will be used. Default is None, which will optimize the half-window size using optimize_window() and window_kwargs.
- alpha : float, optional
The regularization parameter that controls how closely the baseline must fit the calculated morphological opening. Larger values make the fit more constrained to the opening and can make the baseline less smooth. Default is 0.1.
- beta : float, optional
The regularization parameter that controls how smooth the baseline is. Larger values produce smoother baselines. Default is 1e1.
- gamma : float, optional
The regularization parameter that controls how smooth the signal is. Larger values produce smoother signals. Default is 1.
- beta_mult : float, optional
The value that beta is multiplied by each iteration. Default is 1.1.
- gamma_mult : float, optional
The value that gamma is multiplied by each iteration. Default is 0.909.
- diff_order : int, optional
The order of the differential matrix. Must be greater than 0. Default is 1 (first order differential matrix). Typical values are 2 or 1.
- max_iter : int, optional
The maximum number of iterations. Default is 20.
- tol : float, optional
The exit criterion for the change in the calculated signal. Default is 1e-2.
- tol_2 : float, optional
The exit criterion for the change in the calculated baseline. Default is 1e-3.
- robust_opening : bool, optional
If True (default), the opening used to represent the initial baseline is the element-wise minimum between the morphological opening and the average of the morphological erosion and dilation of the opening, similar to mor(). If False, the opening is just the morphological opening, as used in the reference. The robust opening typically represents the baseline better.
- x_data : array-like, optional
The x-values. Not used by this function, but input is allowed for consistency with other functions.
- **window_kwargs
Values for setting the half window used for the morphology operations. Items include:
- 'increment': int
The step size for iterating half windows. Default is 1.
- 'max_hits': int
The number of consecutive half windows that must produce the same morphological opening before accepting the half window as the optimum value. Default is 1.
- 'window_tol': float
The tolerance value for considering two morphological openings as equivalent. Default is 1e-6.
- 'max_half_window': int
The maximum allowable window size. If None (default), will be set to (len(data) - 1) / 2.
- 'min_half_window': int
The minimum half-window size. If None (default), will be set to 1.
- Returns:
- baseline : numpy.ndarray, shape (N,)
The calculated baseline.
- params : dict
A dictionary with the following items:
- 'half_window': int
The half window used for the morphological calculations.
- 'tol_history': numpy.ndarray, shape (K, 2)
An array containing the calculated tolerance values for each iteration. Index 0 holds the tolerance values for the relative change in the signal, and index 1 holds the tolerance values for the relative change in the baseline. The length of the array is the number of iterations completed, K. If the last values in the array are greater than the input tol or tol_2 values, then the function did not converge.
- 'signal': numpy.ndarray, shape (N,)
The pure signal portion of the input data without noise or the baseline.
References
Liu, H., et al. Joint Baseline-Correction and Denoising for Raman Spectra. Applied Spectroscopy, 2015, 69(9), 1013-1022.
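For jbcd, beta_mult and gamma_mult rescale beta and gamma on every iteration. The short sketch below only tabulates how the two parameters would evolve with the default values; the function name is hypothetical, and it assumes the multiplication happens at the end of each iteration, which the text above does not specify.

```python
def regularization_schedule(beta=10.0, gamma=1.0, beta_mult=1.1,
                            gamma_mult=0.909, max_iter=20):
    """List the (beta, gamma) pair used in each iteration.

    Assumes the values are multiplied after each iteration completes.
    """
    schedule = []
    for _ in range(max_iter):
        schedule.append((beta, gamma))
        beta *= beta_mult
        gamma *= gamma_mult
    return schedule


schedule = regularization_schedule()
for iteration, (beta, gamma) in enumerate(schedule[:3]):
    print(iteration, round(beta, 4), round(gamma, 4))
```

With the defaults, beta grows while gamma shrinks, so the baseline is smoothed progressively more strongly and the signal progressively less.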
- pybaselines.morphological.mor(data, half_window=None, x_data=None, **window_kwargs)
A Morphological based (Mor) baseline algorithm.
- Parameters:
- data : array-like, shape (N,)
The y-values of the measured data, with N data points.
- half_window : int, optional
The half-window used for the morphology functions. If a value is input, then that value will be used. Default is None, which will optimize the half-window size using optimize_window() and window_kwargs.
- x_data : array-like, optional
The x-values. Not used by this function, but input is allowed for consistency with other functions.
- **window_kwargs
Values for setting the half window used for the morphology operations. Items include:
- 'increment': int
The step size for iterating half windows. Default is 1.
- 'max_hits': int
The number of consecutive half windows that must produce the same morphological opening before accepting the half window as the optimum value. Default is 1.
- 'window_tol': float
The tolerance value for considering two morphological openings as equivalent. Default is 1e-6.
- 'max_half_window': int
The maximum allowable window size. If None (default), will be set to (len(data) - 1) / 2.
- 'min_half_window': int
The minimum half-window size. If None (default), will be set to 1.
- Returns:
- baseline : numpy.ndarray, shape (N,)
The calculated baseline.
- params : dict
A dictionary with the following items:
- 'half_window': int
The half window used for the morphological calculations.
References
Perez-Pueyo, R., et al. Morphology-Based Automated Baseline Removal for Raman Spectra of Artistic Pigments. Applied Spectroscopy, 2010, 64, 595-600.
- pybaselines.morphological.mormol(data, half_window=None, tol=0.001, max_iter=250, smooth_half_window=None, pad_kwargs=None, x_data=None, **window_kwargs)
Iterative morphological and mollified (MorMol) baseline.
- Parameters:
- data : array-like, shape (N,)
The y-values of the measured data, with N data points.
- half_window : int, optional
The half-window used for the morphology functions. If a value is input, then that value will be used. Default is None, which will optimize the half-window size using optimize_window() and window_kwargs.
- tol : float, optional
The exit criterion. Default is 1e-3.
- max_iter : int, optional
The maximum number of iterations. Default is 250.
- smooth_half_window : int, optional
The half-window to use for smoothing the data before performing the morphological operation. Default is None, which will use a value of 1, which gives no smoothing.
- pad_kwargs : dict, optional
A dictionary of keyword arguments to pass to pad_edges() for padding the edges of the data to prevent edge effects from convolution.
- x_data : array-like, optional
The x-values. Not used by this function, but input is allowed for consistency with other functions.
- **window_kwargs
Values for setting the half window used for the morphology operations. Items include:
- 'increment': int
The step size for iterating half windows. Default is 1.
- 'max_hits': int
The number of consecutive half windows that must produce the same morphological opening before accepting the half window as the optimum value. Default is 1.
- 'window_tol': float
The tolerance value for considering two morphological openings as equivalent. Default is 1e-6.
- 'max_half_window': int
The maximum allowable window size. If None (default), will be set to (len(data) - 1) / 2.
- 'min_half_window': int
The minimum half-window size. If None (default), will be set to 1.
- Returns:
- baseline : numpy.ndarray, shape (N,)
The calculated baseline.
- params : dict
A dictionary with the following items:
- 'half_window': int
The half window used for the morphological calculations.
- 'tol_history': numpy.ndarray
An array containing the calculated tolerance values for each iteration. The length of the array is the number of iterations completed. If the last value in the array is greater than the input tol value, then the function did not converge.
References
Koch, M., et al. Iterative morphological and mollifier-based baseline correction for Raman spectra. J Raman Spectroscopy, 2017, 48(2), 336-342.
- pybaselines.morphological.mpls(data, half_window=None, lam=1000000.0, p=0.0, diff_order=2, tol=0.001, max_iter=50, weights=None, x_data=None, **window_kwargs)
The Morphological penalized least squares (MPLS) baseline algorithm.
- Parameters:
- data : array-like, shape (N,)
The y-values of the measured data, with N data points.
- half_window : int, optional
The half-window used for the morphology functions. If a value is input, then that value will be used. Default is None, which will optimize the half-window size using optimize_window() and window_kwargs.
- lam : float, optional
The smoothing parameter. Larger values will create smoother baselines. Default is 1e6.
- p : float, optional
The penalizing weighting factor. Must be between 0 and 1. Anchor points identified by the procedure in [1] are given a weight of 1 - p, and all other points have a weight of p. Default is 0.0.
- diff_order : int, optional
The order of the differential matrix. Must be greater than 0. Default is 2 (second order differential matrix). Typical values are 2 or 1.
- max_iter : int, optional
The maximum number of fit iterations. Default is 50.
- tol : float, optional
The exit criterion. Default is 1e-3.
- weights : array-like, shape (N,), optional
The weighting array. If None (default), then the weights will be calculated following the procedure in [1].
- x_data : array-like, optional
The x-values. Not used by this function, but input is allowed for consistency with other functions.
- **window_kwargs
Values for setting the half window used for the morphology operations. Items include:
- 'increment': int
The step size for iterating half windows. Default is 1.
- 'max_hits': int
The number of consecutive half windows that must produce the same morphological opening before accepting the half window as the optimum value. Default is 1.
- 'window_tol': float
The tolerance value for considering two morphological openings as equivalent. Default is 1e-6.
- 'max_half_window': int
The maximum allowable window size. If None (default), will be set to (len(data) - 1) / 2.
- 'min_half_window': int
The minimum half-window size. If None (default), will be set to 1.
- Returns:
- baseline : numpy.ndarray, shape (N,)
The calculated baseline.
- params : dict
A dictionary with the following items:
- 'weights': numpy.ndarray, shape (N,)
The weight array used for fitting the data.
- 'half_window': int
The half window used for the morphological calculations.
- Raises:
- ValueError
Raised if p is not between 0 and 1.
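The weighting rule for p described above (anchor points receive 1 - p, all other points receive p) can be sketched as follows. The helper name anchor_weights and the boolean anchor mask are hypothetical; how the anchor points themselves are identified is not shown here.

```python
def anchor_weights(is_anchor, p=0.0):
    """Assign 1 - p to anchor points and p to all other points.

    `is_anchor` is a hypothetical sequence of booleans marking which
    data points were identified as anchor (baseline) points.
    """
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    return [1 - p if flag else p for flag in is_anchor]


mask = [True, False, False, True]
print(anchor_weights(mask))          # default p=0.0 ignores non-anchor points
print(anchor_weights(mask, p=0.25))
```

With the default p of 0.0, only the anchor points contribute to the fit, so the smoothing penalty alone determines the baseline between them.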
References
- pybaselines.morphological.mpspline(data, half_window=None, lam=10000.0, lam_smooth=0.01, p=0.0, num_knots=100, spline_degree=3, diff_order=2, weights=None, pad_kwargs=None, x_data=None, **window_kwargs)
Morphology-based penalized spline baseline.
Identifies baseline points using morphological operations, and then uses weighted least-squares to fit a penalized spline to the baseline.
- Parameters:
- data : array-like, shape (N,)
The y-values of the measured data, with N data points.
- half_window : int, optional
The half-window used for the morphology functions. If a value is input, then that value will be used. Default is None, which will optimize the half-window size using optimize_window() and window_kwargs.
- lam : float, optional
The smoothing parameter for the penalized spline when fitting the baseline. Larger values will create smoother baselines. Default is 1e4. Larger values are needed for larger num_knots.
- lam_smooth : float, optional
The smoothing parameter for the penalized spline when smoothing the input data. Default is 1e-2. Larger values are needed for noisy data or for larger num_knots.
- p : float, optional
The penalizing weighting factor. Must be between 0 and 1. Anchor points identified by the procedure in the reference are given a weight of 1 - p, and all other points have a weight of p. Default is 0.0.
- num_knots : int, optional
The number of knots for the spline. Default is 100.
- spline_degree : int, optional
The degree of the spline. Default is 3, which is a cubic spline.
- diff_order : int, optional
The order of the differential matrix. Must be greater than 0. Default is 2 (second order differential matrix). Typical values are 2 or 3.
- weights : array-like, shape (N,), optional
The weighting array. If None (default), then the weights will be calculated following the procedure in the reference.
- x_data : array-like, shape (N,), optional
The x-values of the measured data. Default is None, which will create an array from -1 to 1 with N points.
- **window_kwargs
Values for setting the half window used for the morphology operations. Items include:
- 'increment': int
The step size for iterating half windows. Default is 1.
- 'max_hits': int
The number of consecutive half windows that must produce the same morphological opening before accepting the half window as the optimum value. Default is 1.
- 'window_tol': float
The tolerance value for considering two morphological openings as equivalent. Default is 1e-6.
- 'max_half_window': int
The maximum allowable window size. If None (default), will be set to (len(data) - 1) / 2.
- 'min_half_window': int
The minimum half-window size. If None (default), will be set to 1.
- Returns:
- baseline : numpy.ndarray, shape (N,)
The calculated baseline.
- params : dict
A dictionary with the following items:
- 'weights': numpy.ndarray, shape (N,)
The weight array used for fitting the data.
- 'half_window': int
The half window used for the morphological calculations.
- Raises:
- ValueError
Raised if half_window is < 1, if lam or lam_smooth is <= 0, or if p is not between 0 and 1.
Notes
The optimal opening is calculated as the element-wise minimum of the opening and the average of the erosion and dilation of the opening. The reference used the erosion and dilation of the smoothed data, rather than the opening, which tends to overestimate the baseline.
Rather than setting knots at the intersection points of the optimal opening and the smoothed data as described in the reference, weights are assigned to 1 - p at the intersection points and p elsewhere. This simplifies the penalized spline calculation by allowing the use of equally spaced knots, but should otherwise give similar results as the reference algorithm.
References
Gonzalez-Vidal, J., et al. Automatic morphology-based cubic p-spline fitting methodology for smoothing and baseline-removal of Raman spectra. Journal of Raman Spectroscopy. 2017, 48(6), 878-883.
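The optimal opening described in the Notes for mpspline (the element-wise minimum of the opening and the average of the erosion and dilation of the opening) can be sketched in pure Python as below. The function names are hypothetical, and truncating the windows at the edges is an assumption made for brevity, so the values at the edges may differ from what the library computes.

```python
def moving_min(values, half_window):
    # Erosion with windows truncated at the edges (assumption).
    return [min(values[max(0, i - half_window):i + half_window + 1])
            for i in range(len(values))]


def moving_max(values, half_window):
    # Dilation with windows truncated at the edges (assumption).
    return [max(values[max(0, i - half_window):i + half_window + 1])
            for i in range(len(values))]


def opening(values, half_window):
    # Erosion followed by dilation.
    return moving_max(moving_min(values, half_window), half_window)


def robust_opening(values, half_window):
    opened = opening(values, half_window)
    eroded = moving_min(opened, half_window)
    dilated = moving_max(opened, half_window)
    # Element-wise minimum of the opening and the erosion/dilation average.
    return [min(o, 0.5 * (e + d))
            for o, e, d in zip(opened, eroded, dilated)]


ramp = [1, 2, 3, 4, 5]
print(opening(ramp, 1))
print(robust_opening(ramp, 1))
```

Because of the element-wise minimum, the result never exceeds the plain opening at any point.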
- pybaselines.morphological.mwmv(data, half_window=None, smooth_half_window=None, pad_kwargs=None, x_data=None, **window_kwargs)
Moving window minimum value (MWMV) baseline.
- Parameters:
- data : array-like, shape (N,)
The y-values of the measured data, with N data points.
- half_window : int, optional
The half-window used for the morphology functions. If a value is input, then that value will be used. Default is None, which will optimize the half-window size using optimize_window() and window_kwargs.
- smooth_half_window : int, optional
The half-window to use for smoothing the data after performing the morphological operation. Default is None, which will use the same value as used for the morphological operation.
- pad_kwargs : dict, optional
A dictionary of keyword arguments to pass to pad_edges() for padding the edges of the data to prevent edge effects from the moving average.
- x_data : array-like, optional
The x-values. Not used by this function, but input is allowed for consistency with other functions.
- **window_kwargs
Values for setting the half window used for the morphology operations. Items include:
- 'increment': int
The step size for iterating half windows. Default is 1.
- 'max_hits': int
The number of consecutive half windows that must produce the same morphological opening before accepting the half window as the optimum value. Default is 1.
- 'window_tol': float
The tolerance value for considering two morphological openings as equivalent. Default is 1e-6.
- 'max_half_window': int
The maximum allowable window size. If None (default), will be set to (len(data) - 1) / 2.
- 'min_half_window': int
The minimum half-window size. If None (default), will be set to 1.
- Returns:
- baseline : numpy.ndarray, shape (N,)
The calculated baseline.
- params : dict
A dictionary with the following items:
- 'half_window': int
The half window used for the morphological calculations.
Notes
Performs poorly when the baseline is rapidly changing.
References
Yaroshchyk, P., et al. Automatic correction of continuum background in Laser-induced Breakdown Spectroscopy using a model-free algorithm. Spectrochimica Acta Part B, 2014, 99, 138-149.
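As a rough illustration of the moving window minimum idea, the sketch below takes a moving minimum and then smooths it with a moving average, using the same half window for both when smooth_half_window is None. This is an assumption-laden sketch rather than the library's implementation: the function names are hypothetical, and windows are truncated at the edges instead of padding the data with pad_edges().

```python
def moving_min(values, half_window):
    # Minimum over a window truncated at the edges (assumption).
    return [min(values[max(0, i - half_window):i + half_window + 1])
            for i in range(len(values))]


def moving_average(values, half_window):
    # Mean over a window truncated at the edges (assumption).
    averaged = []
    for i in range(len(values)):
        window = values[max(0, i - half_window):i + half_window + 1]
        averaged.append(sum(window) / len(window))
    return averaged


def mwmv_sketch(data, half_window, smooth_half_window=None):
    if smooth_half_window is None:
        # Reuse the morphological half window for smoothing.
        smooth_half_window = half_window
    return moving_average(moving_min(data, half_window), smooth_half_window)


data = [3, 1, 9, 1, 3]
print(mwmv_sketch(data, half_window=1))
```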
- pybaselines.morphological.rolling_ball(data, half_window=None, smooth_half_window=None, pad_kwargs=None, x_data=None, **window_kwargs)
The rolling ball baseline algorithm.
Applies a minimum and then maximum moving window, and subsequently smooths the result, giving a baseline that resembles rolling a ball across the data.
- Parameters:
- data : array-like, shape (N,)
The y-values of the measured data, with N data points.
- half_window : int, optional
The half-window used for the morphology functions. If a value is input, then that value will be used. Default is None, which will optimize the half-window size using optimize_window() and window_kwargs.
- smooth_half_window : int, optional
The half-window to use for smoothing the data after performing the morphological operation. Default is None, which will use the same value as used for the morphological operation.
- pad_kwargs : dict, optional
A dictionary of keyword arguments to pass to pad_edges() for padding the edges of the data to prevent edge effects from the moving average.
- x_data : array-like, optional
The x-values. Not used by this function, but input is allowed for consistency with other functions.
- **window_kwargs
Values for setting the half window used for the morphology operations. Items include:
- 'increment': int
The step size for iterating half windows. Default is 1.
- 'max_hits': int
The number of consecutive half windows that must produce the same morphological opening before accepting the half window as the optimum value. Default is 1.
- 'window_tol': float
The tolerance value for considering two morphological openings as equivalent. Default is 1e-6.
- 'max_half_window': int
The maximum allowable window size. If None (default), will be set to (len(data) - 1) / 2.
- 'min_half_window': int
The minimum half-window size. If None (default), will be set to 1.
- Returns:
- baseline : numpy.ndarray, shape (N,)
The calculated baseline.
- params : dict
A dictionary with the following items:
- 'half_window': int
The half window used for the morphological calculations.
References
Kneen, M.A., et al. Algorithm for fitting XRF, SEM and PIXE X-ray spectra backgrounds. Nuclear Instruments and Methods in Physics Research B, 1996, 109, 209-213.
Liland, K., et al. Optimal Choice of Baseline Correction for Multivariate Calibration of Spectra. Applied Spectroscopy, 2010, 64(9), 1007-1016.
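The three steps named in the rolling_ball description (moving minimum, then moving maximum, then smoothing) can be sketched as below. The function names are hypothetical, and the edge handling (windows truncated at the edges rather than padded) is an assumption, so this is not a drop-in replacement for the library function.

```python
def moving_min(values, half_window):
    return [min(values[max(0, i - half_window):i + half_window + 1])
            for i in range(len(values))]


def moving_max(values, half_window):
    return [max(values[max(0, i - half_window):i + half_window + 1])
            for i in range(len(values))]


def moving_average(values, half_window):
    averaged = []
    for i in range(len(values)):
        window = values[max(0, i - half_window):i + half_window + 1]
        averaged.append(sum(window) / len(window))
    return averaged


def rolling_ball_sketch(data, half_window, smooth_half_window=None):
    if smooth_half_window is None:
        smooth_half_window = half_window
    lowered = moving_min(data, half_window)      # step 1: moving minimum
    raised = moving_max(lowered, half_window)    # step 2: moving maximum
    return moving_average(raised, smooth_half_window)  # step 3: smoothing


data = [2, 2, 2, 8, 2, 2, 2]
print(rolling_ball_sketch(data, half_window=1))
```

The single-point peak is narrower than the window, so it is removed by the minimum step and the flat level of 2 is recovered.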
- pybaselines.morphological.tophat(data, half_window=None, x_data=None, **window_kwargs)
Estimates the baseline using a top-hat transformation (morphological opening).
- Parameters:
- data : array-like, shape (N,)
The y-values of the measured data, with N data points.
- half_window : int, optional
The half-window used for the morphological opening. If a value is input, then that value will be used. Default is None, which will optimize the half-window size using optimize_window() and window_kwargs.
- x_data : array-like, optional
The x-values. Not used by this function, but input is allowed for consistency with other functions.
- **window_kwargs
Values for setting the half window used for the morphology operations. Items include:
- 'increment': int
The step size for iterating half windows. Default is 1.
- 'max_hits': int
The number of consecutive half windows that must produce the same morphological opening before accepting the half window as the optimum value. Default is 1.
- 'window_tol': float
The tolerance value for considering two morphological openings as equivalent. Default is 1e-6.
- 'max_half_window': int
The maximum allowable window size. If None (default), will be set to (len(data) - 1) / 2.
- 'min_half_window': int
The minimum half-window size. If None (default), will be set to 1.
- Returns:
- baseline : numpy.ndarray, shape (N,)
The calculated baseline.
- params : dict
A dictionary with the following items:
- 'half_window': int
The half window used for the morphological calculations.
Notes
The actual top-hat transformation is defined as data - opening(data), where opening is the morphological opening operation. This function, however, returns opening(data), since that is technically the baseline defined by the operation.
References
Perez-Pueyo, R., et al. Morphology-Based Automated Baseline Removal for Raman Spectra of Artistic Pigments. Applied Spectroscopy, 2010, 64, 595-600.
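The distinction drawn in the tophat Notes, between the opening (returned as the baseline) and the top-hat transformation data - opening(data), can be illustrated with a small pure-Python sketch. The function names are hypothetical, and the opening here truncates its windows at the edges, which is an assumption made for the example.

```python
def moving_min(values, half_window):
    return [min(values[max(0, i - half_window):i + half_window + 1])
            for i in range(len(values))]


def moving_max(values, half_window):
    return [max(values[max(0, i - half_window):i + half_window + 1])
            for i in range(len(values))]


def opening(values, half_window):
    # Erosion (moving minimum) followed by dilation (moving maximum).
    return moving_max(moving_min(values, half_window), half_window)


def tophat_split(data, half_window):
    """Return (baseline, top_hat), where top_hat = data - opening(data)."""
    baseline = opening(data, half_window)
    top_hat = [y - b for y, b in zip(data, baseline)]
    return baseline, top_hat


data = [1, 1, 6, 1, 1]
baseline, corrected = tophat_split(data, half_window=1)
print(baseline)
print(corrected)
```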