pybaselines.Baseline.fastchrom
- Baseline.fastchrom(data, half_window=None, threshold=None, min_fwhm=None, interp_half_window=5, smooth_half_window=None, weights=None, max_iter=100, min_length=2, pad_kwargs=None, **kwargs)
Identifies baseline segments by thresholding the rolling standard deviation distribution.
Baseline points are identified as any point where the rolling standard deviation is less than the specified threshold. Peak regions are iteratively interpolated until the baseline is below the data.
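The thresholding step described above can be sketched with plain NumPy. This is a minimal illustration and not the library's implementation: the helper name `rolling_std_mask`, the reflect padding at the edges, and the synthetic test signal are all assumptions made for the example. The 15th-percentile cutoff mirrors the documented default for `threshold`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_std_mask(data, half_window, percentile=15.0):
    """Flag points whose rolling standard deviation is below a percentile cutoff.

    Hypothetical helper for illustration only; not part of pybaselines.
    """
    data = np.asarray(data, dtype=float)
    # Reflect-pad so every point gets a full window (padding mode is an assumption)
    padded = np.pad(data, half_window, mode="reflect")
    windows = sliding_window_view(padded, 2 * half_window + 1)
    rolling_std = windows.std(axis=1)
    # Points with a low local standard deviation are treated as baseline
    threshold = np.percentile(rolling_std, percentile)
    return rolling_std < threshold, rolling_std


# Synthetic data: one Gaussian peak centered at x = 50 plus noise
x = np.linspace(0, 100, 1000)
rng = np.random.default_rng(0)
y = 10 * np.exp(-((x - 50) ** 2) / (2 * 2.0**2)) + 0.1 * rng.normal(size=x.size)

mask, rolling_std = rolling_std_mask(y, half_window=20)
```

In this sketch, `mask` is True where the local variation is low (candidate baseline points) and False around the peak, where the rolling standard deviation is large.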
- Parameters:
- data
array_like, shape (N,) The y-values of the measured data, with N data points.
- half_window
int, optional The half-window to use for the rolling standard deviation calculation. Should be approximately equal to the full-width-at-half-maximum of the peaks or features in the data. Default is None, which will use half of the value from optimize_window(). That value is not always a good choice, but it at least scales with the number of data points and gives a starting point for tuning the parameter.
- threshold
float or callable, optional All points in the rolling standard deviation below threshold will be considered as baseline. Higher values will assign more points as baseline. Default is None, which will set the threshold as the 15th percentile of the rolling standard deviation. If threshold is callable, it should take the rolling standard deviation as the only argument and output a float.
- min_fwhm
int, optional After creating the interpolated baseline, any region where the baseline is greater than the data for min_fwhm consecutive points will have an additional baseline point added and will be reinterpolated. Should be set to approximately the index-based full-width-at-half-maximum of the smallest peak. Default is None, which uses 2 * half_window.
- interp_half_window
int, optional When interpolating between baseline segments, will use the average of data[i-interp_half_window:i+interp_half_window+1], where i is the index of the peak start or end, to fit the linear segment. Default is 5.
- smooth_half_window
int, optional The half window to use for smoothing the interpolated baseline with a moving average. Default is None, which will use half_window. Set to 0 to not smooth the baseline.
- weights
array_like, shape (N,), optional The weighting array, used to override the function's baseline identification to designate peak points. Only elements with 0 or False values will have an effect; all non-zero values are considered baseline points. If None (default), then will be an array with size equal to N and all values set to 1.
- max_iter
int, optional The maximum number of iterations to attempt to fill in regions where the baseline is greater than the input data. Default is 100.
- min_length
int, optional Any region of consecutive baseline points shorter than min_length is considered to be a false positive, and all points in the region are converted to peak points. A higher min_length ensures fewer points are falsely assigned as baseline points. Default is 2, which only removes lone baseline points.
- pad_kwargs
dict, optional A dictionary of keyword arguments to pass to pad_edges() for padding the edges of the data to prevent edge effects from smoothing. Default is None.
- **kwargs
Deprecated since version 1.2.0: Passing additional keyword arguments is deprecated and will be removed in version 1.4.0. Pass keyword arguments using pad_kwargs.
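To make the roles of threshold, weights, and min_length more concrete, the following sketch shows one way these parameters could act on a baseline mask, based only on the descriptions above. The helpers `resolve_threshold` and `refine_mask` are hypothetical names introduced for illustration; they are not pybaselines functions, and the library's internals may differ.

```python
import numpy as np


def resolve_threshold(rolling_std, threshold=None):
    """Turn the threshold argument into a float (hypothetical helper)."""
    if threshold is None:
        # Documented default: 15th percentile of the rolling standard deviation
        return np.percentile(rolling_std, 15)
    if callable(threshold):
        # A callable receives the rolling standard deviation and returns a float
        return threshold(rolling_std)
    return threshold


def refine_mask(mask, weights=None, min_length=2):
    """Apply a weights override, then drop short baseline runs (hypothetical helper)."""
    mask = np.asarray(mask, dtype=bool).copy()
    if weights is not None:
        # Only zero/False weights have an effect: they force peak points
        mask &= np.asarray(weights).astype(bool)
    # Find the start (inclusive) and end (exclusive) of each run of True values
    padded = np.concatenate(([False], mask, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = edges[::2], edges[1::2]
    for start, end in zip(starts, ends):
        if end - start < min_length:
            # Runs shorter than min_length are treated as false positives
            mask[start:end] = False
    return mask


initial = np.array([True, False, True, True, True, False, True, True])
weights = np.array([1, 1, 1, 1, 1, 1, 0, 1])
refined = refine_mask(initial, weights=weights, min_length=2)
# refined -> [False, False, True, True, True, False, False, False]
```

In the example, the zero weight at index 6 splits the final run into a single point, so both lone baseline points (indices 0 and 7) are converted to peak points while the three-point run is kept.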
- Returns:
- baseline
numpy.ndarray, shape (N,) The calculated baseline.
- params
dict A dictionary with the following items:
- 'mask': numpy.ndarray, shape (N,)
The boolean array designating baseline points as True and peak points as False.
Notes
Only covers the baseline correction from FastChrom, not its peak finding and peak grouping capabilities.
References
Johnsen, L., et al. An automated method for baseline correction, peak finding and peak grouping in chromatographic data. Analyst. 2013, 138, 3502-3511.