
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "generated/examples/general/plot_reuse_Baseline.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_generated_examples_general_plot_reuse_Baseline.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_generated_examples_general_plot_reuse_Baseline.py:


Fitting Multiple Datasets
-------------------------

When fitting multiple datasets that all share the same independent variable,
pybaselines can save time by reusing the same :class:`~.Baseline` object, so that
some of the computationally heavy setup is performed only once. For example,
:doc:`polynomial methods <../../../algorithms/polynomial>` will compute the
Vandermonde matrix, and potentially its pseudoinverse, only once. Likewise,
:doc:`spline methods <../../../algorithms/spline>` will only have to compute the
spline basis matrix once. Note that this only applies if the same non-data
parameters (e.g. ``poly_order``, ``lam``, etc.) are used for each fit.

This example explores the efficiency gained by reusing the same ``Baseline`` object
when fitting multiple datasets with different types of algorithms.

.. GENERATED FROM PYTHON SOURCE LINES 19-45

.. code-block:: Python


    from collections import defaultdict
    from time import perf_counter

    import matplotlib.pyplot as plt
    import numpy as np
    from pybaselines import Baseline
    from pybaselines.utils import gaussian


    num_points = 1000  # number of data points in one set of data
    num_fits = 1000  # equivalent to the number of data in a dataset
    x = np.linspace(0, 1000, num_points)
    signal = (
        + gaussian(x, 6, 150, 5)
        + gaussian(x, 8, 350, 11)
        + gaussian(x, 6, 550, 6)
        + gaussian(x, 13, 700, 8)
        + gaussian(x, 9, 880, 7)
    )
    baseline = 5 + 10 * np.exp(-x / 600) + gaussian(x, 15, 1000, 400)
    noise = np.random.default_rng(0).normal(0, 0.1, len(x))
    y = signal + baseline + noise


.. GENERATED FROM PYTHON SOURCE LINES 47-51

Six different methods will be timed: the polynomial method
:meth:`~.Baseline.penalized_poly`, the spline method :meth:`~.Baseline.mixture_model`,
the Whittaker smoothing method :meth:`~.Baseline.iarpls`, the morphological method
:meth:`~.Baseline.mor`, the smoothing method :meth:`~.Baseline.ria`, and the
classification method :meth:`~.Baseline.std_distribution`.

.. GENERATED FROM PYTHON SOURCE LINES 51-82

.. code-block:: Python


    methods = (
        ('penalized_poly', {'poly_order': 4}),
        ('mixture_model', {'lam': 1e5}),
        ('iarpls', {'lam': 1e5}),
        ('mor', {'half_window': 30}),
        ('ria', {'half_window': 20}),
        ('std_distribution', {'half_window': 25})
    )

    plt.plot(x, y)
    timings = defaultdict(list)
    for method, kwargs in methods:
        baseline_fitter = Baseline(x_data=x, check_finite=False, assume_sorted=True)
        for reuse_object in (True, False):
            for i in range(num_fits):
                if reuse_object:
                    func = getattr(baseline_fitter, method)
                else:
                    func = getattr(Baseline(x, check_finite=False, assume_sorted=True), method)
                t0 = perf_counter()
                calc_baseline = func(y, **kwargs)[0]
                t1 = perf_counter()
                if i == 0 and reuse_object:  # only plot once per algorithm
                    plt.plot(x, calc_baseline, label=method)
                elif i > 0:
                    # only add timings after first call to allow for any necessary compilation
                    timings[f'{method}_{reuse_object}'].append(t1 - t0)

    plt.legend()
    plt.tight_layout()




.. image-sg:: /generated/examples/general/images/sphx_glr_plot_reuse_Baseline_001.png
   :alt: plot reuse Baseline
   :srcset: /generated/examples/general/images/sphx_glr_plot_reuse_Baseline_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 83-87

The total times for each method when creating a new ``Baseline`` object for each call
and when reusing the same ``Baseline`` object are plotted below, along with the
relative time reduction from reusing the same ``Baseline`` object. Note that time
reductions within +/-5% can be considered negligible.

.. GENERATED FROM PYTHON SOURCE LINES 87-114

.. code-block:: Python


    fig, (ax1, ax2) = plt.subplots(nrows=2, sharex=True)
    for i, (method, _) in enumerate(methods):
        reuse = sum(timings[f'{method}_True'])
        new = sum(timings[f'{method}_False'])
        if i == 0:
            new_label = 'New'
            reuse_label = 'Reuse'
        else:
            new_label = ''
            reuse_label = ''
        plt.plot()
        ax1.bar(i - 0.2, new, width=0.4, label=new_label, color='c')
        ax1.bar(i + 0.2, reuse, width=0.4, label=reuse_label, color='m')
        speedup = 100 * (new - reuse) / new
        bar = ax2.bar(i, speedup)
        ax2.bar_label(bar, fmt='{:.1f}%')

    ax1.legend()
    ax2.set_xticks(np.arange(len(methods)), [method for method, _ in methods], rotation=15)
    ax2.set_ylim(ax2.get_ylim() + np.array([-5, 5]))  # add space for the bar labels
    ax1.set_ylabel('Total time (s)')
    ax2.set_ylabel('Relative Time Reduction (%)')
    fig.tight_layout()

    plt.show()




.. image-sg:: /generated/examples/general/images/sphx_glr_plot_reuse_Baseline_002.png
   :alt: plot reuse Baseline
   :srcset: /generated/examples/general/images/sphx_glr_plot_reuse_Baseline_002.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 115-120

As expected, the polynomial and spline methods see a significant time reduction
from reusing the same ``Baseline`` object to fit the entire dataset, while the
other methods see no meaningful difference. Note that these results are a
generalization; algorithms that are more computationally intensive will benefit
less from reuse, since a smaller share of the total time is spent on setup and
more on the algorithm itself.


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 15.032 seconds)


.. _sphx_glr_download_generated_examples_general_plot_reuse_Baseline.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_reuse_Baseline.ipynb <plot_reuse_Baseline.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_reuse_Baseline.py <plot_reuse_Baseline.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_reuse_Baseline.zip <plot_reuse_Baseline.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
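
The setup cost that reuse avoids for polynomial methods can also be illustrated
outside of pybaselines. The following is a minimal NumPy sketch under stated
assumptions, not pybaselines' actual implementation: the helper names ``fit_fresh``
and ``make_reusable_fit`` and the demo data are hypothetical, and a plain
least-squares polynomial fit stands in for a baseline algorithm. It computes the
Vandermonde matrix and its pseudoinverse once and then applies them to every
dataset that shares the same x-values.

.. code-block:: Python

    import numpy as np


    def fit_fresh(x, y, poly_order):
        """Fit one dataset, rebuilding the Vandermonde matrix and pseudoinverse."""
        vander = np.polynomial.polynomial.polyvander(x, poly_order)
        pseudo_inverse = np.linalg.pinv(vander)
        return vander @ (pseudo_inverse @ y)


    def make_reusable_fit(x, poly_order):
        """Do the x-dependent setup once and return a function that fits any y."""
        vander = np.polynomial.polynomial.polyvander(x, poly_order)
        pseudo_inverse = np.linalg.pinv(vander)

        def fit(y):
            # only two matrix-vector products remain for each dataset
            return vander @ (pseudo_inverse @ y)

        return fit


    # hypothetical data: 50 noisy cubic curves sharing the same x-values
    rng = np.random.default_rng(0)
    x_demo = np.linspace(-1, 1, 200)
    datasets = [
        1 + 2 * x_demo - 3 * x_demo**3 + rng.normal(0, 0.1, x_demo.size)
        for _ in range(50)
    ]

    reusable_fit = make_reusable_fit(x_demo, poly_order=3)
    fits_reused = [reusable_fit(y_demo) for y_demo in datasets]
    fits_fresh = [fit_fresh(x_demo, y_demo, 3) for y_demo in datasets]

    # both approaches give the same fits; only the amount of repeated work differs
    print(np.allclose(fits_reused, fits_fresh))

With pybaselines itself, the equivalent is to create one ``Baseline`` object with
the shared ``x_data`` and call the same method with the same parameters for each
dataset, as done in the timing loop of this example.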