Zfit

Latest version: v0.20.3


2.1

Major Features and Improvements
-------------------------------

- Fixed the comparison in caching the graph (implementation detail) that leads to an error.

0.20.3
========================

Bug fixes and small changes
---------------------------
- consistent behavior in loss: simple loss can take a gradient and hesse function and the default base loss provides fallbacks that work correctly between ``value_gradient`` and ``gradient``. This may matter if you have implemented a custom loss and should fix any issues with it.
- multiprocessing would get stuck due to an `upstream bug in TensorFlow <https://github.com/tensorflow/tensorflow/issues/66115>`_. Working around it by disabling an unused piece of code.
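
The fallback idea between ``value_gradient`` and ``gradient`` described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not zfit's actual loss code: ``ToyBaseLoss`` and its subclasses are hypothetical names, parameters are plain lists of floats, and the finite-difference step is an arbitrary choice.

```python
class ToyBaseLoss:
    """Hypothetical base class showing mutual fallbacks (not zfit's implementation)."""

    def value(self, x):
        raise NotImplementedError

    def gradient(self, x):
        # If the subclass provides its own value_gradient, reuse its gradient part.
        if type(self).value_gradient is not ToyBaseLoss.value_gradient:
            return self.value_gradient(x)[1]
        # Otherwise fall back to central finite differences on value().
        eps = 1e-6
        grad = []
        for i in range(len(x)):
            up = list(x)
            down = list(x)
            up[i] += eps
            down[i] -= eps
            grad.append((self.value(up) - self.value(down)) / (2 * eps))
        return grad

    def value_gradient(self, x):
        # Default: combine the two separate methods.
        return self.value(x), self.gradient(x)


class OnlyGradient(ToyBaseLoss):
    """Custom loss that implements value and gradient only."""

    def value(self, x):
        return sum(xi ** 2 for xi in x)

    def gradient(self, x):
        return [2 * xi for xi in x]


class OnlyValueGradient(ToyBaseLoss):
    """Custom loss that implements value and value_gradient only."""

    def value(self, x):
        return sum(xi ** 2 for xi in x)

    def value_gradient(self, x):
        return self.value(x), [2 * xi for xi in x]


class OnlyValue(ToyBaseLoss):
    """Custom loss that implements value only; gradients are numerical."""

    def value(self, x):
        return sum(xi ** 2 for xi in x)
```

Whichever of the two methods a custom loss implements, the other one stays usable and both agree with each other. The check on the overridden method is what prevents the two defaults from calling each other forever.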

Thanks
------
- acampoverde for finding the bug in the multiprocessing

0.20.2
========================

Two small bugfixes

- fix backwards incompatible change of sampler
- detect if a RegularBinning has been transformed, raise error.

0.20.1
========================

Major Features and Improvements
-------------------------------
- fix dumping and add convenience wrapper ``zfit.dill`` to dump and load objects with dill (a more powerful pickle). This way, any zfit object can be saved and loaded, such as ``FitResult`` that contains all other important objects to recreate the fit.
- improved performance for numerical gradient calculation, also fixing a minor numerical issue.

Bug fixes and small changes
---------------------------
- running binned fits without a graph could deadlock; this is fixed.

0.20.0
========================

Complete overhaul of zfit with a focus on usability and a variety of new pdfs!


Major Features and Improvements
-------------------------------
- Parameter behavior has changed: multiple parameters with the same name can now coexist!
  The ``NameAlreadyTakenError`` has been successfully removed (yay!). The new behavior only enforces that
  names and matching parameters *within a function/PDF/loss* are unique, as otherwise inconsistent expectations appear (for the full discussion on this, see `here <https://github.com/zfit/zfit/discussions/342>`_).
- ``Space`` and limits have a complete overhaul in front of them. In short, these overcomplicated objects get simplified and the limits become more usable in terms of dimensions. The full discussion and changes can be `found here <https://github.com/zfit/zfit/discussions/533>`_.
- add an unbinned ``Sampler`` to the public namespace under ``zfit.data.Sampler``: this object is returned by the ``create_sampler`` method and allows resampling from a function without recreating the compiled function, i.e. the loss. It has an additional method ``update_data`` to update the data without recompiling the loss and can be created from a sample only. Useful to have a custom dataset in toys.
- allow using pandas DataFrames as input where zfit Data objects are expected
- Methods of PDFs and loss functions that depend on parameters now take the value of a parameter explicitly as arguments, as a mapping of str (parameter name) to value.
- Python 3.12 support
- add ``GeneralizedCB`` PDF which is similar to the ``DoubleCB`` PDF but with different standard deviations for the left and right side.
- Added functor for PDF caching ``CachedPDF``: the ``pdf`` and ``integrate`` PDF methods can now be cached
- add ``faddeeva_humlicek`` function under the ``zfit.z.numpy`` namespace. This is an implementation of the Faddeeva function, combining Humlicek's rational approximations according to Humlicek (JQSRT, 1979) and Humlicek (JQSRT, 1982).
- add ``Voigt`` profile PDF which is a convolution of a Gaussian and a Cauchy distribution.
- add ``TruncatedPDF`` that allows truncating in one or multiple ranges (replaces "MultipleLimits" and "MultiSpace")
- add ``LogNormal`` PDF, a log-normal distribution, which is a normal distribution of the logarithm of the variable.
- add ``ChiSquared`` PDF, the standard chi2 distribution, taken from `tensorflow-probability implementation <https://www.tensorflow.org/probability/api_docs/python/tfp/distributions/Chi2>`_.
- add ``StudentT`` PDF, the standard Student's t distribution, taken from `tensorflow-probability implementation <https://www.tensorflow.org/probability/api_docs/python/tfp/distributions/StudentT>`_.
- add ``GaussExpTail`` and ``GeneralizedGaussExpTail`` PDFs, which are a Gaussian with an exponential tail on one side and a Gaussian with different sigmas on each side and different exponential tails on each side respectively.
- add ``QGauss`` PDF, a distribution that arises from the maximization of the Tsallis entropy under appropriate constraints, see `here <https://en.wikipedia.org/wiki/Q-Gaussian_distribution>`_.
- add ``BifurGauss`` PDF, a Gaussian distribution with different sigmas on each side of the mean.
- add ``Bernstein`` PDF, which is a PDF defined by a linear combination of Bernstein polynomials given their coefficients.
- add ``Gamma`` PDF, the Gamma distribution.
- ``Data`` now has a ``with_weights`` method that returns a new data object with different weights and an improved ``with_obs`` that allows setting obs with new limits. These replace the ``set_weights`` and ``set_data_range`` methods for a more functional approach.
- add ``label`` to different objects (PDF, Data, etc.) that allows giving a human-readable name to the object. This is used in the plotting and can be used to identify objects.
  Notably, Parameters have a label that can be arbitrary. ``Space`` has one label for each observable if the space is a product of spaces. ``Space.label`` is a string and only possible for one-dimensional spaces, while ``Space.labels`` is a list of strings and can be used for any space, one- or multi-dimensional.
- add ``zfit.data.concat(...)`` to concatenate multiple data objects into one along the index or along the observables. Similar to ``pd.concat``.
- PDFs now have a ``to_truncated`` method that creates a truncated version of the PDF, possibly with different and multiple limits. This makes it easy to create a PDF with disjoint limits.
- ``Data`` and ``PDF`` that take ``obs`` in the initialization can now also take binned observables, i.e. a ``zfit.Space`` with ``binning=...``, and will return a binned version of the object (``zfit.data.BinnedData`` or ``zfit.pdf.BinnedFromUnbinned``, where the latter is a generic wrapper). This is equivalent to calling ``to_binned`` on the objects.
- ``zfit.Data`` can be instantiated directly with most data types, such as numpy arrays, pandas DataFrames etc. instead of using the dedicated constructors ``from_numpy``, ``from_pandas`` etc.
  The constructors may still provide additional functionality, but overall, the switch should be seamless.
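
To make the ``BifurGauss`` entry in the list above concrete, here is a minimal standard-library sketch of a Gaussian with different sigmas on each side of the mean. It is not zfit's implementation: the function and argument names are hypothetical, and the shape is normalized over the whole real line, whereas a zfit PDF is normalized over its observable range.

```python
import math


def bifur_gauss_pdf(x, mu, sigma_left, sigma_right):
    """Gaussian with sigma_left below mu and sigma_right above (illustrative only)."""
    sigma = sigma_left if x < mu else sigma_right
    # Each half contributes sigma * sqrt(2 * pi) / 2 to the integral,
    # so the total normalization is sqrt(2 / pi) / (sigma_left + sigma_right).
    norm = math.sqrt(2.0 / math.pi) / (sigma_left + sigma_right)
    return norm * math.exp(-0.5 * ((x - mu) / sigma) ** 2)


def integrate_midpoint(func, lower, upper, steps=20000):
    """Simple midpoint-rule integration, good enough for a sanity check."""
    width = (upper - lower) / steps
    return sum(func(lower + (i + 0.5) * width) for i in range(steps)) * width


# The density is continuous at mu and integrates to one over a wide enough range.
total = integrate_midpoint(lambda x: bifur_gauss_pdf(x, 0.5, 1.0, 2.0), -15.0, 25.0)
```

One standard deviation to the left and one to the right of the mean give the same density value, even though the two points sit at different distances from ``mu``.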


Breaking changes
------------------
This release contains multiple "breaking changes"; however, the vast majority, if not all, apply only to edge cases and undocumented functions.

- a few arguments are now keyword-only arguments. This *can* break existing code if the arguments were given as positional arguments. Just use the appropriate keyword arguments instead.
  (Example: instead of using ``zfit.Space(obs, limits)`` use ``zfit.Space(obs, limits=limits)``).
  This was introduced to make the API more robust and to avoid errors due to the order of arguments, with a few new ways of creating objects.
- ``Data.from_root``: deprecated arguments ``branches`` and ``branch_aliases`` have been removed. Use ``obs`` and ``obs_aliases`` instead.
- ``NameAlreadyTakenError`` was removed, see above for the new behavior. This should not have an effect on any existing code *except if you relied on the error being thrown*.
- Data objects had an intrinsic, TensorFlow V1 legacy behavior: they were actually cut when the data was *retrieved*. This is now changed and the data is cut when it is created. This should not have any impact on existing code and just improve runtime and memory usage.
- Partial integration used to use some broadcasting tricks that could potentially fail. It now uses a dynamic while loop that *could* be slower but works for arbitrary PDFs. This should not have any impact on existing code and just improve stability (but technically, the data given to the PDF *if doing partial integration* is now "different", in the sense that it's now not different anymore from any other call).
- if a ``tf.Variable`` was used to store the number of sampled values in a sampler, it was possible to change the value of that variable to change the number of samples drawn. This is no longer possible; the number of samples should be given as the argument ``n`` to the ``resample`` method, as has been possible for a long time.
- ``create_sampler`` has a breaking change for ``fixed_params``: when the argument was set to False, any change in the parameters would be reflected when resampling.
  This highly state-based behavior was confusing and is now removed. The argument is now called ``params``
  and behaves as expected: the sampler will remember the parameters at the time of creation,
  possibly updated with ``params``, and will not change anymore. To sample from a different set of parameters,
  the params have to be passed to the ``resample`` method *explicitly*.
- the default names for the results of ``hesse`` and ``errors`` have now been changed to ``hesse`` and ``errors``, respectively.
  This has been deprecated for a while and both names were available for backwards compatibility. The old names are now removed. If you get an error that ``minuit_hesse`` or ``minuit_minos`` is not found, just replace it with ``hesse`` or ``errors``.
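
The new sampler semantics described in the ``create_sampler`` entry above can be illustrated with a small toy class. This is a sketch under stated assumptions and not ``zfit.data.Sampler`` itself: ``ToySampler`` is a hypothetical name, it draws from a plain Gaussian with Python's ``random`` module, and it assumes that values passed to ``resample`` apply to that call only.

```python
import random


class ToySampler:
    """Hypothetical stand-in for a sampler that snapshots its parameters."""

    def __init__(self, params, n, seed=None):
        # Copy the values: later changes to the caller's mapping are ignored.
        self._params = dict(params)
        self._n = n
        self._rng = random.Random(seed)
        self.values = []
        self.resample()

    def resample(self, params=None, n=None):
        # Explicitly passed values override the snapshot for this call only
        # (an assumption of this sketch).
        current = {**self._params, **(params or {})}
        count = self._n if n is None else n
        self.values = [
            self._rng.gauss(current["mu"], current["sigma"]) for _ in range(count)
        ]
        return self.values


shared = {"mu": 0.0, "sigma": 1e-9}
sampler = ToySampler(shared, n=100, seed=1)

shared["mu"] = 50.0                      # has no effect on the sampler
unchanged = sampler.resample()           # still centered at 0
shifted = sampler.resample(params={"mu": 50.0}, n=10)  # explicit override
```

The number of events is likewise an explicit argument ``n`` of ``resample`` and not hidden state that can be changed from outside.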



Deprecations
-------------
- ``result.fminfull`` is deprecated and will be removed in the future. Use ``result.fmin`` instead.
- ``Data.set_data_range`` is deprecated and will be removed in the future. Use ``with_range`` instead.
- ``Space`` has many deprecated methods, such as ``rect_limits`` and quite a few more. The full discussion can be found `here <https://github.com/zfit/zfit/discussions/533>`_.
- ``fixed_params`` in ``create_sampler`` is deprecated and will be removed in the future. Use ``params`` instead.
- ``fixed_params`` attribute of the ``Sampler`` is deprecated and will be removed in the future. Use ``params`` instead.
- ``uncertainties`` in ``GaussianConstraint`` is deprecated and will be removed in the future. Use either explicitly ``sigma`` or ``cov``.
- the ``ComposedParameter`` and ``ComplexParameter`` argument ``value_fn`` is deprecated in favor of the new argument ``func``. Identical behavior.
- ``zfit.run(...)`` is deprecated and will be removed in the future. Simply removing it should work in most cases
  (if an explicit numpy cast, not just array-like, is needed, use ``np.asarray(...)``; usually this is not needed). This function is a relic from the TensorFlow 1.x ``tf.Session`` times and is not needed anymore. We all remember those days well :)

Bug fixes and small changes
---------------------------
- complete overhaul of partial integration that used some broadcasting tricks that could potentially fail. It now uses a dynamic while loop that *could* be slower but works for arbitrary PDFs and no problems should be encountered anymore.
- ``FitResult`` can now be used as a context manager, which will automatically set the values of the parameters to the best fit values and reset them to the original values after the context is left. A new method ``update_params`` allows updating the parameters with the best fit values explicitly.
- ``result.fmin`` now returns the full likelihood, while ``result.fminopt`` returns the optimized likelihood with potential constant subtraction. The latter is mostly used by the minimizer and other libraries. This behavior is consistent with the behavior of other methods in the loss that return by default the full, unoptimized value.
- serialization was only possible for one specific limit (space) of each obs. Multiple, independent limits can now be serialized.
- Increased numerical stability: this was compromised due to some involuntary float32 conversions in TF. This has been fixed.
- arguments ``sigma`` and ``cov`` are now used in ``GaussianConstraint``, both mutually exclusive, to ensure the intent is clear.
- improved hashing and precompilation in loss, works now safely also with samplers.
- seed setting is by default completely randomized. This is a change from the previous behavior where the seed was set to a more deterministic value. Use seeds only for reproducibility and not for real randomness, as some strange correlations between seeds have been observed. To guarantee full randomness, just call ``zfit.run.set_seed()`` without arguments.
- ``zfit.run.set_seed`` now returns the seed that was set. This is useful for reproducibility.
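
The context-manager behavior of ``FitResult`` listed above follows a common Python pattern: set values on entry, restore them on exit. The sketch below shows that pattern with toy classes. ``ToyParameter`` and ``ToyFitResult`` are hypothetical names used for illustration only and do not reflect zfit's internals.

```python
class ToyParameter:
    """Hypothetical parameter holding a name and a mutable value."""

    def __init__(self, name, value):
        self.name = name
        self.value = value


class ToyFitResult:
    """Hypothetical fit result that can temporarily apply its best-fit values."""

    def __init__(self, params, best_values):
        self.params = list(params)
        self.best_values = dict(best_values)  # parameter name -> best-fit value
        self._previous = None

    def update_params(self):
        # Permanently set the parameters to the best-fit values.
        for param in self.params:
            param.value = self.best_values[param.name]
        return self  # returning self allows chaining after a minimize call

    def __enter__(self):
        # Remember the current values, then apply the best-fit values.
        self._previous = {param.name: param.value for param in self.params}
        self.update_params()
        return self

    def __exit__(self, exc_type, exc, tb):
        # Restore the values from before the context was entered.
        for param in self.params:
            param.value = self._previous[param.name]
        return False  # do not swallow exceptions


mu = ToyParameter("mu", 1.0)
result = ToyFitResult([mu], {"mu": 0.3})

with result:
    inside = mu.value   # best-fit value while inside the context
outside = mu.value      # original value restored afterwards
```

Calling ``update_params()`` outside of a ``with`` block keeps the best-fit values in place, which is the explicit counterpart to the temporary context.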

Experimental
------------

- a simple ``plot`` mechanism has been added with ``pdf.plot.plotpdf`` to plot PDFs. It is simple and fully interacts with matplotlib, allowing quick plotting in a more interactive way.
- ``zfit.run.experimental_disable_param_update``: this is an experimental feature that disables the parameter update in a fit, which is currently done whenever ``minimize`` is called. In conjunction with the new method ``update_params()``, this can be used as ``result = minimizer.minimize(...).update_params()`` to keep the current behavior. Alternatively, the context manager of ``FitResult`` achieves the same behavior within a context (``with minimizer.minimize(...) as result: ...``).

Requirement changes
-------------------
- upgrade to TensorFlow 2.16 and TensorFlow Probability 0.24

Thanks
------
- huge thanks to ikrommyd (Iason Krommydas) for the addition of various PDFs, and welcome on board as a new contributor!
- anjabeck for the addition of the ``ChiSquared`` PDF

0.18.2
========================

Hotfix for missing dependency attrs
