Sdv

Latest version: v1.13.1

Page 6 of 10

0.11.0

This release primarily addresses bugs and feature requests related to using constraints for the single-table models.
Users can now enforce scalar comparison with the existing `GreaterThan` constraint and apply 5 new constraints: `OneHotEncoding`, `Positive`, `Negative`, `Between` and `Rounding`.
Additionally, the SDV will now auto-apply constraints for rounding numerical values, and for keeping the data within the observed bounds.
All related user guides are updated with the new functionality.
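As a rough illustration of what these constraints are meant to guarantee, the checks below express each rule as a plain Python predicate over a single row. This is a sketch only: the function names, the row-as-dict layout and the example columns are assumptions made for illustration and are not SDV's API or implementation. `Negative` is omitted because it mirrors `Positive`.

```python
# Hypothetical helpers, not SDV classes: each returns True when a row
# satisfies the rule that the corresponding constraint is named after.

def greater_than(row, low, high, strict=False):
    """`low` and `high` are column names (str) or scalar numbers."""
    low_value = row[low] if isinstance(low, str) else low
    high_value = row[high] if isinstance(high, str) else high
    return high_value > low_value if strict else high_value >= low_value


def between(row, column, low, high):
    """The column value lies inside the closed range [low, high]."""
    return low <= row[column] <= high


def positive(row, columns):
    """Every listed column is strictly greater than zero."""
    return all(row[c] > 0 for c in columns)


def one_hot(row, columns):
    """Exactly one of the listed columns is 1 and the rest are 0."""
    values = [row[c] for c in columns]
    return all(v in (0, 1) for v in values) and sum(values) == 1


def rounded(row, column, digits):
    """The column value has at most `digits` decimal places."""
    return round(row[column], digits) == row[column]


row = {"age": 34, "years_employed": 10, "salary": 52000.25, "is_cat": 0, "is_dog": 1}

print(greater_than(row, low=18, high="age"))            # scalar comparison
print(greater_than(row, low="years_employed", high="age", strict=True))
print(between(row, "age", 40, 60))
print(positive(row, ["age", "salary"]))
print(one_hot(row, ["is_cat", "is_dog"]))
print(rounded(row, "salary", 2))
```

In SDV itself, a constraint is also responsible for making sampled data satisfy the rule, not just for checking it; the predicates above only cover the checking half.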

New Features

* Add OneHotEncoding Constraint - Issue [303](https://github.com/sdv-dev/SDV/issues/303) by fealho
* GreaterThan Constraint should apply to scalars - Issue [410](https://github.com/sdv-dev/SDV/issues/410) by amontanez24
* Improve GreaterThan constraint - Issue [368](https://github.com/sdv-dev/SDV/issues/368) by amontanez24
* Add Non-negative and Positive constraints across multiple columns - Issue [409](https://github.com/sdv-dev/SDV/issues/409) by amontanez24
* Add Between values constraint - Issue [367](https://github.com/sdv-dev/SDV/issues/367) by fealho
* Ensure values fall within the specified range - Issue [423](https://github.com/sdv-dev/SDV/issues/423) by amontanez24
* Add Rounding constraint - Issue [482](https://github.com/sdv-dev/SDV/issues/482) by katxiao
* Add rounding and min/max arguments that are passed down to the NumericalTransformer - Issue [491](https://github.com/sdv-dev/SDV/issues/491) by amontanez24

Bugs Fixed

* GreaterThan constraint between Date columns raises TypeError - Issue [421](https://github.com/sdv-dev/SDV/issues/421) by amontanez24
* GreaterThan constraint's transform strategy fails on columns that are not float - Issue [448](https://github.com/sdv-dev/SDV/issues/448) by amontanez24
* AttributeError on UniqueCombinations constraint with non-strings - Issue [196](https://github.com/sdv-dev/SDV/issues/196) by katxiao
* Use reject sampling to sample missing columns for constraints - Issue [435](https://github.com/sdv-dev/SDV/issues/435) by amontanez24

Documentation Changes

* Ensure privacy metrics are available in the API docs - Issue [458](https://github.com/sdv-dev/SDV/issues/458) by fealho
* Ensure formula constraint is called ColumnFormula everywhere in the docs - Issue [449](https://github.com/sdv-dev/SDV/issues/449) by fealho

0.10.1

This release changes the way we sample conditions to not only group by the conditions passed by the user, but also by the transformed conditions that result from them.

Issues resolved

* Conditionally sampling on variable in constraint should have variety for other variables - Issue [440](https://github.com/sdv-dev/SDV/issues/440) by amontanez24

0.10.0

This release improves the constraint functionality by allowing constraints and conditions
at the same time. Additional changes were made to update tutorials.

Issues resolved

* Not able to use constraints and conditions at the same time - Issue [379](https://github.com/sdv-dev/SDV/issues/379) by amontanez24
* Update benchmarking user guide for reading private datasets - Issue [427](https://github.com/sdv-dev/SDV/issues/427) by katxiao

0.9.1

This release broadens the constraint functionality by allowing for the `ColumnFormula`
constraint to take lambda functions and returned functions as an input for its formula.
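To make the distinction concrete, the sketch below shows the two kinds of callables mentioned here: an inline lambda, and a function returned by another function (a closure). The `apply_formula` helper, the row-as-dict layout and the column names are hypothetical and are not SDV's calling convention; they only illustrate why accepting any callable is more flexible than accepting named functions alone.

```python
def apply_formula(rows, column, formula):
    """Return copies of `rows` with `column` recomputed as `formula(row)`."""
    return [{**row, column: formula(row)} for row in rows]


def make_rate_formula(rate):
    """Build a formula at runtime; the returned function remembers `rate`."""
    def formula(row):
        return row["price"] * rate
    return formula


rows = [
    {"price": 100.0, "cost": 60.0},
    {"price": 80.0, "cost": 50.0},
]

# A lambda defined inline.
with_profit = apply_formula(rows, "profit", lambda row: row["price"] - row["cost"])

# A function returned by another function.
with_tax = apply_formula(rows, "tax", make_rate_formula(0.5))

print(with_profit)
print(with_tax)
```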

It also improves conditional sampling by ensuring that any `id` fields generated by the
model remain unique throughout the sampled data.

The `CTGAN` model was improved by adjusting a default parameter to be more mathematically
correct.

Additional changes were made to improve tutorials as well as fix fragile tests.

Issues resolved

* Tutorials test sometimes fails - Issue [355](https://github.com/sdv-dev/SDV/issues/355) by fealho
* Duplicate IDs when using reject-sampling - Issue [331](https://github.com/sdv-dev/SDV/issues/331) by amontanez24 and csala
* discriminator_decay should be initialized at 1e-6 but it's 0 - Issue [401](https://github.com/sdv-dev/SDV/issues/401) by fealho and YoucefZemmouri
* Tutorial typo - Issue [380](https://github.com/sdv-dev/SDV/issues/380) by fealho
* Request for sdv.constraint.ColumnFormula for a wider range of function - Issue [373](https://github.com/sdv-dev/SDV/issues/373) by amontanez24 and JetfiRex

0.9.0

This release brings new privacy metrics to the evaluation framework, which help to determine
whether the real data could be obtained or deduced from the synthetic samples.
Additionally, the metrics now have a normalized score, which stays between `0` and `1`.

Memory usage when sampling new data has been reduced. There is also a new parameter,
`graceful_reject_sampling`, that controls what happens when reject sampling fails: if it is
set to `True` and not all the requested rows can be generated, the SDV just issues a
warning and returns whatever it was able to generate.
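The behavior described for `graceful_reject_sampling` can be sketched with a generic reject-sampling loop. This is a minimal illustration under stated assumptions, not SDV's implementation: `reject_sample`, `generate`, `is_valid`, `max_tries` and the `graceful` flag are hypothetical names chosen for the example.

```python
import random
import warnings


def reject_sample(generate, is_valid, num_rows, max_tries=100, graceful=False):
    """Draw candidates with `generate()` and keep those passing `is_valid`.

    If fewer than `num_rows` valid rows are found within `max_tries` draws,
    either raise (graceful=False) or warn and return the partial result.
    """
    rows = []
    for _ in range(max_tries):
        if len(rows) >= num_rows:
            break
        candidate = generate()
        if is_valid(candidate):
            rows.append(candidate)

    if len(rows) < num_rows:
        message = f"Only {len(rows)} of {num_rows} valid rows after {max_tries} tries."
        if not graceful:
            raise ValueError(message)
        warnings.warn(message)

    return rows


rng = random.Random(0)

def draw():
    return rng.randint(1, 10)

# Every draw is valid, so all five rows are returned.
full = reject_sample(draw, lambda value: value >= 1, num_rows=5)

# No draw can be valid: with graceful=True this warns and returns an empty list
# instead of raising.
partial = reject_sample(draw, lambda value: value > 10, num_rows=5, graceful=True)
```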

The `Metadata` object can now be visualized using different combinations of `names` and `details`,
each of which can be set to `True` or `False` to control whether the table names and the table
details are displayed. Metadata validation has also been improved: it now reports all the errors
found at the end of the validation instead of only the first one.
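The "report everything at the end" pattern can be sketched as follows. The metadata layout (a dict of tables with `fields` and `primary_key` entries), the two rules checked and the `validate_metadata` name are assumptions made for this example only; they are not taken from SDV's `Metadata` class.

```python
def validate_metadata(metadata):
    """Collect every problem found, then raise once with the full list."""
    errors = []
    for table_name, table in metadata.get("tables", {}).items():
        fields = table.get("fields", {})

        primary_key = table.get("primary_key")
        if primary_key is not None and primary_key not in fields:
            errors.append(f"{table_name}: primary key '{primary_key}' is not a field")

        for field_name, spec in fields.items():
            if "type" not in spec:
                errors.append(f"{table_name}.{field_name}: missing 'type'")

    if errors:
        # One exception listing all problems, rather than stopping at the first.
        raise ValueError("Metadata is invalid:\n" + "\n".join(errors))


broken = {
    "tables": {
        "users": {
            "primary_key": "user_id",
            "fields": {"name": {}, "age": {"type": "numerical"}},
        }
    }
}

try:
    validate_metadata(broken)
except ValueError as error:
    print(error)
```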

This version also exposes all the hyperparameters of the `CTGAN` and `TVAE` models to allow more
advanced usage. There is also a fix for the `TVAE` model on small datasets, and its performance
with `NaN` values has been improved. Finally, there is a fix for using
`UniqueCombinationConstraint` with the `transform` strategy.

Issues resolved

* Memory Usage Gaussian Copula Trained Model consuming high memory when generating synthetic data - Issue [304](https://github.com/sdv-dev/SDV/issues/304) by pvk-developer and AnupamaGangadhar
* Add option to visualize metadata with only table names - Issue [347](https://github.com/sdv-dev/SDV/issues/347) by csala
* Add sample parameter to control reject sampling crash - Issue [343](https://github.com/sdv-dev/SDV/issues/343) by fealho
* Verbose metadata validation - Issue [348](https://github.com/sdv-dev/SDV/issues/348) by csala
* Missing the introduction of custom specification for hyperparameters in the TVAE model - Issue [344](https://github.com/sdv-dev/SDV/issues/344) by imkhoa99 and pvk-developer

0.8.0

This version adds conditional sampling for tabular models by combining a reject-sampling
strategy with the native conditional sampling capabilities from the Gaussian copulas.
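The "native" part relies on a standard property of Gaussian distributions: conditioning a joint Gaussian on some of its variables gives another Gaussian with a closed-form mean and variance, so no rejection is needed. The sketch below shows the simplest case, two standard normal variables with correlation `rho`, where X1 given X2 = v has mean `rho * v` and variance `1 - rho**2`. The function names are hypothetical, and the sketch ignores the marginal transformations that a copula model applies around this step.

```python
import math
import random


def conditional_params(rho, observed):
    """Mean and variance of X1 given X2 = observed.

    Assumes (X1, X2) are jointly Gaussian, each standard normal,
    with correlation `rho`.
    """
    return rho * observed, 1.0 - rho ** 2


def sample_conditional(rho, observed, num_rows, rng):
    """Draw X1 values directly from the conditional distribution."""
    mean, variance = conditional_params(rho, observed)
    std = math.sqrt(variance)
    return [rng.gauss(mean, std) for _ in range(num_rows)]


mean, variance = conditional_params(rho=0.5, observed=2.0)
print(mean, variance)

samples = sample_conditional(rho=0.5, observed=2.0, num_rows=1000, rng=random.Random(0))
print(len(samples))
```

Reject sampling remains useful as the fallback for conditions that a model cannot handle in closed form.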

It also introduces several upgrades on the HMA1 algorithm that improve data quality and
robustness in the multi-table scenarios by making changes in how the parameters of the child
tables are aggregated on the parent tables, including a complete rework of how the correlation
matrices are modeled and rebuilt after sampling.

Issues resolved

* Fix probabilities contain NaN error - Issue [326](https://github.com/sdv-dev/SDV/issues/326) by csala
* Conditional Sampling for tabular models - Issue [316](https://github.com/sdv-dev/SDV/issues/316) by fealho and csala
* HMA1: LinAlgError: SVD did not converge - Issue [240](https://github.com/sdv-dev/SDV/issues/240) by csala

© 2024 Safety CLI Cybersecurity Inc. All Rights Reserved.