Miceforest

Latest version: v5.7.0

5.6.0

This release implemented some major changes:
* Implemented `MeanMatchScheme`
* Implemented mean matching on SHAP values (see the sketch below)
* Tighter controls and warnings around categorical levels
* Included type hints for major functions

This release is marked as stable because the API will not see significant changes in the future.
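
A minimal usage sketch, assuming the `mean_match_scheme` keyword on `ImputationKernel` and a built-in `mf.mean_match_shap` scheme; those exact names are not confirmed by the notes above:

```python
# Rough sketch of the 5.6.0 workflow; mean_match_scheme and
# mf.mean_match_shap are assumed names, not verified against the source.
import miceforest as mf
from sklearn.datasets import load_iris

# Load a small dataset and punch random holes in it.
iris = load_iris(as_frame=True).frame
iris_amp = mf.ampute_data(iris, perc=0.25, random_state=42)

# Build a kernel that mean matches on shap values rather than raw predictions.
kernel = mf.ImputationKernel(
    iris_amp,
    datasets=3,
    mean_match_scheme=mf.mean_match_shap,  # assumed scheme object
    random_state=42,
)
kernel.mice(2)                       # run 2 MICE iterations
completed = kernel.complete_data(0)  # retrieve the first completed dataset
```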

5.0.0

* New main classes (`ImputationKernel`, `ImputedData`) replace the previous classes (`KernelDataSet`, `MultipleImputedKernel`, `ImputedDataSet`, `MultipleImputedDataSet`).
* Data can now be referenced and imputed in place. This avoids a lot of memory allocation and is much faster (see the sketch after this list).
* Data can now be completed in place. This allows for only a single copy of the dataset to be in memory at any given time, even if performing multiple imputation.
* The `mean_match_subset` parameter has been replaced with `data_subset`, which subsets the data used to build the models as well as the mean matching candidates.
* More performance improvements around when data is copied and where it is stored.
* Raw data is now stored in its original format; both pandas DataFrame and numpy ndarray inputs are handled.
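
A sketch of the memory-saving pattern these notes describe, assuming the `copy_data` keyword on `ImputationKernel` and the `inplace` argument to `complete_data`; both names are assumptions from the 5.x documentation, not from the notes above:

```python
# Reference the original data instead of copying it, then complete it in place.
import numpy as np
import pandas as pd
import miceforest as mf

rng = np.random.default_rng(0)
df = pd.DataFrame({"a": rng.normal(size=100), "b": rng.normal(size=100)})
df_amp = mf.ampute_data(df, perc=0.2, random_state=0)

kernel = mf.ImputationKernel(
    df_amp,
    datasets=1,
    data_subset=50,    # build models / draw candidates from 50 rows per variable
    copy_data=False,   # assumed keyword: reference df_amp rather than copying it
    random_state=0,
)
kernel.mice(2)
kernel.complete_data(0, inplace=True)  # assumed argument: fills df_amp directly
```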

4.0.0

This release improved a number of areas:
* Huge performance improvements, especially when categorical variables are being imputed. These come from not predicting candidate data when it is not needed, a much faster neighbors search, and using numpy instead of pandas for internal indexing, among other changes.
* Ability to tune model parameters and use the best parameters for mice (see the sketch after this list).
* Improvements to code layout; `ImputationSchema` was removed.
* Raw data is now stored as a numpy array to save space and improve indexing.
* Numpy arrays can be imputed, if you want to avoid pandas.
* A choice of multiple built-in mean matching functions.
* Mean matching functions can handle most lightgbm objectives.
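
A sketch of the tuning workflow, shown with the newer 5.x class names; the exact `tune_parameters` signature and return values are assumptions based on later documentation:

```python
# Tune lightgbm parameters per variable, then reuse them during mice.
import miceforest as mf
from sklearn.datasets import load_iris

iris = load_iris(as_frame=True).frame
iris_amp = mf.ampute_data(iris, perc=0.25, random_state=1)

kernel = mf.ImputationKernel(iris_amp, datasets=1, random_state=1)

# Search for good parameters (assumed signature), then pass them to mice.
optimal_parameters, losses = kernel.tune_parameters(
    dataset=0,
    optimization_steps=5,
)
kernel.mice(2, variable_parameters=optimal_parameters)
```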

3.0.0

This is a major release, with breaking API changes:
* The random forest package is now lightgbm
- Much more lightweight (serialized kernels tend to be 5x smaller or more)
- Much faster on big datasets (for comparable parameters)
- More flexible: gbdt boosting can now be used instead of random forests, and lightgbm is more flexible in general.
* Added a `mean_match_subset` parameter, which greatly speeds up many processes.
* `mean_match_candidates` now lazily accepts dicts as long as the keys are a subset of the variables in `variable_schema`.
* Model parameters can be specified by variable, or globally (see the sketch after this list).
* The mean matching function can be overridden if the user wishes.
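
A sketch of global versus per-variable model parameters, shown with the current class names; the keyword pass-through to lightgbm is assumed to work here as it does in later releases:

```python
# Pass lightgbm parameters globally via keyword arguments, or per variable
# via the variable_parameters dict.
import numpy as np
import pandas as pd
import miceforest as mf

rng = np.random.default_rng(3)
df = pd.DataFrame({"x": rng.normal(size=200), "y": rng.normal(size=200)})
df_amp = mf.ampute_data(df, perc=0.2, random_state=3)

kernel = mf.ImputationKernel(df_amp, datasets=1, random_state=3)

# Global: extra keyword arguments are passed to every lightgbm model.
kernel.mice(1, n_estimators=50)

# Per-variable: override parameters for "y" only.
kernel.mice(1, variable_parameters={"y": {"n_estimators": 200}})
```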

2.0.1

* Models from all iterations can be saved by setting `save_models` to 2.
* Kernel classes inherit from base imputed classes, which allows methods to be called on imputed datasets obtained from `impute_new_data()` (see the sketch after this list).
* A time log was added.
* `MultipleImputedDataSet` is now a collection of `ImputedDataSet` objects with methods for comparing them. Subscripting gives the desired dataset.
* Tests were updated to be much more comprehensive.
* Datasets can now be added to and removed from a `MultipleImputedDataSet`/`MultipleImputedKernel`.
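
A sketch of imputing new data with a fitted kernel, shown with the current `ImputationKernel` class; the `impute_new_data` call and its return type are assumed from later documentation:

```python
# Fit a kernel on one portion of the data, then impute previously unseen rows.
import miceforest as mf
from sklearn.datasets import load_iris

iris = load_iris(as_frame=True).frame
iris_amp = mf.ampute_data(iris, perc=0.25, random_state=5)

# Keep models from all iterations so they can be reused on new data.
kernel = mf.ImputationKernel(iris_amp.iloc[:100], save_models=2, random_state=5)
kernel.mice(2)

# Impute the held-out rows with the stored models, then call methods
# on the resulting imputed dataset as usual.
new_imputed = kernel.impute_new_data(iris_amp.iloc[100:])
completed_new = new_imputed.complete_data(0)
```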

1.0.8

Automatic testing, coverage, and formatting have been implemented. The code is (reasonably) bug-free.
