ML-tooling

Latest version: v0.12.1

0.12.0

- Permutation importance and Feature importance are now two different plotting methods.
- `Model.test_estimators` now takes a `feature_pipeline` argument
- Fixed a bug where `FillNA` did not create a `_is_na` column if the column didn't have a missing value
- Implemented Bayesian Search for hyperparameter optimization
- Added a `read_file` convenience method to `FileDataset` to read
- Fixed a bug where `copy_to` failed between two instances of SQLite-based SQLDatasets
- Fixed a bug where `ClassificationVisualize.confusion_matrix` would fail on multi-class problems due to wrong defaults
- Added `__repr__` to demodataset
- Lift curve can now plot multi-class
- Precision-Recall curve can now plot multi-class
- ROC AUC curve can now plot multi-class
- Fixed Binner to have a default value
- Fixed FuncTransform to have a default value
- `load_estimator` now uses default storage if nothing is passed
- `Model.bayessearc` is now `Model.bayesiansearch`
- Added `target_feature_distribution` to `Dataset.plot`
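
The `FillNA` fix in this release concerns indicator columns: a transformer that flags missing values is expected to emit the flag column even when the column it sees has no gaps. The following is a minimal plain-Python sketch of that behaviour, not ML-tooling's implementation; `fill_with_indicator` and its parameters are hypothetical names chosen for illustration.

```python
from typing import List, Optional, Tuple


def fill_with_indicator(
    values: List[Optional[float]], fill_value: float
) -> Tuple[List[float], List[bool]]:
    """Replace None with fill_value and return a parallel list of missing flags.

    Hypothetical helper: the flag list is always returned,
    even when no value was missing.
    """
    is_na = [value is None for value in values]
    filled = [
        fill_value if missing else value
        for value, missing in zip(values, is_na)
    ]
    return filled, is_na


# A column with a gap and a column without one both yield a flag list.
with_gaps = fill_with_indicator([1.0, None, 3.0], fill_value=0.0)
without_gaps = fill_with_indicator([1.0, 2.0], fill_value=0.0)
```

Always returning the flags means downstream code sees the same set of columns whether or not a particular batch contains missing values.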

0.11.0

- Added `load_demo_dataset` function
- If the dataset has no train set, `score_estimator` will now run `create_train_test` with default configurations
- `Model.make_prediction` now takes a threshold argument when making a binary classification
- All ML-tooling logging messages now go to stdout instead of stderr
- Can pass a feature pipeline to `Model` which will then automatically generate a
combined feature_pipeline + estimator Pipeline
- Can pass a feature pipeline to `Dataset.plot` methods, to apply preprocessing
before visualization
- New config implementation. If you need to reset the configuration, you should use `Model.config.reset_config()`
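
To illustrate what a threshold argument means for binary classification, here is a small plain-Python sketch. It does not call `Model.make_prediction`; `classify_with_threshold` is a hypothetical helper, and treating a probability equal to the threshold as positive is an assumption made for this example.

```python
from typing import List


def classify_with_threshold(
    positive_probabilities: List[float], threshold: float = 0.5
) -> List[int]:
    """Label a row 1 when its positive-class probability reaches the threshold, else 0."""
    return [1 if p >= threshold else 0 for p in positive_probabilities]


probabilities = [0.2, 0.55, 0.9]
default_labels = classify_with_threshold(probabilities)
strict_labels = classify_with_threshold(probabilities, threshold=0.8)
```

Raising the threshold trades recall for precision: fewer rows are labelled positive, but those that are carry a higher predicted probability.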

0.10.3

- Fixed typehints in Dataset
- Dataset.create_train_test now takes a boolean `stratify` parameter.
- Added default local filestorage when using `save_estimator`
- The dataframe returned by `.make_prediction` now labels the columns in a more
human-friendly manner
- Dataset now verifies that `load_training_data` and `load_prediction_data` do not return empty results
- Added a missing data visualization to `Dataset.plot`
- FillNA now accepts an `is_nan` flag which adds a flag indicating that a value was missing
- `Model.make_prediction` now accepts a `use_cache` flag to score everything in cached `.x`
- Added a new Transformer: `RareFeatureEncoder`
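
A stratified split, as enabled by the `stratify` parameter above, keeps the class proportions of the target roughly equal in the train and test sets. The sketch below shows the general idea in plain Python; it is not how `Dataset.create_train_test` is implemented, and `stratified_split`, `test_size`, and `seed` are hypothetical names.

```python
import random
from collections import defaultdict
from typing import Hashable, List, Sequence, Tuple


def stratified_split(
    labels: Sequence[Hashable], test_size: float = 0.25, seed: int = 42
) -> Tuple[List[int], List[int]]:
    """Return (train_indices, test_indices) with per-class proportions preserved."""
    rng = random.Random(seed)

    # Group row indices by their class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for indices in by_class.values():
        # Shuffle within each class, then carve off the test share.
        rng.shuffle(indices)
        n_test = round(len(indices) * test_size)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


labels = ["a"] * 8 + ["b"] * 4
train_idx, test_idx = stratified_split(labels, test_size=0.25)
```

With eight rows of class `a` and four of class `b`, a quarter of each class ends up in the test set, so the 2:1 ratio holds on both sides of the split.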

0.10.2

- Fixed type inferences from data to SQL in `_load_data`
- Added `idx` arg to `load_prediction_data` abstract method in SQLDataset
- Added caching of loaded data in SQLDataset

0.10.1

- Added `.copy_to` functionality to SQLDataset and FileDataset,
allowing copying between datasets

0.10.0

- Bug fix for calculating feature importance when passing large amounts of data
- Bug fix when using default metric in `test_estimators`
- Bug fix for gridsearching, where only the last change was applied
- Add nicer error message when passing incorrect dtypes to FillNA
- Storage .save method now only takes filename as parameter
- Handles storage loading of paths output by the Storage `.get_list` method
- Handles case when Dataset does not have a `y` value
- Added `plot_learning_curve` and corresponding `result.plot.learning_curve`
- Added `plot_validation_curve` and corresponding `result.plot.validation_curve`
- Replaced `permutation_importance` with scikit-learn's implementation
- Added `target_correlation` plots to Dataset.plot
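
Permutation importance, mentioned above, measures how much a model's score drops when one feature column is shuffled. scikit-learn exposes this as `sklearn.inspection.permutation_importance`; the sketch below is a from-scratch, standard-library illustration of the same idea rather than either library's code. All names in it (`accuracy`, `permutation_importance_sketch`, the toy `predict` function) are hypothetical.

```python
import random
from typing import Callable, List, Sequence


def accuracy(
    predict: Callable[[Sequence[int]], int],
    rows: Sequence[Sequence[int]],
    targets: Sequence[int],
) -> float:
    """Fraction of rows for which the prediction equals the target."""
    hits = sum(predict(row) == target for row, target in zip(rows, targets))
    return hits / len(targets)


def permutation_importance_sketch(
    predict: Callable[[Sequence[int]], int],
    rows: Sequence[Sequence[int]],
    targets: Sequence[int],
    n_repeats: int = 5,
    seed: int = 0,
) -> List[float]:
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, targets)
    importances = []
    for column in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle one column while leaving the others untouched.
            shuffled_column = [row[column] for row in rows]
            rng.shuffle(shuffled_column)
            permuted = [
                list(row[:column]) + [value] + list(row[column + 1:])
                for row, value in zip(rows, shuffled_column)
            ]
            drops.append(baseline - accuracy(predict, permuted, targets))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy data: the target depends only on the first feature; the second is noise.
rows = [[i % 2, (i // 2) % 2] for i in range(40)]
targets = [row[0] for row in rows]


def predict(row: Sequence[int]) -> int:
    """Stand-in "model" that reads the first feature only."""
    return row[0]


importances = permutation_importance_sketch(predict, rows, targets)
```

Shuffling the second feature leaves every prediction unchanged, so its importance is zero, while shuffling the first feature breaks the link to the target and lowers accuracy.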
