CatBoost

Latest version: v1.2.5

0.16.3

Breaking changes:
- Renamed the column `Feature Index` to `Feature Id` in the prettified output of the Python method `get_feature_importance()`, since it now supports feature names
- Renamed option `per_float_feature_binarization` (`--per-float-feature-binarization`) to `per_float_feature_quantization` (`--per-float-feature-quantization`)
- Removed the `inverted` parameter from the Python `cv` method and added a `type` parameter instead, which can be set to `Inverted` (see the sketch after this list)
- Method `get_features()` now works only for datasets without categorical features
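
A minimal migration sketch for the `cv` change above; the toy dataset and parameter values are illustrative:

```python
from catboost import Pool, cv

# Toy dataset, purely illustrative
pool = Pool(data=[[1, 4], [2, 5], [3, 6], [4, 7]], label=[0, 1, 0, 1])
params = {"loss_function": "Logloss", "iterations": 10}

# Before 0.16.3:  cv(pool, params, inverted=True)
# Since 0.16.3 the boolean flag is replaced by a `type` parameter:
results = cv(pool, params, fold_count=2, type="Inverted", verbose=False)
```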

New features:
- A new multiclass version of the AUC metric, called `AUC Mu`, proposed by Ross S. Kleiman at ICML 2019, [link](http://proceedings.mlr.press/v97/kleiman19a/kleiman19a.pdf)
- Added time-series cross-validation (see the sketch after this list)
- Added `MeanWeightedTarget` in `fstat`
- Added `utils.get_confusion_matrix()`
- Now feature importance can be calculated for non-symmetric trees
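
A hedged sketch of the confusion-matrix helper and time-series cross-validation mentioned above, assuming the time-series mode is selected through the same `type` parameter of `cv`; the data is illustrative:

```python
from catboost import CatBoostClassifier, Pool, cv
from catboost.utils import get_confusion_matrix

train = Pool(data=[[1, 4], [2, 5], [3, 6], [4, 7]], label=[0, 1, 0, 1])

model = CatBoostClassifier(iterations=10, verbose=False)
model.fit(train)

# Confusion matrix helper added in this release
print(get_confusion_matrix(model, train))

# Time-series cross-validation: folds respect the order of objects
results = cv(train, {"loss_function": "Logloss", "iterations": 10},
             fold_count=2, type="TimeSeries", verbose=False)
```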

0.16.2

Breaking changes:
- Removed `get_group_id()` and `get_features()` methods of `Pool` class

New model analysis tools:
- Added a `PredictionDiff` type to the `get_feature_importance()` method, a new tool for model analysis. For a pair of samples, it shows how much each feature contributed to one sample receiving a higher prediction than the other. This helps debug ranking models: find a pair of samples ranked incorrectly and inspect which features caused it (see the sketch after this list).
- Added `plot_predictions()` method
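
A sketch of the `PredictionDiff` analysis described above (toy data; per the description, `data` holds exactly the two objects being compared):

```python
from catboost import CatBoostClassifier, Pool

train = Pool(data=[[1, 4], [2, 5], [3, 6], [4, 7]], label=[0, 1, 0, 1])
model = CatBoostClassifier(iterations=10, verbose=False)
model.fit(train)

# A pair of objects whose relative ordering we want to explain
pair = Pool(data=[[1, 4], [4, 7]])
diff = model.get_feature_importance(data=pair, type="PredictionDiff")
print(diff)
```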

New features:
- Added the `model.set_feature_names()` method in Python (see the sketch after this list)
- Added stratified split to parameter search methods
- Support `catboost.load_model()` from CPU snapshots for numerical-only datasets
- `CatBoostClassifier.score()` now supports `y` given as a `DataFrame`
- Added `sampling_frequency`, `per_float_feature_binarization`, and `monotone_constraints` parameters to `CatBoostClassifier` and `CatBoostRegressor`
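
A small sketch of `set_feature_names()` from the list above; the feature names here are hypothetical:

```python
from catboost import CatBoostClassifier

model = CatBoostClassifier(iterations=10, verbose=False)
model.fit([[1, 4], [2, 5], [3, 6], [4, 7]], [0, 1, 0, 1])

# Attach human-readable names to an already trained model
model.set_feature_names(["age", "income"])
print(model.feature_names_)
```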

Speedups:
- 2x speedup of multi-classification mode

Bugfixes:
- Fixed `score()` for multiclassification (issue 924)
- Fixed the `get_all_params()` function (issue 926)

Other improvements:
- Clear error messages when a model cannot be saved

0.16.1

Breaking changes:
- Parameter `fold_count` is now called `cv` in [`grid_search()`](https://catboost.ai/docs/concepts/python-reference_catboost_grid_search.html) and [`randomized_search()`](https://catboost.ai/docs/concepts/python-reference_catboost_randomized_search.html)
- CV results are now returned from `grid_search()` and `randomized_search()` in the `res['cv_results']` field, as sketched below
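
A sketch of the renamed parameter and the new result field (toy data; the grid values are illustrative):

```python
from catboost import CatBoostClassifier

X = [[1, 4], [2, 5], [3, 6], [4, 7], [5, 8], [6, 9]]
y = [0, 1, 0, 1, 0, 1]

model = CatBoostClassifier(iterations=10, verbose=False)
grid = {"learning_rate": [0.03, 0.1], "depth": [2, 4]}

# `cv` is the new name for the old `fold_count` parameter
res = model.grid_search(grid, X=X, y=y, cv=2, verbose=False)
print(res["params"])       # best parameter combination
print(res["cv_results"])   # per-iteration cross-validation metrics
```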

New features:
- R-language function `catboost.save_model()` now supports PMML, ONNX and other formats
- The `monotone_constraints` parameter in the Python API allows specifying numerical features on which the prediction must depend monotonically, as sketched below
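
A hedged sketch of `monotone_constraints` with toy data, assuming the usual convention that `1`, `-1`, and `0` mean non-decreasing, non-increasing, and unconstrained:

```python
from catboost import CatBoostRegressor

X = [[1.0, 3.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]]
y = [1.0, 2.0, 3.0, 4.0]

# Prediction must be non-decreasing in feature 0;
# feature 1 is left unconstrained.
model = CatBoostRegressor(iterations=50, verbose=False,
                          monotone_constraints=[1, 0])
model.fit(X, y)
```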

Bug fixes:
- Fixed `eval_metric` calculation for training with weights (in release 0.16, evaluating a metric equal to the optimized loss did not use weights by default, so the overfitting detector worked incorrectly)

Improvements:
- Added option `verbose` to `grid_search()` and `randomized_search()`
- Added [tutorial](https://github.com/catboost/tutorials/blob/master/hyperparameters_tuning/hyperparameters_tuning.ipynb) on `grid_search()` and `randomized_search()`

0.16

Breaking changes:
- The `MultiClass` loss now has the same sign as `Logloss`. It previously had the opposite sign and was maximized; now it is minimized.
- `CatBoostRegressor.score` now returns the value of the $R^2$ metric instead of RMSE, for consistency with the behavior of scikit-learn regressors (see the sketch after this list).
- Changed metric parameter `use_weights` default value to false (except for ranking metrics)
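
A small sketch illustrating the `score()` change above; it should now agree with scikit-learn's `r2_score` rather than returning RMSE:

```python
from catboost import CatBoostRegressor
from sklearn.metrics import r2_score

X = [[1.0], [2.0], [3.0], [4.0]]
y = [1.0, 2.0, 3.0, 4.0]

model = CatBoostRegressor(iterations=20, verbose=False)
model.fit(X, y)

# Since 0.16, score() returns R^2 (it used to return RMSE)
assert abs(model.score(X, y) - r2_score(y, model.predict(X))) < 1e-6
```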

New features:
- It is now possible to apply a model on GPU
- We have published two new real-world datasets with monotonic constraints, `catboost.datasets.monotonic1()` and `catboost.datasets.monotonic2()`. Previously, `california_housing` was the only open-source dataset with monotonic constraints; you can now use these two to benchmark algorithms with monotonic constraints.
- We've added several new metrics to CatBoost, including `DCG`, `FairLoss`, `HammingLoss`, `NormalizedGini` and `FilteredNDCG`
- Introduced efficient `grid_search()` and `randomized_search()` implementations.
- The `get_all_params()` Python function returns the values of all training parameters, both user-defined and default (see the sketch after this list).
- Added more synonyms for training parameters to be more compatible with other GBDT libraries.
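
A quick sketch of `get_all_params()` (toy training data; the printed keys are standard CatBoost parameter names):

```python
from catboost import CatBoostClassifier

model = CatBoostClassifier(iterations=10, verbose=False)
model.fit([[1, 4], [2, 5], [3, 6], [4, 7]], [0, 1, 0, 1])

# Resolved at training time: user-defined values plus all defaults
params = model.get_all_params()
print(params["iterations"], params["depth"])
```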

Speedups:
- The AUC metric is computationally expensive. We've implemented a parallelized calculation, so it can now be computed on every iteration (or every k-th iteration) about 4x faster, as sketched below.
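
A sketch of evaluating AUC during training every k-th iteration (here k = 5 via the standard `metric_period` option; the dataset is illustrative):

```python
from catboost import CatBoostClassifier

X = [[1, 4], [2, 5], [3, 6], [4, 7], [5, 8], [6, 9]]
y = [0, 1, 0, 1, 0, 1]

# AUC as an extra metric, evaluated every 5th iteration on the eval set
model = CatBoostClassifier(iterations=50, custom_metric=["AUC"],
                           metric_period=5, verbose=False)
model.fit(X, y, eval_set=(X, y))
print(model.get_evals_result())
```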

Educational materials:
- We've improved our command-line tutorial; it now includes example files and more information.

Fixes:
- Automatic `Logloss` or `MultiClass` loss function deduction for `CatBoostClassifier.fit` now also works if the training dataset is specified as a `Pool` or a filename string.
- And some other fixes

0.15.2

Breaking changes:
- The `get_feature_statistics` function is replaced by `calc_feature_statistics` (see the sketch after this list)
- Scoring function `Correlation` is renamed to `Cosine`
- Parameter `efb_max_conflict_fraction` is renamed to `sparse_features_conflict_fraction`
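
A sketch using the renamed call; the exact signature is an assumption based on the Python package docs of this era:

```python
from catboost import CatBoostRegressor

X = [[1.0, 3.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]]
y = [1.0, 2.0, 3.0, 4.0]

model = CatBoostRegressor(iterations=10, verbose=False)
model.fit(X, y)

# Renamed in this release (was get_feature_statistics)
stats = model.calc_feature_statistics(X, y, feature=0, plot=False)
```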

New features:
- Models can now be saved in PMML format.
> **Note:** PMML does not have full categorical features support, so to export a model trained on a dataset with categorical features to PMML, set the `one_hot_max_size` parameter to a sufficiently large value that all categorical features are one-hot encoded (see the sketch after this list)
- Feature names can be used to specify ignored features
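
A hedged sketch of PMML export per the note above (the filename is illustrative; a large `one_hot_max_size` forces one-hot encoding of all categorical features):

```python
from catboost import CatBoostClassifier, Pool

train = Pool(data=[["a", 1], ["b", 2], ["a", 3], ["b", 4]],
             label=[0, 1, 0, 1], cat_features=[0])

# One-hot encode every categorical feature so the model is PMML-exportable
model = CatBoostClassifier(iterations=10, one_hot_max_size=255,
                           verbose=False)
model.fit(train)

model.save_model("model.pmml", format="pmml")
```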

Bug fixes, including:
- Fixed restarting of CV on GPU for datasets without categorical features
- Fixed learning continuation errors with a changed dataset (PR 879) and with a model loaded from file (884)
- Fixed NativeLib for JDK 9+ (PR 857)

0.15.1

Bug fixes:
- Restored the `fstr_type` parameter in the Python and R interfaces
