- Performance enhancement: some resource-heavy operations can now be disabled to improve performance (see also 76):
  - Correlation checking, by setting `check_correlation` to `False` (43)
  - Recoded checking, by setting `check_recoded` to `False`
- Installation with conda is now possible
- Implementation of a new Boolean variable type (25)
- New badges for zeros and highly skewed (63)
- Code refactoring (internal improvement): the main module is split into 4 modules (65)
- Improved type handling:
  - types like `list`, `tuple` and `dict` are now officially unsupported until we improve them
  - mixed columns are also correctly handled
- New Binary variable type supporting the native `boolean` type as well as binary numeric values (77)
- Column names in warnings now link to the corresponding details in the variables section to ease navigation (66)
- Spearman and Pearson correlation matrix diagrams added to the report (83)
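As a minimal sketch of the two performance switches above: the helper names `fast_profile_options` and `build_fast_report` are hypothetical, and the sketch assumes that both flags are passed as keyword arguments to `ProfileReport`. Check the signature of your installed version before relying on it.

```python
def fast_profile_options():
    """Keyword arguments that switch off the two expensive checks."""
    return {
        "check_correlation": False,  # skip correlation checking
        "check_recoded": False,      # skip recoded-variable checking
    }


def build_fast_report(df):
    """Profile `df` with the expensive checks disabled.

    Assumption: both flags are accepted by the ProfileReport constructor.
    """
    # Imported lazily so that this module loads even without the package.
    import pandas_profiling

    return pandas_profiling.ProfileReport(df, **fast_profile_options())
```

With both checks disabled, the report presumably has no correlation results to base rejections on, so this trade-off suits large datasets where runtime matters more than those checks.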
Bug fixes
- 56 Incorrect calculation of % unique for variables with missing values
- 11 Avoid throwing an error when calling `get_rejected_variables` before correlation has been computed
- 68 Avoid setting the matplotlib backend when not necessary
1.4.0
Bug fixes and new check for recoded categorical variables. Thanks to all who contributed!
1.3.0
New additions include frequency counts and extreme values for numeric variables. Pandas-profiling now does all 1d-calculations in a multiprocessing fashion, _vastly_ speeding up runtime.
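The idea of running per-column (1d) calculations in parallel can be sketched with the standard library alone. This is an illustration of the technique, not the library's actual implementation: `describe_1d` and `describe_all` are hypothetical names, columns are plain Python lists, and `None` stands in for a missing value.

```python
import multiprocessing
from collections import Counter


def describe_1d(item):
    """Compute simple statistics for one (name, values) column."""
    name, values = item
    present = [v for v in values if v is not None]
    counts = Counter(present)
    return name, {
        "count": len(present),                     # non-missing values
        "n_missing": len(values) - len(present),   # missing values
        "distinct": len(counts),                   # distinct non-missing values
        "top": counts.most_common(1)[0][0] if counts else None,
    }


def describe_all(columns, processes=1):
    """Describe every column, optionally spreading the work over processes."""
    items = list(columns.items())
    if processes <= 1:
        # Serial fallback: same worker, no process pool.
        return dict(map(describe_1d, items))
    # Each column is independent, so the columns can be mapped over a pool.
    with multiprocessing.Pool(processes) as pool:
        return dict(pool.map(describe_1d, items))
```

When using more than one process, call `describe_all(columns, processes=4)` from under an `if __name__ == "__main__":` guard so that the code also works on platforms that spawn worker processes instead of forking.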
1.2.0
What's new:
- histograms for date variables
- bug fixes