WARNING: BREAKING CHANGES!
Note that several changes in triage 5 break backwards compatibility with triage 4. If you are upgrading a project from an earlier version of triage, it is **highly recommended that you first create a backup of your current database!**
These breaking changes include:
- The way the `model_hash` is calculated has been revised, so re-running an experiment from an earlier version of triage will re-train your models and give them new `model_id`s even if the configuration hasn't changed.
- The `built_by_experiment` column has been removed from `triage_metadata.models` in favor of tracking the specific run that built the model. The `experiment_hash` can still be obtained by joining to `triage_metadata.triage_runs` (née `triage_metadata.experiment_runs`). Should you need the data that was in this column at the time of migration, it can be found in `triage_metadata.deprecated_models_built_by_experiment`, but it will not be restored to the table upon database downgrade.
- Changes in the structure of matrix metadata mean the `matrix_hash` is no longer backwards-compatible with older versions of triage (as with models, re-running an old config will result in matrices being re-created).
- The `random_seed` column has been removed from `triage_metadata.experiments` in favor of tracking it at the run level as well. A database upgrade followed by a downgrade will lose this data from the experiments table (but it can be recovered from the runs table).
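
As a rough illustration of the join mentioned above, the sketch below recovers an experiment hash per model from the runs table. It uses an in-memory SQLite database as a stand-in for Postgres so that it runs anywhere. The column names `built_in_triage_run`, `id`, and `run_hash` are assumptions for illustration only; check them against your migrated schema before adapting the query.

```python
import sqlite3

# In-memory stand-in for the Postgres database. ATTACH lets the query use the
# `triage_metadata.` prefix so it reads like the real thing.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS triage_metadata")

# ASSUMED columns (not confirmed by the release notes):
#   models.built_in_triage_run -> triage_runs.id
#   triage_runs.run_hash holds the experiment hash for that run
conn.executescript(
    """
    CREATE TABLE triage_metadata.triage_runs (
        id INTEGER PRIMARY KEY,
        run_hash TEXT
    );
    CREATE TABLE triage_metadata.models (
        model_id INTEGER PRIMARY KEY,
        built_in_triage_run INTEGER
    );
    INSERT INTO triage_metadata.triage_runs VALUES (1, 'abc123');
    INSERT INTO triage_metadata.models VALUES (10, 1), (11, 1);
    """
)

# Join each model to the run that built it to get the experiment hash.
EXPERIMENT_HASH_QUERY = """
SELECT m.model_id, r.run_hash AS experiment_hash
FROM triage_metadata.models AS m
JOIN triage_metadata.triage_runs AS r
  ON r.id = m.built_in_triage_run
ORDER BY m.model_id
"""

rows = conn.execute(EXPERIMENT_HASH_QUERY).fetchall()
for model_id, experiment_hash in rows:
    print(model_id, experiment_hash)
```

The same `SELECT` should carry over to Postgres once the column names are confirmed, since it only relies on a plain inner join.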
New Functionality
- Functionality for predicting forward, either with an existing model object or by retraining a new model with the most current data given a `model_group_id` (631)
- Utility for adding predictions to models previously trained/tested with `save_predictions=False` (836)
- Provisioner for easily setting up a postgresql database (via docker) that can be used with triage (840)
- More flexibility in parallelization for more resource-intensive model types, like random forests (853)
Bug Fixes
- Ensure model-level random seeds are re-used when the config and experiment-level random seed are unchanged (848)
- Remove the project path from the `model_hash` definition: the `model_id` shouldn't depend on where `triage` is being run (830)
- Ensure that feature groups are sorted in matrix metadata for consistency in downstream calculations (833)
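
To see why the hash-related changes in these notes force models and matrices to be rebuilt, the sketch below hashes a small configuration in a deterministic way. It is not triage's actual hashing code: `stable_hash`, `hypothetical_model_hash`, and the payload layout are invented for illustration. It only shows that including a project path makes the hash depend on where the code runs, and that sorting feature groups makes the hash independent of their order.

```python
import hashlib
import json


def stable_hash(payload: dict) -> str:
    """Hash a JSON-serializable dict deterministically (keys sorted)."""
    canonical = json.dumps(payload, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def hypothetical_model_hash(config, feature_groups, project_path=None):
    """Illustrative only: NOT the real triage `model_hash` definition."""
    payload = {
        "config": config,
        # Sorting removes any dependence on the order the groups arrive in.
        "feature_groups": sorted(feature_groups),
    }
    if project_path is not None:
        # Including the path ties the hash to the machine or directory.
        payload["project_path"] = project_path
    return stable_hash(payload)


config = {
    "class_path": "sklearn.ensemble.RandomForestClassifier",
    "parameters": {"n_estimators": 100},
}

# Same groups in a different order give the same hash.
hash_a = hypothetical_model_hash(config, ["demographics", "bookings"])
hash_b = hypothetical_model_hash(config, ["bookings", "demographics"])

# Same config hashed with two different project paths gives different hashes.
hash_path_1 = hypothetical_model_hash(
    config, ["bookings", "demographics"], project_path="/home/alice/project"
)
hash_path_2 = hypothetical_model_hash(
    config, ["bookings", "demographics"], project_path="/mnt/shared/project"
)

print(hash_a == hash_b)
print(hash_path_1 == hash_path_2)
```

Any change to what goes into the hashed payload, such as dropping the path, changes the resulting hash for an otherwise identical config, which is why an unchanged experiment is re-trained after the upgrade.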
Thanks To
tweddielin, thcrock, ecsalomon, KasunAmare