Sapsan

Latest version: v0.6.5

0.4.7

Changes

Data Loading
* _Fixed_: using multiple checkpoints when `batch_num` is not specified
- previously, data would load from all checkpoints, but only the 1st one would be used for training by default

Training log plot
* _Fixed_: if the training log failed, the plot produced an error and the trained model was not saved
- happened occasionally during long training sessions
- log plotting no longer affects the model output

0.4.6

Changes

Filters
* _Fixed_: compatibility with the latest opencv-python (>=4.5.4)
- 2D box filter is now working correctly

Subgrid Model
* _Updated_: Dynamic Smagorinsky model
- improved initialization
- added 2D support

Evaluation
* _Removed_: a redundant parameter requirement in `Evaluate()`
- the 'flat' condition is now passed along with the other input data parameters in data_loader

0.4.5

Changes

Distributed Data Parallel (DDP)
* _Fixed_: DistributedDataParallel (DDP)
- `engine` is no longer overwritten
- it will be determined automatically by Catalyst if `ddp=True`
- can always be customized by hand ([Parallel GPU Training](https://github.com/pikarpov-LANL/Sapsan/wiki/parallel-GPU-Training)); see the sketch below
* _Updated_: the device in Run Info now better reflects whether a parallel run was attempted
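
A minimal, hedged sketch of a parallel run. The import paths, the `CNN3d`/`CNN3dConfig` names, and the placement of the `ddp` flag are assumptions; `loaders` and `data_loader` are assumed to come from the usual data-loading step. The Parallel GPU Training wiki page is the authoritative reference.

```python
# Hedged sketch: import paths, the CNN3d/CNN3dConfig names, and the placement
# of the `ddp` flag are assumptions; see the Parallel GPU Training wiki page.
from sapsan.lib.estimator import CNN3d, CNN3dConfig
from sapsan.lib import Train

# `loaders` and `data_loader` come from the usual data-loading step
estimator = CNN3d(config=CNN3dConfig(n_epochs=10), loaders=loaders)

# With ddp=True the engine is picked automatically by Catalyst;
# it can still be customized by hand in torch_backend.py.
training = Train(model=estimator, data_parameters=data_loader, ddp=True)
trained_estimator = training.run()
```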

MLflow
* _Restored_: auto-termination of MLflow tracking after `Evaluate.run()`
- cleans up MLflow logging (it was getting messy when loading a model to evaluate without training)

Plotting
* _Updated_: default plot formatting
- colormaps are now colorblind-friendly (tableau-colorblind10 and viridis)
- log ticks point inward and have been resized for a cleaner look
- thin dotted default grid
- you can always call `sapsan.utils.plot.plot_params()`, which returns the full set of default parameters

* _Updated_: `spectrum_plot` formatting for consistency with the other plotting routines
- Renamed: `plot_spectrum()` -> `spectrum_plot()`
- Now returns an `Axes` object, like the other routines
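
As a loose illustration, the new defaults mentioned above can be inspected and reapplied; whether the returned dict is rcParams-compatible is an assumption based on the notes above.

```python
# Sketch: assumes plot_params() returns an rcParams-compatible dict,
# as suggested by the formatting notes above.
import matplotlib
from sapsan.utils.plot import plot_params

defaults = plot_params()   # full set of default plot parameters
print(defaults)

matplotlib.rcParams.update(defaults)   # apply the defaults globally
```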

Command Line Interface (CLI)
* _Renamed_: `--ddp` -> `--gtb` or `--get_torch_backend` option for `sapsan`
- to copy torch_backend.py when creating the project: `sapsan create --gtb -n {name}`
* _New Command_: `sapsan get_torch_backend`
- copies torch_backend.py into your working directory
- so you don't have to 'create' a project just to copy the backend
- you can proceed to edit the Catalyst runner (custom loss, optimizer, DDP config, etc.)

Custom Estimator
* _Added_: a guide on how to go deeper and [edit Catalyst runner](https://github.com/pikarpov-LANL/Sapsan/wiki/Custom-Estimator#editing-catalyst-runner)
* _Added_: a convenient command to copy torch_backend.py in your working directory (see above)

Gradient Model
* _Fixed_: derivative multiplication
* _Fixed_: model calculation consistency

Other
* _Renamed_: `tensor()` -> `ReynoldsStress()` to avoid confusion
- Documentation updated accordingly

0.4.4

Changes

MLflow
- New parameters for `Train()` and `Evaluate()`
- `run_id` - allows you to resume and record to a specific run at a later time, once `run()` has been called
- `run_name` - changes the recorded run names
- by default, the names are `train` and `evaluate`, as recorded in MLflow
- new metrics/parameters/artifacts can be added after _Train_ or _Evaluate_ have completed
- either Sapsan's backend interface or a traditional MLflow interface can be used
- Wiki update: [MLflow Tracking](https://github.com/pikarpov-LANL/Sapsan/wiki/MLflow-Tracking)
- Changes to `MLflowBackend()`
- `close_active_run()` now loops until all active runs have been closed
- new function `resume()`, which requires the `run_id` of the run to resume and record to
- `nested=True` by default
- Wiki update: [API Reference: Backend (Tracking)](https://github.com/pikarpov-LANL/Sapsan/wiki/API-Reference#backend-tracking)
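
A hedged sketch of the resume workflow. The `MLflowBackend` import path, the `backend=`/`run_name=` keyword placement, and the way the run id is retrieved are assumptions; `estimator`, `data_loader`, and `Train` come from the usual training setup. See the MLflow Tracking wiki page for the exact interface.

```python
# Sketch only: import path, keyword placement, and run-id retrieval are
# assumptions; see the MLflow Tracking wiki page for the exact interface.
import mlflow
from sapsan.lib.backends import MLflowBackend

tracking = MLflowBackend(name="experiment", host="localhost", port=9000)

# Record the training run under a custom name instead of the default "train"
training = Train(model=estimator, data_parameters=data_loader,
                 backend=tracking, run_name="cnn3d_train")
trained_estimator = training.run()

# Later: resume the same run (the run_id can be taken from the MLflow UI)
tracking.resume(run_id="<run_id>")        # nested=True by default
mlflow.log_metric("extra_metric", 0.42)   # the traditional MLflow interface also works
mlflow.end_run()
```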

Plotting
- New parameter for `cdf_plot`
- `ks` - controls whether to print the Kolmogorov-Smirnov statistic on the plot itself
- it is also returned, as in `ax, ks = cdf_plot(...)` (see the sketch after this list)
- New parameters for `Evaluate`
- `pdf_xlim`, `pdf_ylim` - x and y limits to control the pdf plot
- `cdf_xlim`, `cdf_ylim` - same for the cdf plot
- Fixed: `model_graph`
- no longer sets the number of channels to 1
- the easiest way to construct the graph is to pass the training loader shape
- Wiki update: [Model Graph](https://github.com/pikarpov-LANL/Sapsan/wiki/Model-Graph)
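
For illustration, a short sketch of the `ks` option; the `label` keyword and the flattened-array inputs are assumptions, and `target`/`predicted` stand in for your own arrays.

```python
# Sketch: the `label` keyword and input format are assumptions;
# `target` and `predicted` stand in for your own arrays.
from sapsan.utils.plot import cdf_plot

ax, ks = cdf_plot([target.flatten(), predicted.flatten()],
                  label=["target", "predicted"],
                  ks=True)   # prints the KS statistic on the plot and returns it
print(f"KS statistic: {ks:.3f}")

# The new Evaluate() keywords control the PDF/CDF axis ranges, e.g.
# Evaluate(..., pdf_xlim=(-2, 2), cdf_xlim=(-2, 2))
```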

Graphical User Interface (GUI)
- GUI examples are now included in PyPI
- `sapsan/examples/GUI`
- The file structure has been simplified
- unnecessary files removed
- The scripts have been cleaned up, with more comments and a clearer function organization to aid editing
- Brought up to date with the most recent Sapsan version
- Core package has been locked to `streamlit == 0.84.2`
- there is a known bug causing `pd.DataFrame`s to not display properly
- will update once the Streamlit team fixes the issue
- Wiki update: [GUI Examples](https://github.com/pikarpov-LANL/Sapsan/wiki/Local#gui-examples)

Command Line Interface (CLI)
- Changes to `sapsan get_examples`
- GUI examples will be copied as well, found in `./sapsan-examples/GUI`

Other
- Fixed the exact device ID issue that affected multi-GPU systems
- tensors are no longer moved only to the default device (_cuda:0_), but to the correct device ID
- Updated the requirements template

0.4.3

Changes

PyPI Release
- `./examples` directory is now included in the release
- added MANIFEST.in

Command Line Interface (CLI)
- Fixed: `sapsan get_examples` command

0.4.0

Changes

General
* Loaders for Train and Evaluate now have the same format
* The functions above have an identical interface for both PyTorch and Sklearn

Estimators
* Fixed: [Model Saving & Loading](https://github.com/pikarpov-LANL/Sapsan/wiki/Save-&-Load-Models)
- loaded models can continue to be trained
- when loading a model, you can redefine its previous config at initialization, e.g. `n_epoch`, `lr`
- optimizer dict state is correctly saved and loaded
- optimizer state is moved to CPU or GPU depending on the setup (Catalyst doesn't do this on its own, which caused issues when evaluating a loaded model)
- Added dummy estimators for loading (otherwise, all estimators have `load` and `save`)
- `load_estimator()` for torch
- `load_sklearn_estimator()` for sklearn
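
A loose sketch of the loading flow, assuming the entry point is the estimator's `load()` together with the dummy `load_estimator()`; the exact signature, import path, and checkpoint path are assumptions, and the Save & Load Models wiki page is authoritative.

```python
# Sketch: the load() signature, import path, and checkpoint path are assumptions;
# see the Save & Load Models wiki page for the exact interface.
from sapsan.lib.estimator import CNN3d, load_estimator

loaded = CNN3d.load(path="model_checkpoint", estimator=load_estimator())

# The previous config can be redefined at load time to continue training, e.g.:
# loaded.config.n_epochs = 20
# trained = Train(model=loaded, data_parameters=data_loader).run()
```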

* Reworked how the models are initialized
- now happens upon calling the estimator, e.g. `estimator = CNN3d(loaders=loaders)`
- before: it happened when training the model, upon `estimator.train`
- model initialization now requires providing `loaders` (see the sketch after the config options below)

* All `self` vars in `ModelConfig()` get recorded in tracking by default

* Added options in `ModelConfig()`
- `lr` and `min_lr` - learning rate parameters are no longer hard-coded
- `device` - sets a specific device to run the models on, either cpu or cuda
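
Putting the initialization rework and the new config options together, a minimal sketch; the `CNN3dConfig` name, keyword spellings, and import path are assumptions, and `loaders` comes from the usual data-loading step.

```python
# Sketch: the CNN3dConfig name, keyword spellings, and import path are
# assumptions; `loaders` comes from the usual data-loading step.
from sapsan.lib.estimator import CNN3d, CNN3dConfig

config = CNN3dConfig(n_epochs=50,
                     lr=1e-3,        # learning rate, no longer hard-coded
                     min_lr=1e-5,
                     device="cuda")  # or "cpu"

# The model is initialized when the estimator is called,
# so `loaders` must be provided up front.
estimator = CNN3d(config=config, loaders=loaders)
```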

* Added `sklearn_backend` and `torch_backend` to be used by all estimators
- sklearn-based estimators now have a structure close to the torch-based ones
- renamed: `pytorch_estimator` -> `torch_backend`
- cleaned up variable naming conventions throughout


Evaluation
* Evaluate and the data loader now accept data without a target
- useful when there is no ground truth to compare to
- will still output pdf, cdf, and spatial plots, without comparison metrics
- `Evaluate.run()` now outputs a `dict` of `"pred_cube"` and `"target_cube"` (if the latter is provided)
* PDF and CDF plots are now combined under a single figure
- recorded as 'pdf_cdf.png' in MLflow
* Fixed: definition of `n_output_channel` in Evaluate()
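
A short sketch of consuming the new return value; the `Evaluate()` constructor arguments shown are assumptions, and only the returned dict keys follow the notes above.

```python
# Sketch: the Evaluate() constructor arguments are assumptions;
# `trained_estimator` and `data_loader` come from the usual training setup.
evaluation = Evaluate(model=trained_estimator, data_parameters=data_loader)
cubes = evaluation.run()

pred_cube = cubes["pred_cube"]
target_cube = cubes.get("target_cube")   # present only if a target was provided
```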

Command Line Interface (CLI)
* Added new option: `sapsan create --ddp` copies `torch_backend.py`
- gives the ability to customize the Catalyst Runner
- adjust DDP settings based on the linked [Catalyst DDP tutorial](https://catalyst-team.github.io/catalyst/tutorials/ddp.html) in the Wiki
- will be useful when running on HPC
- refer to [Parallel GPU Training](https://github.com/pikarpov-LANL/Sapsan/wiki/parallel-GPU-Training) on the Wiki for more details

* Fixed: CLI click initialization


Graphical User Interface (GUI)
* Up to date with Streamlit 0.87.0
* PDF and CDF plots are now shown as well
* Fixed: a data loading issue related to `train_fraction`


MLflow
* MLflow: evaluate runs are now nested under the most recent train run
- significantly aids organization
* Added: `estimator.model.forward()` is now recorded by MLflow (if torch is used)


Plotting
* Plotting routines now return an `Axes` object
* All parameters are set on the `Axes` object instead of `plt`, which allows individual tweaking after the routine returns
* `figsize` and `ax` arguments added to most plotting routines
- useful if you create a figure and subplots outside of the plotting routines
* Universal plotting parameters were expanded and made easily accessible through `plot_params()`
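
As a loose example of the `ax` workflow: the `pdf_plot` signature and the `label` keyword are assumptions, and `predicted` stands in for your own array.

```python
# Sketch: the pdf_plot signature and `label` keyword are assumptions;
# `predicted` stands in for your own array.
import matplotlib.pyplot as plt
from sapsan.utils.plot import pdf_plot

fig, ax = plt.subplots(figsize=(6, 5))

# Pass an externally created Axes via the new `ax` argument,
# then keep tweaking the returned Axes afterwards.
ax = pdf_plot([predicted.flatten()], label=["predicted"], ax=ax)
ax.set_yscale("log")
plt.show()
```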


Other
* Edited the examples, tests, and estimator template to reflect model initialization changes
* Requirements Updated:
- streamlit >= 0.87.0
- plotly >= 5.2.0
- tornado >= 6.1.0
- notebook >= 6.4.3 (fixes security vulnerabilities)
* Added a few data_loader warnings
* Cleaned up debug prints throughout the code
* Expanded code comments
