Tianshou



0.4.7

Bug Fix

1. Add map_action_inverse to fix the error of storing random actions (568); see the sketch below
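
For context, `map_action_inverse` is the inverse of a policy's `map_action` (which rescales a raw network output into the environment's action space): randomly sampled environment actions are mapped back so the buffer stores them in the same raw space the policy learns in. A self-contained toy sketch of the idea (this class is a stand-in, not Tianshou's actual implementation):

```python
import numpy as np

class TanhMappingPolicy:
    """Toy stand-in for a policy whose map_action squashes a raw action
    into [low, high]; map_action_inverse undoes that mapping."""

    def __init__(self, low: float, high: float):
        self.low, self.high = low, high

    def map_action(self, raw_act: np.ndarray) -> np.ndarray:
        # raw action in (-inf, inf) -> tanh -> rescale into [low, high]
        return self.low + (np.tanh(raw_act) + 1.0) * 0.5 * (self.high - self.low)

    def map_action_inverse(self, env_act: np.ndarray) -> np.ndarray:
        # rescale back into (-1, 1), then arctanh back to the raw space
        unit = 2.0 * (env_act - self.low) / (self.high - self.low) - 1.0
        return np.arctanh(np.clip(unit, -1.0 + 1e-6, 1.0 - 1e-6))

policy = TanhMappingPolicy(low=-2.0, high=2.0)
random_env_action = np.array([1.5])     # e.g. env.action_space.sample()
raw = policy.map_action_inverse(random_env_action)  # store this, not 1.5
assert np.allclose(policy.map_action(raw), random_env_action, atol=1e-4)
```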

API Change

1. Update the WandbLogger implementation and the Atari examples: use a TensorBoard SummaryWriter as the logging core with `wandb.init(..., sync_tensorboard=True)` (558, 562); see the sketch after this list
2. Rename save_fn to save_best_fn to avoid ambiguity (575)
3. (Internal) Add `tianshou.utils.deprecation` for a unified deprecation wrapper (575)
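
As a usage sketch of the reworked logger (project and run names are placeholders, and the argument names are our reading of the 0.4.7-era API): a plain TensorBoard SummaryWriter does the writing, and `wandb.init(..., sync_tensorboard=True)` mirrors everything to Weights & Biases.

```python
from torch.utils.tensorboard import SummaryWriter

from tianshou.utils import WandbLogger

# "tianshou-demo" and "dqn-run" are hypothetical names for this sketch.
logger = WandbLogger(project="tianshou-demo", name="dqn-run")
writer = SummaryWriter("log/dqn")
logger.load(writer)  # attach the SummaryWriter as the logging core;
                     # wandb mirrors it via sync_tensorboard=True

# The logger is then passed to a trainer as usual, e.g.:
# result = offpolicy_trainer(..., logger=logger)
```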

New Features

1. Implement Generative Adversarial Imitation Learning (GAIL) and add MuJoCo examples (550)
2. Add trainers as generators: OnpolicyTrainer, OffpolicyTrainer, and OfflineTrainer; remove duplicated code and merge it into the base trainer (559); see the sketch below
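
A sketch of the generator-style API, modeled on the upstream CartPole DQN example (hyperparameters are illustrative): the trainer takes the same keyword arguments as the functional `offpolicy_trainer(...)`, but iterating it yields per-epoch statistics, so custom logic (checkpointing, curricula, early stopping) can run between epochs.

```python
import gym
import torch

from tianshou.data import Collector, VectorReplayBuffer
from tianshou.env import DummyVectorEnv
from tianshou.policy import DQNPolicy
from tianshou.trainer import OffpolicyTrainer
from tianshou.utils.net.common import Net

env = gym.make("CartPole-v1")
train_envs = DummyVectorEnv([lambda: gym.make("CartPole-v1") for _ in range(8)])
test_envs = DummyVectorEnv([lambda: gym.make("CartPole-v1") for _ in range(8)])

net = Net(env.observation_space.shape, env.action_space.n, hidden_sizes=[64, 64])
optim = torch.optim.Adam(net.parameters(), lr=1e-3)
policy = DQNPolicy(net, optim, discount_factor=0.99,
                   estimation_step=3, target_update_freq=320)

train_collector = Collector(policy, train_envs,
                            VectorReplayBuffer(20000, 8), exploration_noise=True)
test_collector = Collector(policy, test_envs, exploration_noise=True)
train_collector.collect(n_step=5000, random=True)  # warm up the buffer

trainer = OffpolicyTrainer(
    policy, train_collector, test_collector,
    max_epoch=5, step_per_epoch=10000, step_per_collect=10,
    episode_per_test=10, batch_size=64, update_per_step=0.1,
    train_fn=lambda epoch, env_step: policy.set_eps(0.1),
    test_fn=lambda epoch, env_step: policy.set_eps(0.05),
)
for epoch, epoch_stat, info in trainer:  # one epoch per iteration
    print(f"epoch {epoch}: best_reward={info['best_reward']:.2f}")
```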

Enhancement

1. Add imitation baselines for offline RL (566)

0.4.6.post1

This release fixes the conda package publishing, supports more gym versions instead of only the newest one, and keeps internal API compatibility. See 536.

0.4.6

Bug Fix

1. Fix casts to int performed by to_torch_as(...) calls in policies when using discrete actions (521)

API Change

1. Rename the venv worker internal API: send_action -> send, get_result -> recv (to align with EnvPool) (517)

New Features

1. Add Intrinsic Curiosity Module (503)
2. Implement CQLPolicy and offline_cql example (506)
3. Add PettingZoo environment support (494)
4. Enable venvs.reset() concurrent execution (517); see the sketch after this list
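
A quick sketch of the concurrent reset (the env choice is arbitrary): with subprocess workers, reset requests are now dispatched to all workers before any result is gathered, so per-env resets overlap instead of running one after another.

```python
import gym

from tianshou.env import SubprocVectorEnv

envs = SubprocVectorEnv([lambda: gym.make("CartPole-v1") for _ in range(4)])
obs = envs.reset()             # all 4 resets now execute concurrently
obs02 = envs.reset(id=[0, 2])  # a subset of workers can be reset, too
envs.close()
```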

Enhancement

1. Remove reset_buffer() from the reset method (501)
2. Add Atari PPO example (523, 529)
3. Add VizDoom PPO example and results (533)
4. Upgrade gym version to >=0.21 (534)
5. Switch Atari examples to use EnvPool by default (534); see the sketch after this list
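
Because EnvPool's vectorized environments follow the interface Tianshou's collectors expect, the switch is close to a drop-in replacement. A sketch, assuming the envpool package is installed and `policy` has been built as in the Atari examples:

```python
import envpool

from tianshou.data import Collector, VectorReplayBuffer

# EnvPool creates and steps all sub-envs in C++, replacing
# SubprocVectorEnv plus hand-written Atari wrappers in the examples.
envs = envpool.make_gym("Pong-v5", num_envs=8)
buffer = VectorReplayBuffer(100_000, 8)
collector = Collector(policy, envs, buffer, exploration_noise=True)
collector.collect(n_step=1000)
```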

Documentation

1. Update the DQN tutorial and add EnvPool to the docs (526)

0.4.5

Bug Fix

1. Fix tqdm issue (481)
2. Fix Atari wrapper to be deterministic (467)
3. Add `writer.flush()` in TensorboardLogger to ensure real-time logging results (485)

Enhancement

1. Implement set_env_attr and get_env_attr for vector environments (478); see the sketch after this list
2. Implement BCQPolicy and offline_bcq example (480)
3. Enable `test_collector=None` in the 3 trainers to turn off testing during training (485)
4. Fix an inconsistency in the implementation of Discrete CRR: it now uses the `Critic` class for its critic, following the conventions of other actor-critic policies (485)
5. Update several offline policies to use the `ActorCritic` class for their optimizers, eliminating randomness caused by parameter sharing between actor and critic (485)
6. Move Atari offline RL examples to `examples/offline` and tests to `test/offline` (485)
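
A sketch of the new accessors (the attribute name `my_flag` is made up; `id` selects a subset of sub-envs, following the usual vector-env convention):

```python
import gym

from tianshou.env import DummyVectorEnv

envs = DummyVectorEnv([lambda: gym.make("CartPole-v1") for _ in range(4)])

# Read an attribute from every wrapped env (one entry per sub-env)...
print(envs.get_env_attr("spec"))
# ...or set one, optionally on selected sub-envs only.
envs.set_env_attr("my_flag", True)
print(envs.get_env_attr("my_flag", id=[0, 2]))
envs.close()
```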

0.4.4

API Change

1. Add a new class DataParallelNet for multi-GPU training (461)
2. Add ActorCritic for deterministic parameter grouping in shared-head actor-critic networks (458); see the sketch after this list
3. collector.collect() now returns 4 extra keys: rew/rew_std/len/len_std (previously this work was done in the logger) (459)
4. Rename WandBLogger -> WandbLogger (441)
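
A sketch of the grouping change (observation/action shapes are placeholders for a 4-dim observation, 2-action task): when actor and critic share a preprocessing head, `ActorCritic` exposes their parameters in a fixed, de-duplicated order, so the optimizer sees each shared tensor exactly once and in the same order on every run.

```python
import torch

from tianshou.utils.net.common import ActorCritic, Net
from tianshou.utils.net.discrete import Actor, Critic

# Shared feature extractor ("head") used by both actor and critic.
net = Net(state_shape=(4,), hidden_sizes=[64, 64])
actor = Actor(net, action_shape=2)
critic = Critic(net)

# Without deterministic grouping, set-based de-duplication of the shared
# parameters could order them differently between runs, which changes
# optimizer state and hurts reproducibility.
optim = torch.optim.Adam(ActorCritic(actor, critic).parameters(), lr=3e-4)
```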

Bug Fix

1. fix logging in atari examples (444)

Enhancement

1. save_fn() is now called at the beginning of the trainer (459)
2. Create a new documentation page for the logger (463)
3. Add save_data and restore_data to the wandb logger, allow more input arguments for wandb init, and integrate wandb into test/modelbase/test_psrl.py and examples/atari/atari_dqn.py (441)

0.4.3

Bug Fix

1. Fix an A2C/PPO optimizer bug when sharing the network head (428)
2. Fix the PPO dual-clip implementation (435); see the sketch below
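
For reference, dual-clip PPO adds a second constant c > 1 that bounds the clipped surrogate from below when the advantage is negative; the extra clip must be applied only on those samples. A minimal sketch of the corrected loss (names are illustrative, not Tianshou's internal code):

```python
import torch

def dual_clip_ppo_loss(ratio, adv, eps=0.2, dual_clip=3.0):
    """Per-sample dual-clip PPO policy loss (to be minimized).

    ratio: pi_new(a|s) / pi_old(a|s); adv: advantage estimates.
    For adv >= 0 this reduces to standard PPO clipping; for adv < 0 the
    surrogate is additionally bounded below by dual_clip * adv.
    """
    surr1 = ratio * adv
    surr2 = ratio.clamp(1 - eps, 1 + eps) * adv
    clipped = torch.min(surr1, surr2)
    # Apply the second clip only where the advantage is negative.
    dual = torch.max(clipped, dual_clip * adv)
    return -torch.where(adv < 0, dual, clipped).mean()

# e.g. loss = dual_clip_ppo_loss(torch.rand(8) + 0.5, torch.randn(8))
```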

Enhancement

1. Add Rainbow (386)
2. Add WandbLogger (427)
3. Add env_id to preprocess_fn (391)
4. Update README: add a new chart and a BibTeX entry (406)
5. Add a Makefile; `make commit-checks` now performs almost all checks automatically (432)
6. Add isort and yapf and apply them to the existing codebase (432)
7. Add a spelling check via `make spelling` (432)
8. Update contributing.rst (432)

