Tianshou

Latest version: v1.0.0


0.2.4

Algorithm Implementation
1. n-step returns for all Q-learning based algorithms (51)
2. Auto alpha tuning in SAC (80)
3. Reserve `policy._state` to support saving hidden states in the replay buffer (19)
4. Add a `sample_avail` argument in ReplayBuffer to sample only available indices in RNN training mode (19)
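The n-step target replaces the one-step TD target with n accumulated rewards plus a discounted bootstrap tail. A minimal NumPy sketch of that computation (function name and arguments are illustrative, not tianshou's actual API):

```python
import numpy as np

def n_step_targets(rewards, dones, next_q, gamma=0.99, n=3):
    """n-step TD targets for a single trajectory.

    rewards[t], dones[t] : reward / terminal flag of step t
    next_q[t]            : bootstrap value of the state after step t,
                           e.g. max_a Q(s_{t+1}, a) in DQN
    """
    T = len(rewards)
    targets = np.zeros(T)
    for t in range(T):
        g, discount, terminated = 0.0, 1.0, False
        last = t
        for k in range(t, min(t + n, T)):
            g += discount * rewards[k]
            last = k
            if dones[k]:
                terminated = True
                break
            discount *= gamma
        if not terminated:
            # bootstrap with gamma^n * next_q after the n-th accumulated reward
            g += discount * next_q[last]
        targets[t] = g
    return targets
```

With n=1 this reduces to the usual one-step target r_t + gamma * max_a Q(s_{t+1}, a).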

New Feature
1. Batch.cat (87), Batch.stack (93), Batch.empty (106, 110)
2. Advanced slicing method of Batch (106)
3. `Batch(kwargs, copy=True)` will perform a deep copy (110)
4. Add `random=True` argument in collector.collect to perform sampling with random policy (78)
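As a rough illustration of the cat/stack semantics (hypothetical stand-ins, not tianshou's implementation, treating a batch as a dict of NumPy arrays): `cat` joins batches along the existing first axis, while `stack` adds a new leading axis, analogous to `np.concatenate` vs `np.stack` applied key-wise:

```python
import numpy as np

def batch_cat(batches):
    # join along the existing first axis, like np.concatenate
    return {k: np.concatenate([b[k] for b in batches]) for k in batches[0]}

def batch_stack(batches):
    # add a new leading axis, like np.stack (shapes must match)
    return {k: np.stack([b[k] for b in batches]) for k in batches[0]}

a = {"obs": np.zeros((2, 4)), "rew": np.zeros(2)}
b = {"obs": np.ones((3, 4)), "rew": np.ones(3)}
cat_shape = batch_cat([a, b])["obs"].shape      # (5, 4)
stack_shape = batch_stack([a, a])["obs"].shape  # (2, 2, 4)
```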

API Change
1. `Batch.append` -> `Batch.cat`
2. Move the Atari wrapper to examples, since it is not a key feature of tianshou (124)
3. Add some pre-defined networks in `tianshou.utils.net`. Since these only define an API instead of a class, they are not presented in `tianshou.net`. (123)

Docs
Add cheatsheet: https://tianshou.readthedocs.io/en/latest/tutorials/cheatsheet.html

0.2.3

Enhancement
1. Multimodal observations (observations of any type are supported) (38, 69)
2. Batch over Batch
3. preprocess_fn (42)
4. Type annotation
5. batch.to_torch, batch.to_numpy
6. pickle support for batch

Fixed Bugs
1. Diagonal Gaussian in SAC/PPO
2. Orthogonal initialization in PPO
3. Zero eps in DQN
4. Type inference in the replay buffer

0.2.2

Algorithm Implementation

1. Generalized Advantage Estimation (GAE);
2. Update PPO algorithm with arXiv:1811.02553 and arXiv:1912.09729;
3. Vanilla Imitation Learning (BC & DA, with continuous/discrete action space);
4. Prioritized DQN;
5. RNN-style policy network;
6. Fix SAC with torch==1.5.0
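GAE estimates the advantage as an exponentially weighted sum of TD residuals, A_t = sum_l (gamma*lambda)^l * delta_{t+l} with delta_t = r_t + gamma*V(s_{t+1}) - V(s_t). A minimal NumPy sketch computed by a backward pass (names are illustrative, not tianshou's API):

```python
import numpy as np

def gae(rewards, values, next_values, dones, gamma=0.99, gae_lambda=0.95):
    """Generalized Advantage Estimation over one trajectory.

    values[t]      : V(s_t)
    next_values[t] : V(s_{t+1}); ignored when dones[t] is True
    """
    T = len(rewards)
    advantages = np.zeros(T)
    gae_acc = 0.0
    for t in reversed(range(T)):
        nonterminal = 1.0 - float(dones[t])
        # TD residual: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * next_values[t] * nonterminal - values[t]
        # recursion: A_t = delta_t + gamma * lambda * A_{t+1}
        gae_acc = delta + gamma * gae_lambda * nonterminal * gae_acc
        advantages[t] = gae_acc
    return advantages
```

Setting gae_lambda=0 recovers the one-step TD residual; gae_lambda=1 recovers the Monte Carlo advantage.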

API Change
1. Change `__call__` to `forward` in policy;
2. Add `save_fn` in trainer;
3. Add `__repr__` in tianshou.data, e.g. `print(buffer)`

0.2.1

First version with full documentation.
Support algorithms: DQN/VPG/A2C/DDPG/PPO/TD3/SAC
