Tianshou

Latest version: v1.0.0

0.3.0

Because the code has changed substantially since v0.2.0, releases are numbered 0.3 from this point on.

API Change
1. add policy.updating and clarify collecting state and updating state in training (224)
2. change `train_fn(epoch)` to `train_fn(epoch, env_step)` and `test_fn(epoch)` to `test_fn(epoch, env_step)` (229)
3. remove outdated APIs: collector.sample, collector.render, collector.seed, VectorEnv (210)
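
The new `train_fn(epoch, env_step)` signature lets a schedule depend on the number of environment steps rather than on the epoch alone. Below is a minimal sketch of such a callback. The helper `make_train_fn`, its parameters, and the `state` dict are hypothetical names used only for illustration; they are not part of Tianshou.

```python
def make_train_fn(eps_start=1.0, eps_end=0.05, decay_steps=10_000):
    """Build a train_fn(epoch, env_step) that linearly decays an epsilon value.

    Hypothetical helper: returns the callback and the dict it writes into.
    """
    state = {"eps": eps_start}

    def train_fn(epoch, env_step):
        # Drive the schedule by env_step; epoch is accepted but unused here.
        frac = min(env_step / decay_steps, 1.0)
        state["eps"] = eps_start + frac * (eps_end - eps_start)
        # With a real policy, this is where the new value would be applied
        # (assumption: the policy exposes some setter for epsilon).

    return train_fn, state


train_fn, state = make_train_fn()
train_fn(epoch=1, env_step=5_000)
print(state["eps"])
```

A callback written for the old `train_fn(epoch)` signature needs the extra `env_step` parameter added, even if it ignores it.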

Bug Fix
1. fix a bug in DDQN: target_q could not be sampled from np.random.rand (224)
2. fix a bug in DQN atari net: it should add a ReLU before the last layer (224)
3. fix a bug in collector timing (224)
4. fix a bug in the converter of Batch: deepcopy a Batch in to_numpy and to_torch (213)
5. ensure buffer.rew has a type of float (229)

Enhancement
1. Anaconda support: `conda install -c conda-forge tianshou` (228)
2. add PSRL (202)
3. add SAC discrete (216)
4. add type check in unit test (200)
5. format code and update function signatures (213)
6. add pydocstyle and doc8 check (210)
7. several documentation fixes (210)

0.3.0rc0

This is a pre-release for testing anaconda.

0.2.7

API Change

1. collect exactly n_episode episodes when n_episode is given as a list of limits, and save fake data in cache_buffer when self.buffer is None (184)
2. add `save_only_last_obs` for replay buffer to save memory (184)
3. remove default value in batch.split() and add merge_last argument (185)
4. fix tensorboard logging: the horizontal axis now stands for env step instead of gradient step; add test results to tensorboard (189)
5. add max_batchsize in onpolicy algorithms (189)
6. keep only sumtree in segment tree implementation (193)
7. add `__contains__` and `pop` in batch: `key in batch`, `batch.pop(key, deft)` (189)
8. remove dict return support for collector preprocess_fn (189)
9. remove `**kwargs` in ReplayBuffer (189)
10. add no_grad argument in collector.collect (204)
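
The `merge_last` argument added to `batch.split()` is easiest to read with a small index example. The sketch below assumes that `merge_last=True` folds a short final chunk into the previous one; `split_ranges` is a hypothetical stand-alone helper, not Tianshou's implementation.

```python
def split_ranges(length, size, merge_last=False):
    """Yield (start, stop) ranges covering `length` items in chunks of `size`.

    Hypothetical illustration of the assumed merge_last semantics.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    starts = list(range(0, length, size))
    # If the final chunk is short and merging is requested, drop its start
    # so that the previous chunk absorbs the remainder.
    if merge_last and len(starts) > 1 and length % size != 0:
        starts.pop()
    for i, start in enumerate(starts):
        is_last = i == len(starts) - 1
        yield (start, length if is_last else start + size)


print(list(split_ranges(10, 4)))
print(list(split_ranges(10, 4, merge_last=True)))
```

Under this reading, merging avoids a very small final minibatch at the cost of one chunk that is larger than `size`.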

Enhancement

1. add DQN Atari examples (187)
2. change the type-checking order in batch.py and converter.py to check the most common case first (189)
3. Numba acceleration for GAE, nstep, and segment tree (193)
4. add policy.eval() in all test scripts' "watch performance" (189)
5. add test_returns (both GAE and nstep) (189)
6. improve the code-coverage (from 90% to 95%) and remove the dead code (189)
7. polish examples/box2d/bipedal_hardcore_sac.py (207)
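
For readers unfamiliar with GAE, which the entries above accelerate and test, here is a plain-Python sketch of the standard backward recursion. The function name and argument layout are assumptions made for this example; Tianshou's own version is Numba-accelerated and may be organized differently.

```python
def gae_advantages(rewards, values, next_values, dones, gamma=0.99, lam=0.95):
    """Compute Generalized Advantage Estimation over one trajectory of lists."""
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # A terminal step cuts off both the bootstrap value and the recursion.
        not_done = 0.0 if dones[t] else 1.0
        delta = rewards[t] + gamma * next_values[t] * not_done - values[t]
        running = delta + gamma * lam * not_done * running
        advantages[t] = running
    return advantages


adv = gae_advantages(
    rewards=[1.0, 1.0, 1.0],
    values=[0.0, 0.0, 0.0],
    next_values=[0.0, 0.0, 0.0],
    dones=[False, False, True],
    gamma=1.0,
    lam=1.0,
)
print(adv)  # [3.0, 2.0, 1.0]
```

With `gamma=1` and `lam=1` and zero value estimates, the advantages reduce to the undiscounted reward-to-go, which makes the recursion easy to check by hand.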

Bug fix

1. fix a bug in MAPolicy: `buffer.rew = Batch()` doesn't change `buffer.rew` (thanks mypy) (207)
2. ~~set policy.eval() before collector.collect (204)~~ This is a bug
3. fix shape inconsistency for torch.Tensor in replay buffer (189)
4. potential bugfix for subproc.wait (189)
5. fix RecurrentActorProb (189)
6. fix some incorrect type annotation (189)
7. fix a bug in tictactoe set_eps (193)
8. dirty fix for asyncVenv check_id test

0.2.6

API Change
1. Replay buffer allows stack_num = 1 (165)
2. add policy.update to enable post process and remove collector.sample (180)
3. Remove `collector.close` and rename `VectorEnv` to `DummyVectorEnv` (179)

Enhancement
1. Enable async simulation for all vector envs (179)
2. Improve PER (159): use segment tree and enable all Q-learning algorithms to use PER
3. unify single-env and multi-env in collector (157)
4. Pickle compatible for replay buffer and improve buffer.get (182): fix 84 and make buffer more efficient
5. Add ShmemVectorEnv implementation (174)
6. Add Dueling DQN implementation (170)
7. Add profile workflow (143)
8. Add BipedalWalkerHardcore-v3 SAC example (177) (well trained after about 1 hour)
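
To illustrate why a segment tree helps prioritized experience replay (PER), the following is a minimal sum-tree sketch: priorities can be updated and sampled proportionally in O(log n). The class `SumTree` and its method names are hypothetical and written for this note; they do not reproduce Tianshou's segment tree.

```python
import random


class SumTree:
    """Array-backed sum tree over a fixed number of priorities."""

    def __init__(self, capacity):
        # Round up to a power of two so the tree is a perfect binary tree.
        self.size = 1
        while self.size < capacity:
            self.size *= 2
        self.tree = [0.0] * (2 * self.size)

    def update(self, index, priority):
        # Write the leaf, then recompute every ancestor sum up to the root.
        pos = index + self.size
        self.tree[pos] = priority
        pos //= 2
        while pos >= 1:
            self.tree[pos] = self.tree[2 * pos] + self.tree[2 * pos + 1]
            pos //= 2

    def total(self):
        return self.tree[1]

    def find_prefix(self, value):
        """Return the first index whose cumulative priority exceeds value."""
        pos = 1
        while pos < self.size:
            left = 2 * pos
            if value < self.tree[left]:
                pos = left
            else:
                value -= self.tree[left]
                pos = left + 1
        return pos - self.size

    def sample(self, rng=random):
        # Draw an index with probability proportional to its priority.
        return self.find_prefix(rng.random() * self.total())


tree = SumTree(4)
for i, p in enumerate([1.0, 2.0, 3.0, 4.0]):
    tree.update(i, p)
print(tree.total(), tree.find_prefix(3.0))  # 10.0 2
```

A replay buffer would store one priority per transition in the leaves and call `sample` to pick indices for a minibatch.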

Bug fix
1. fix 162 of multi-dim action (160)

> Note: 0.3 is coming soon!

0.2.5

New feature

Multi-agent Reinforcement Learning: https://tianshou.readthedocs.io/en/latest/tutorials/tictactoe.html (#122)

Documentation

Add a tutorial on the Batch class to standardize the behavior of Batch: https://tianshou.readthedocs.io/en/latest/tutorials/batch.html (#142)

Bugfix

- Fix inconsistent shape in A2CPolicy and PPOPolicy. Please be careful when dealing with log_prob (155)
- Fix list of tensors inside Batch, e.g., `Batch(a=[np.zeros(3), torch.zeros(3)])` (147)
- Fix buffer update when `stack_num > 0` (154)
- Remove useless `kwargs`

0.2.4.post1

Several bug fixes and enhancements:
- remove deprecated API `append` (126)
- `Batch.cat_` and `Batch.stack_` now work well with inconsistent keys (130)
- Batch.is_empty now correctly recognizes empty over empty Batch (128)
- reconstruct collector: remove multiple buffer case, change the internal data to `Batch`, and add reward_metric for MARL usage (125)
- add `Batch.update` to mimic `dict.update` (128)
