Because the code has changed substantially since v0.2.0, releases are numbered 0.3 from this version on.
API Change
1. add policy.updating and clarify collecting state and updating state in training (224)
2. change `train_fn(epoch)` to `train_fn(epoch, env_step)` and `test_fn(epoch)` to `test_fn(epoch, env_step)` (229)
3. remove outdated APIs: collector.sample, collector.render, collector.seed, VectorEnv (210)
Bug Fix
1. fix a bug in DDQN: target_q could not be sampled from np.random.rand (224)
2. fix a bug in DQN atari net: it should add a ReLU before the last layer (224)
3. fix a bug in collector timing (224)
4. fix a bug in the converter of Batch: deepcopy a Batch in to_numpy and to_torch (213)
5. ensure buffer.rew has a type of float (229)
Enhancement
1. Anaconda support: `conda install -c conda-forge tianshou` (228)
2. add PSRL (202)
3. add SAC discrete (216)
4. add type check in unit test (200)
5. format code and update function signatures (213)
6. add pydocstyle and doc8 check (210)
7. several documentation fixes (210)