Mlagents

Latest version: v1.0.0


0.4.0

Environments

To learn more about new and improved environments, see our [Example Environments page](../master/docs/Learning-Environment-Examples.md).

New

* **Walker** - A humanoid, physics-based agent. The agent must move its body toward the goal direction as quickly as possible without falling.

* **Pyramids** - A sparse-reward environment. The agent must press a button and then topple a pyramid of blocks to reach the golden brick at the top. Used to demonstrate curiosity-driven exploration.

Improved

* Revamped the Crawler environment.

* Added visual observation-based scenes for:
  * BananaCollector
  * PushBlock
  * Hallway
  * Pyramids

* Added Imitation Learning-based scenes for:
  * Tennis
  * Bouncer
  * PushBlock
  * Hallway
  * Pyramids

New Features

* **[Unity]** In-Editor Training - It is now possible to train agents directly in the Editor without building the scene. For more information, see [here](../master/docs/Basic-Guide.md#training-the-brain-with-reinforcement-learning); a short connection sketch follows this list.

* **[Training]** Curiosity-Driven Exploration - Adds a curiosity-based intrinsic reward signal when training with PPO. Enable it by setting the `use_curiosity` brain training hyperparameter to `true` (a configuration sketch follows this list).

* **[Unity]** Support for providing player input using axes within the Player Brain.

* **[Unity]** TensorFlowSharp Plugin has been upgraded to version 1.7.1.
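
In-Editor Training can be driven from the Python side. Below is a minimal sketch, assuming the 0.4-era `unityagents` Python API, where passing `file_name=None` makes the environment wait for the Editor's Play button instead of launching a built executable:

```python
# Hedged sketch: connect the trainer to a running Unity Editor instance.
# Assumes the 0.4-era `unityagents` package; file_name=None means "do not
# launch a build, wait for the Editor to press Play".
from unityagents import UnityEnvironment

env = UnityEnvironment(file_name=None)  # press Play in the Editor when prompted
brain_name = env.brain_names[0]         # first external brain in the scene
brain_info = env.reset(train_mode=True)[brain_name]
print("Agents in scene:", len(brain_info.agents))
env.close()
```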
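For Curiosity-Driven Exploration, the `use_curiosity` flag lives in the per-brain trainer configuration. Here is a hedged sketch that writes such an entry with PyYAML; the brain name is hypothetical, and `use_curiosity` is the only key these notes confirm:

```python
# Hedged sketch: write a trainer_config.yaml entry that enables curiosity.
# "PyramidsBrain" is a hypothetical brain name; only `use_curiosity` is
# confirmed by this release's notes.
import yaml  # pip install pyyaml

config = {
    "PyramidsBrain": {
        "use_curiosity": True,  # enable the intrinsic curiosity reward under PPO
    }
}

with open("trainer_config.yaml", "w") as f:
    yaml.safe_dump(config, f, default_flow_style=False)
# Resulting YAML:
# PyramidsBrain:
#   use_curiosity: true
```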

Changes
* Main ML-Agents code now within `MLAgents` namespace. Ensure that the `MLAgents` namespace is added to necessary project scripts such as Agent classes.
* ASCII art added to `learn.py` script.
* Communication now uses gRPC and Protobuf. JSON libraries removed.
* TensorBoard now reports mean absolute loss, as opposed to total loss over the update loop.
* The PPO algorithm now uses a wider Gaussian output for continuous control models, improving performance.

Documentation
* Added Quick Start and FAQ sections to the documentation.
* Added documentation explaining how to use ML-Agents on Microsoft Azure.
* Added benchmark reward thresholds for example environments.

Fixes & Performance Improvements
* Episode length is now properly reported in TensorBoard for the first episode.
* Behavioral Cloning now works with LSTM models.

Known Issues
* Curiosity-driven exploration does not function with On-Demand Decision Making. Expect a fix in v0.4.0a.

Acknowledgements

Thanks to everyone at Unity who contributed to v0.4, as well as: sterlingcrispin, ChrisRisner, akmadian, animaleja32, LeighS, and 5665tm.

0.4.0preview

0.4.0b

Fixes & Performance Improvements
* Corrects the observation space description for the PushBlock environment.
* Fixes a bug that prevented using environments with Python multiprocessing.
* Fixes a bug that prevented agents from being initialized without a brain.

0.4.0a

Environments
* Changes to example environments for visual consistency.

Documentation
* Adjustments to Windows installation documentation.
* Updates documentation to refer to project as a toolkit.

Changes
* New Amazon Web Services (AWS) AMI.
* Uses `swish` as the activation function for continuous control models (see the sketch after this list).
* Corrected version number in `setup.py`.
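
For reference, `swish` is defined as x · sigmoid(x) (Ramachandran et al., 2017); a minimal NumPy sketch:

```python
import numpy as np

def swish(x):
    """Swish activation: x * sigmoid(x)."""
    return x * (1.0 / (1.0 + np.exp(-x)))

print(swish(np.array([-1.0, 0.0, 1.0])))  # ≈ [-0.269, 0.0, 0.731]
```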

Fixes & Performance Improvements
* Fixes a memory leak when using visual observations.
* Fixes the use of behavioral cloning with visual observations.
* Fixes the use of curiosity-driven exploration with on-demand decision making.
* Optimizes visual observations when using the internal brain.

Acknowledgements
Thanks to everyone at Unity who contributed to v0.4.0a, as well as: tcmxx.

0.3.1

Features

* We have upgraded our Docker container, which now supports Brains that contain camera-based Visual Observations.

Documentation

* We have added a partial Chinese translation of our documentation. It is available [here](../master/docs/localized/zh-CN).

Fixes & Performance Improvements

* Fixed a missing component reference in the BananaRL environment.
* Fixed the neural network for multiple visual observations, which was not properly generated.
* Fixed episode time-out value-estimate bootstrapping, which used an incorrect observation as input.

Acknowledgements

Thanks to everyone at Unity who contributed to v0.3.1, as well as to the following community contributors:

sterlingcrispin, andersonaddo, palomagr, imankgoyal, luchris429.

0.3.1preview
