Mlagents

Latest version: v1.0.0


0.3.1b

Fixes

* Behavioral cloning fix (use stored info rather than previous info)
* Value bootstrapping fixed for PPO

0.3.1a

Fixes

* Removed links to out-of-date Unity packages
* Fixed the CoreInternalBrain for discrete vector observations
* Retrained the Basic environment
* Fixed the normalization of images in the internal brain

0.3.0

Environments

To learn more about new and improved environments, see our [Example Environments page](../master/docs/Learning-Environment-Examples.md).

New

* **Soccer Twos** - Multi-agent environment in which competitive and cooperative behavior arises from the reward functions. Used to demonstrate multi-brain training.

* **Banana Collectors** - Multi-agent resource collection environment where competitive or cooperative behavior comes about dynamically based on available resources. Used to demonstrate Imitation Learning.

* **Hallway** - Single agent environment in which an agent must explore a room, remember the object within the room, and use that information to navigate to the correct goal. Used to demonstrate LSTM models.

* **Bouncer** - Single agent environment provided as an example of our new On-Demand Decision-Making feature. In this environment, an agent can apply force to itself in order to bounce around a platform, and attempt to collide with floating bananas.

Improved

* All environments have been visually refreshed with a consistent color palette and design language.
* Revamped GridWorld to only use visual observations and a 5x5 grid by default.
* Revamped Tennis to use continuous actions.
* Revamped Push Block to use local perception.
* Revamped Wall Jump to use local perception.
* Added Hard version of 3DBall which doesn’t contain velocity information in observations.

New Features

* **[Unity]** On-Demand Decision Making - It is now possible to have agents request decisions from their brains only when necessary, using `RequestDecision()` and `RequestAction()`. For more information, see [here](../master/docs/Learning-Environment-Design-Agents.md#on-demand-decision-making).
* **[Unity]** Added vector-observation stacking - The past n vector observations for each agent can now be stored and used as input to a Brain for decision making.
* **[Python]** Added Behavioral Cloning (Imitation Learning) algorithm - Train a neural network to imitate either player behavior or a hand-coded game bot using behavioral cloning. For more info, see [here](../master/docs/Training-Imitation-Learning.md).
* **[Python]** Support for training multiple brains simultaneously - Two or more different brains can now be trained simultaneously using the provided PPO algorithm.
* **[Python]** Added LSTM models - We now support training and embedding recurrent neural networks using the PPO algorithm. This allows for learning temporal dependencies between observations.
* **[Unity] [Python]** Added Docker Image for RL-training - We now provide a Docker image which allows users to train their brains in an isolated environment without the need to install Python, TensorFlow, and other dependencies. For more information, see [here](../master/docs/Using-Docker.md).
* **[Python]** Ability to provide a random seed to the training process and environment - Allows for reproducible experimentation (a minimal API sketch follows this list). For more information, see [here](../master/docs/Python-API.md#loading-a-unity-environment). (Note: Unity Physics is non-deterministic, so fully reproducible experiments are currently not possible when physics-based interactions are involved.)
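
As a rough illustration of the seeding and multi-brain changes above, the sketch below drives an environment directly from Python. It assumes the v0.3-era `unityagents` package, a `UnityEnvironment` constructor that accepts a `seed` argument, and a `BrainInfo.agents` field; the executable name is a placeholder, so check the Python API docs for the exact interface.

```python
from unityagents import UnityEnvironment  # v0.3-era package name (an assumption)

# Passing `seed` makes the run reproducible, up to the Unity Physics
# non-determinism noted above. "3DBall" is a placeholder executable name.
env = UnityEnvironment(file_name="3DBall", worker_id=0, seed=1)

# reset() returns one BrainInfo per brain, which is what lets two or more
# brains be stepped, and trained, within the same simulation.
brain_infos = env.reset(train_mode=True)
for brain_name in env.brain_names:
    print(brain_name, "controls", len(brain_infos[brain_name].agents), "agents")

env.close()
```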

Changes

* **[Unity]** Memory size has been removed as a user-facing brain parameter. It is now defined when creating models from `unitytrainers`.
* **[Unity] [Python]** The API as well as the general semantics used throughout ML-Agents has changed. See [here](../master/docs/Migrating-v0.3.md) for information on these changes, and how to easily adjust current projects to be compatible with these changes.
* **[Python]** Training hyperparameters are now defined in a `.yaml` file instead of via command-line arguments (an illustrative config sketch follows this list).
* **[Python]** Training now takes place via `learn.py`, which launches trainers for multiple brains.
* **[Python]** Python 2 is no longer supported.
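
As a rough illustration of the new configuration style, a hypothetical trainer config might look like the sketch below. The key names are drawn from the PPO hyperparameters documented for this release, but the brain name and values are placeholders and the exact schema may differ between versions.

```yaml
# Hypothetical trainer_config.yaml sketch. Key names follow the documented
# PPO hyperparameters; exact names and defaults may differ per version.
default:
    trainer: ppo
    batch_size: 1024
    buffer_size: 10240
    learning_rate: 3.0e-4
    gamma: 0.99
    max_steps: 5.0e4
    use_recurrent: false

BallBrain:            # per-brain overrides, keyed by Brain name (hypothetical)
    normalize: true
    time_horizon: 1000
```

Training is then launched through `learn.py` (the docs describe a `--train` flag for training mode) rather than by passing each hyperparameter on the command line.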

Documentation

Documentation has been significantly re-written to include many new sections, in addition to updated tutorials and guides. Check it out [here](../master/docs).

Fixes & Performance Improvements

* **[Unity]** Improved memory management - Reduced garbage collection memory usage by up to 5x when using External Brain.
* **[Unity]** `Time.captureFramerate` is now set by default to help sync Update and FixedUpdate frequencies.
* **[Unity]** It is now possible to instantiate and destroy GameObjects which are Agents.
* **[Unity]** Improved visual observation inference time by 3x.
* **[Unity]** Tooltips added to Unity Inspector for ML-Agents variables and functions.
* **[Unity] [Python]** Epsilon is now a built-in part of the PPO graph. It no longer needs to be specified separately under “Graph Placeholders” in Unity.
* **[Python]** Changed value bootstrapping in the PPO algorithm to properly calculate returns on episode time-out (see the sketch after this list).
* **[Python]** The neural network graph is now automatically saved as a `.bytes` file when training is interrupted.
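
To illustrate the value-bootstrapping change referenced above: when an episode ends because it hit its step limit rather than reaching a true terminal state, the return should be bootstrapped from the critic's value estimate of the last observed state instead of being truncated to zero. The following is a generic, self-contained sketch of that idea, not the trainer's actual implementation.

```python
import numpy as np

def discounted_returns(rewards, gamma, timed_out, last_value_estimate):
    """Compute discounted returns for one trajectory.

    If the episode ended on a time-out (max steps reached) rather than a
    true terminal state, bootstrap from the critic's value estimate of the
    last observed state instead of implicitly assuming a return of zero.
    """
    running = last_value_estimate if timed_out else 0.0
    returns = np.zeros(len(rewards))
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

# A 3-step episode cut off by max_steps, with the critic estimating
# V(s_T) = 0.5 for the final state:
print(discounted_returns([0.0, 0.0, 1.0], gamma=0.99, timed_out=True,
                         last_value_estimate=0.5))
```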

Acknowledgements

Thanks to everyone at Unity who contributed to v0.3, as well as:

asolano, LionH, MarcoMeter, srcnalt, wouterhardeman, 60days, floAr, Coac, Zamaroht, slightperturbation

0.2.1d

Fixes

* Fixes a bug where visual observations could not be used with PPO.

0.3.0preview

0.3.0b

Fixes

* Fixes internal brain for Banana Imitation.
* Fixes Discrete Control training for Imitation Learning.
* Fixes Visual Observations in internal brain with non-square inputs.

0.3.0a

Fixes

* Added the missing Ray Perception components to the agents in the BananaImitation scene.
