Trains-agent

Latest version: v0.16.3

0.16.3

Features and Bug Fixes
- Update PyJWT requirement (v2.0.0 breaks interface)
- Update other requirements constraints
- Change k8s pod naming scheme in k8s glue to include queue name, conform queue name to k8s standard
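
The pod naming change above implies mapping an arbitrary queue name onto a valid Kubernetes name. A minimal sketch of one way to do that, assuming RFC 1123 label rules (lowercase alphanumerics and `-`, at most 63 characters, alphanumeric at both ends). The helper name `conform_k8s_name`, the `trains-` prefix, and the task id are hypothetical and not taken from the k8s glue code:

```python
import re

def conform_k8s_name(queue_name: str, max_len: int = 63) -> str:
    """Map an arbitrary queue name onto an RFC 1123 style label (hypothetical helper)."""
    # Lowercase, then replace every run of disallowed characters with a single dash.
    name = re.sub(r"[^a-z0-9-]+", "-", queue_name.lower())
    # Truncate first, then strip dashes so the result starts and ends alphanumerically.
    return name[:max_len].strip("-")

# Hypothetical pod name built from a prefix, the conformed queue name, and a task id.
pod_name = "trains-{}-id-{}".format(conform_k8s_name("GPU Queue_1"), "abc123")
```

A queue name made up only of disallowed characters would conform to an empty string, so a real implementation would likely need a fallback for that case.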

0.16.2

Features
- conda
  - Add `agent.package_manager.conda_env_as_base_docker` allowing "docker_cmd" to contain link to a full pre-packaged conda environment (`tar.gz` created by `conda-pack`). Use `TRAINS_CONDA_ENV_PACKAGE` environment variable to specify `conda tar.gz` file.
  - Add conda support for read-only pre-built environment (pass conda folder as `docker_cmd` on Task)
  - Improve detection of the conda executable
- k8s glue
  - Add support for limited number of services exposing ports
  - Add support for k8s pod custom user properties
  - Allow selecting external `trains.conf` file for the pod itself
  - Allow providing pod template, extra bash init script, alternate SSH server port, gateway address (k8s ingress/ELB)
- Allow specifying `cudatoolkit` version in the "installed packages" section when using conda as package manager https://github.com/allegroai/trains/issues/229
- Add `agent.package_manager.force_repo_requirements_txt`. If True, "Installed Packages" on Task are ignored, and only repository `requirements.txt` is used
- Pass `TRAINS_DOCKER_IMAGE` into docker for interactive sessions
- Add `torchcsprng` and `torchtext` to PyTorch resolving

Bug Fixes
- Suppress "\r" in log output when reading the current chunk of a file/stream. Add `agent.suppress_carriage_return` (default True) to support previous behavior
- Make sure `TRAINS_AGENT_K8S_HOST_MOUNT` is used only once per mount
- Fix k8s glue script to trains-agent default docker script
- Fix applying a git diff from a submodule only
- conda
  - Fix conda pip freeze to be consistent with trains 0.16.3
  - Fix conda environment support for trains 0.16.3 full env. Add `agent.package_manager.conda_full_env_update` to allow conda to update back the requirements (default False, to preserve previous behavior)
  - Fix running from conda environment - `conda.sh` not found in first conda PATH match
- Fix docker mode ubuntu/debian support by making sure not to ask for input (fix `tzdata` install)
- Fix repository detection - ignore environment `SSH_AUTH_SOCK`, only check if git user/pass are configured
- git diff
  - Fix support for non-ascii diff
  - Fix diff ending with an empty line causing a corrupt diff apply message
  - Allow zero context diffs (useful when blind patching repository)
- Fix `daemon --stop` when agent UID cannot be located
- Fix nvidia docker support on some linux distros (SUSE)
- Fix nvidia pytorch dockers support
- Fix torch CUDA 11.1 support
- Fix requirements dict with a null `pip` entry: it should be treated as None, installing from the repository's `requirements.txt`
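
The carriage-return handling in the first fix of this list can be illustrated with a short sketch. It assumes the intended behavior is to keep only the text after the last carriage return on each line of a chunk, as a terminal would display it; this is not the agent's actual implementation, and `suppress_cr` is a hypothetical name:

```python
def suppress_cr(chunk: str) -> str:
    """Keep only the text following the last carriage return on each line."""
    cleaned_lines = []
    for line in chunk.split("\n"):
        # Drop trailing carriage returns first so CRLF line endings keep their text.
        line = line.rstrip("\r")
        # rsplit with maxsplit=1 yields the text after the last carriage return,
        # or the whole line when there is none.
        cleaned_lines.append(line.rsplit("\r", 1)[-1])
    return "\n".join(cleaned_lines)

# A progress bar rewritten in place collapses to its final state.
cleaned = suppress_cr("10%\r50%\r100%\ndone\n")
```

This mainly matters for progress bars, which would otherwise add one log entry per in-place update.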

0.16.1

Features

- Add `sdk.metrics.plot_max_num_digits` configuration option to reduce plot storage size
- Add `agent.package_manager.post_packages` and `agent.package_manager.post_optional_packages` configuration options to control packages install order (e.g. horovod)
- Add `agent.git_host` configuration option for limiting git credential usage for a specific host (overridable using `TRAINS_AGENT_GIT_HOST` environment variable)
- Add `agent.force_git_ssh_port` configuration option to control `https` to `ssh` link conversion for non-standard `ssh` ports
- Add requirements detection features
  * Improve support for detecting new pip version (20+) supporting `package scheme://link`
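
To illustrate the `https` to `ssh` link conversion that `agent.force_git_ssh_port` controls, here is a rough sketch using only the standard library. The function `https_to_ssh`, the `git` user name, the example host, and the exact output form are assumptions for illustration; the agent's own conversion may differ:

```python
from typing import Optional
from urllib.parse import urlparse

def https_to_ssh(url: str, ssh_port: Optional[int] = None, user: str = "git") -> str:
    """Rewrite an http(s) git link as an ssh:// link, optionally with an explicit port."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return url  # leave non-http links untouched
    # hostname drops any "user:pass@" prefix and any http(s) port.
    netloc = "{}@{}".format(user, parsed.hostname)
    if ssh_port:
        netloc += ":{}".format(ssh_port)
    return "ssh://{}{}".format(netloc, parsed.path)

# Hypothetical server listening for ssh on port 2222 instead of 22.
link = https_to_ssh("https://git.example.com/team/repo.git", ssh_port=2222)
```

The `ssh://` form is used here because, unlike the shorter `user@host:path` form, it has a place for an explicit port.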

Bug Fixes

- Fix pre-installed packages being ignored when installing a git package wheel. Reinstalling a `git+http` link is enough to make sure all requirements are met/installed https://github.com/allegroai/trains/issues/196
- Fix incorrect check for spaces in current execution folder
- Fix requirements detection
  * Update torch version after using downloaded / system pre-installed version
  * Do not install git packages twice when a new pip version is used (pip freeze will detect the correct git link version)

0.16.0

Features

- Add `agent.docker_init_bash_script` configuration section to allow finer control over docker startup script
- Changed default docker image from `nvidia/cuda` to `nvidia/cuda:10.1-runtime-ubuntu18.04` to support `cudnn` frameworks (e.g. TF)
- Improve support for dockers with preinstalled `conda` environment
- Improve trains-agent-docker spinning
- Add `daemon --order-fairness` for round-robin queue pulling
- Add `daemon --stop` to terminate a running agent (assuming other arguments are the same)
  - If no additional arguments, Agents are terminated in lexicographical order
- Support cleanup of all log files on termination unless executed with `--debug`
- Add error message when Trains API Server is not accessible on startup
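
The difference between priority-ordered pulling and the round-robin pulling of `daemon --order-fairness` can be sketched as follows. This is a simplified model that assumes the default behavior always drains earlier queues first; the queue names and the `pull_order` and `fresh_queues` helpers are hypothetical, and in-memory deques stand in for server-side queues:

```python
from collections import deque
from typing import Dict, List

def pull_order(queues: Dict[str, deque], fairness: bool) -> List[str]:
    """Return the order in which tasks would be pulled (consumes the queues)."""
    pulled = []
    names = list(queues)
    while any(queues[name] for name in names):
        for name in names:
            if queues[name]:
                pulled.append(queues[name].popleft())
                if not fairness:
                    break  # strict priority: start again from the first queue
    return pulled

def fresh_queues() -> Dict[str, deque]:
    return {"high": deque(["h1", "h2"]), "low": deque(["l1", "l2"])}

priority_first = pull_order(fresh_queues(), fairness=False)  # h1, h2, l1, l2
round_robin = pull_order(fresh_queues(), fairness=True)      # h1, l1, h2, l2
```

With fairness enabled, a busy first queue can no longer starve the queues listed after it.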

Bug Fixes

- Fix GPU Windows monitoring support https://github.com/allegroai/trains/issues/177
- Fix `.git-credentials` and `.gitconfig` mapping into docker
- Fix non-root docker image usage
- Fix docker to use `UTF-8` encoding, so prints won't break it
- Fix `--debug` to set all loggers to `DEBUG`
- Fix task status changing to `queued` during Task runtime (this should never happen)
- Fix `requirement_parser` to support `package git+http` lines
- Fix GIT user/password in requirements and support for `-e git+http` lines
- Fix configuration wizard to generate `trains.conf` matching latest Trains definitions

0.15.1

Features

- Add Trains Agent Daemon and Services docker files

Bug Fixes

- Fix initialization wizard (allow at most two verification retries, then print error)
- Add warning on `--gpus` with no detected CUDA version (#24)
- Add `agent.force_git_ssh_protocol` configuration option to force all git links to `ssh://` (#16)
- Add git user/pass permission into pip package installation from Git repository (#22)

0.15.0

Features

- Add daemon Services Mode (`daemon --services-mode`) where the daemon spins a task in its own docker and verifies start-up and shut-down. This allows multiple tasks to be launched simultaneously on the same machine (currently in CPU mode only), where each task service will register itself as a worker for the lifetime of the task
- Enhance `build --docker` mode
  - Add `--install-globally` option to install required packages in the docker's system python
  - Add `--entry-point` option to allow automatic task cloning when running the docker
- Support PyTorch Nightly builds using the `agent.torch_nightly` configuration flag. If `true`, the agent looks for a nightly build when a stable torch wheel is not found
- Add environment variables support for git user/password
  - Using `TRAINS_AGENT_GIT_USER`/`TRAINS_AGENT_GIT_PASS`
  - Pass git credentials to dockerized experiment execution
- Support running code from module (i.e. `-m` in execution entry point)
- Add daemon `--create-queue` to automatically create a queue and use it if queue name doesn't exist in the server
- Move `--gpus` and `--cpu-only` to worker args (used by daemon, execute and build)

Bug Fixes

- Fix init wizard to correctly display the input servers (#19)
- Fix version control links in requirements when using `conda`
- Fix `build --docker` mode standalone docker execution
- Improve docker host-mount support, use `TRAINS_AGENT_DOCKER_HOST_MOUNT` environment variable
- Support `pip` v20.1 local/http package reference in `pip freeze`
- Fix detached mode to correctly use cache folder slots
- Fix `CUDA_VISIBLE_DEVICES` being set to "all", which should never happen (Trains Slack channel [thread](https://allegroai-trains.slack.com/archives/CTK20V944/p1590563064305700))
- Do not monitor GPU when running with `--cpu-only`
