Ray

Latest version: v2.22.0

1.4.0

Not secure

Ray Autoscaler

🎉 New Features:
* Support Helm Chart for deploying Ray on Kubernetes
* Key Autoscaler metrics are now exported via Prometheus!

💫Enhancements
* Better error messages when a node fails to come online

🔨 Fixes:
* Stability and interface fixes for Kubernetes deployments.
* Fixes to Azure NodeProvider

Ray Client

🎉 New Features:
* Complete API parity with non-client mode
* Experimental ClientBuilder API (docs here)
* Full Asyncio support

💫Enhancements
* Keep Alive for Messages for long lived connections
* Improved pickling error messages

🔨 Fixes:
* Client Disconnect can be called multiple times
* Client Reference Equality Check
* Many bug fixes and tests for the complete ray API!

Ray Core

🎉 New Features:
* Namespaces ([check out the docs](https://docs.ray.io/en/master/namespaces.html))! Note: this may be a breaking change if you’re using detached actors (set `ray.init(namespace="")` for backwards compatible behavior).
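
To see why namespaces can break existing detached-actor lookups, here is a toy sketch of a named-actor table keyed by `(namespace, name)`. It is not Ray's implementation: `NamedActorRegistry` and `Job` are hypothetical names, and the sketch assumes that a job which passes no namespace is placed in a fresh anonymous one.

```python
import uuid


class NamedActorRegistry:
    """Toy stand-in for a cluster-wide table of named actors (hypothetical)."""

    def __init__(self):
        self._actors = {}

    def connect(self, namespace=None):
        # Assumption: a job that passes no namespace gets a fresh anonymous one.
        if namespace is None:
            namespace = "anonymous-" + uuid.uuid4().hex
        return Job(self, namespace)


class Job:
    """A driver connected to the registry under one namespace."""

    def __init__(self, registry, namespace):
        self._registry = registry
        self.namespace = namespace

    def register(self, name, actor):
        # Names are scoped: the same name can exist once per namespace.
        self._registry._actors[(self.namespace, name)] = actor

    def lookup(self, name):
        # Only actors registered under this job's namespace are visible.
        return self._registry._actors.get((self.namespace, name))


if __name__ == "__main__":
    registry = NamedActorRegistry()
    registry.connect().register("cache", "actor-handle")
    print(registry.connect().lookup("cache"))  # different anonymous namespace
```

In this model, two jobs that both connect with `namespace=""` share one scope, which mirrors the backwards compatible setting mentioned above.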

🔨 Fixes:
* Support increment by arbitrary number with ray.util.metrics.Counter
* Various bug fixes for the placement group APIs including the GPU assignment bug (15049).

🏗 Architecture refactoring:
* Increase the efficiency and robustness of resource reporting

Ray Data Processing

🔨 Fixes:
* Various bug fixes for better stability (16063, 14821, 15669, 15757, 15431, 15426, 15034, 15071, 15070, 15008, 15955)
* Fixed a critical bug where the driver used excessive memory when there were many objects in the cluster (14322).
* Dask on Ray and Modin can now be run with Ray client

🏗 Architecture refactoring:
* Ray 100TB shuffle results: https://github.com/ray-project/ray/issues/15770
* More robust memory management subsystem is in progress (15157, 15027)

RLlib

🎉 New Features:
* PyTorch multi-GPU support (14709, 15492, 15421).
* CQL TensorFlow support (15841).
* Task-settable Env/Curriculum Learning API (15740).
* Support for native tf.keras Models (no ModelV2 required) (14684, 15273).
* Trainer.train() and Trainer.evaluate() can run in parallel (optional) (15040, 15345).

💫Enhancements and documentation:
* CQL: Bug fixes and confirmed MuJoCo benchmarks (15814, 15603, 15761).
* Example for differentiable neural computer (DNC) network (14844, 15939).
* Added support for int-Box action spaces. (15012)
* DDPG/TD3/A[23]C/MARWIL/BC: Code cleanup and type annotations. (14707).
* Example script for restoring 1 agent out of n
* Examples for fractional GPU usage. (15334)
* Enhanced documentation page describing example scripts and blog posts (15763).
* Various enhancements/test coverage improvements: 15499, 15454, 15335, 14865, 15525, 15290, 15611, 14801, 14903, 15735, 15631

🔨 Fixes:
* Memory Leak in multi-agent environment (15815). Shoutout to Bam4d!
* DDPG PyTorch GPU bug. (16133)
* Simple optimizer should not be used by default for tf+MA (15365)
* Various bug fixes: 15762, 14843, 15042, 15427, 15871, 15132, 14840, 14386, 15014, 14737, 15015, 15733, 15737, 15736, 15898, 16118, 15020, 15218, 15451, 15538, 15610, 15326, 15295, 15762, 15436, 15558, 15937

🏗 Architecture refactoring:
* Remove atari dependency (15292).
* `Trainer._evaluate()` renamed to `Trainer.evaluate()` (backward compatible); `Trainer.evaluate()` can be called even w/o evaluation worker set, if `create_env_on_driver=True` (15591).

Tune

🎉 New Features:
* ASHA scheduler now supports save/restore. (15438)
* Add HEBO to search algorithm shim function (15468)
* Add SkoptSearcher/Bayesopt Searcher restore functionality (15075)

💫Enhancements:
* We now document scalability best practices (k8s, scalability thresholds). You can [find this here](https://docs.ray.io/en/master/tune/api_docs/scalability.html) (#14566)
* You can now set the result buffer_length via tune.run - this helps with trials that report too frequently. (15810)
* Support numpy types in TBXlogger (15760)
* Add `max_concurrent` option to BasicVariantGenerator (15680)
* Add `seed` parameter to OptunaSearch (15248)
* Improve BOHB/ConfigSpace dependency check (15064)

🔨Fixes:
* Reduce default number of maximum pending trials to max(16, cluster_cpus) (15628)
* Return normalized checkpoint path (15296)
* Escape paths before globbing in TrainableUtil.get_checkpoints_paths (15368)
* Optuna Searcher: Set correct Optuna TrialState on trial complete (15283)
* Fix type annotation in tune.choice (15038)
* Avoid system exit error by using `del` when cleaning up actors (15687)
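
The path-escaping fix above (15368) matters because trial directories often contain characters such as `[` and `]`, which `glob` treats as pattern syntax. Below is a minimal sketch of the idea using the standard library's `glob.escape`. The helper name `find_checkpoints` and the `checkpoint_*` directory layout are assumptions for illustration, not Tune's actual code.

```python
import glob
import os
import tempfile


def find_checkpoints(logdir):
    """Return checkpoint directories under logdir, treating logdir literally."""
    # Escape only the directory part; the trailing wildcard must stay active.
    pattern = os.path.join(glob.escape(logdir), "checkpoint_*")
    return sorted(glob.glob(pattern))


def make_demo_logdir():
    """Create a throwaway trial directory whose name contains glob metacharacters."""
    root = tempfile.mkdtemp()
    logdir = os.path.join(root, "trial_[lr=0.1]")
    for step in (1, 2):
        os.makedirs(os.path.join(logdir, "checkpoint_{}".format(step)))
    return logdir


if __name__ == "__main__":
    demo = make_demo_logdir()
    # Without escaping, "[lr=0.1]" is read as a character class and nothing matches.
    print(glob.glob(os.path.join(demo, "checkpoint_*")))
    print(find_checkpoints(demo))
```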

Serve

🎉 New Features:
* As of Ray 1.4, Serve has a new API centered around the concept of “Deployments.” Deployments offer a more streamlined API and can be declaratively updated, which should improve both development and production workflows. The existing APIs have not changed from Ray 1.4 and will continue to work until Ray 1.5, at which point they will be removed (see the package reference if you’re not sure about a specific API). Please see the [migration guide](https://docs.google.com/document/d/1Tgm-bHz6au0B8F_Ps0SLPXh9oyw8pIaGWKWunnK-Kuw/edit#) for details on how to update your existing Serve application to use this new API.
* New `serve.deployment` API: `serve.deployment`, `serve.get_deployments`, `serve.list_deployments` (14935, 15172, 15124, 15121, 14953, 15152, 15821)
* New `serve.ingress(fastapi_app)` API (15445, 15441, 14858)
* New `serve.batch` decorator in favor of legacy `max_batch_size` in backend config (15065)
* `serve.start()` is now idempotent (15148)
* Added support for `handle.method_name.remote()` (14831)

🔨Fixes:
* Rolling updates for redeployments (14803)
* Latency improvement by using pickle (15945)
* Controller and HTTP proxy use `num_cpus=0` by default (15000)
* Health checking in the controller instead of using `max_restarts` (15047)
* Use longest prefix matching for path routing (15041)
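
Longest prefix matching for path routing (15041) can be pictured with a small standalone function. This is only a sketch of the general technique, not Serve's router: `match_route` and the example route table are hypothetical, and matching on whole path segments is an assumption.

```python
def match_route(routes, path):
    """Return the target whose route prefix is the longest match for path.

    routes maps a route prefix such as "/api" to an arbitrary target.
    A prefix matches when it equals the path or covers whole leading segments.
    """
    best = None
    for prefix in routes:
        covers_segments = path.startswith(prefix.rstrip("/") + "/")
        if path == prefix or covers_segments:
            if best is None or len(prefix) > len(best):
                best = prefix
    return routes[best] if best is not None else None


if __name__ == "__main__":
    table = {"/": "root", "/api": "api", "/api/v2": "api_v2"}
    for request_path in ("/api/v2/users", "/api", "/apiary"):
        print(request_path, "->", match_route(table, request_path))
```

With this rule, `/api/v2/users` goes to the most specific prefix, while `/apiary` falls back to `/` because `/api` does not cover a whole segment of it.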

Dashboard

🎉New Features:
* Experimental [OpenTelemetry support](https://docs.ray.io/en/master/ray-tracing.html). (#16028, 14872, 15742).

🔨Fixes:
* Add object store memory column (15697)
* Add object store stats to dashboard API. (15677)
* Remove disk data from the dashboard when running on K8s. (14676)
* Fix reported dashboard ip when using 0.0.0.0 (15506)

Thanks
Many thanks to all those who contributed to this release!

clay4444, Fabien-Couthouis, mGalarnyk, smorad, ckw017, ericl, antoine-galataud, pleiadesian, DmitriGekhtman, robertnishihara, Bam4d, fyrestone, stephanie-wang, kfstorm, wuisawesome, rkooo567, franklsf95, micahtyong, WangTaoTheTonic, krfricke, hegdeashwin, devin-petersohn, qicosmos, edoakes, llan-ml, ijrsvt, richardliaw, Sertingolix, ffbin, simjay, AmeerHajAli, simon-mo, tom-doerr, sven1977, clarkzinzow, mxz96102, SebastianBo1995, amogkam, iycheng, sumanthratna, Catch-Bull, pcmoritz, architkulkarni, stefanbschneider, tgaddair, xcharleslin, cthoyt, fcardoso75, Jeffwan, mvindiola1, michaelzhiluo, rlan, mwtian, SongGuyang, YeahNew, kathryn-zhou, rfali, jennakwon06, Yeachan-Heo

1.3.0

Not secure

Highlights
* We are now testing and publishing Ray's scalability limits with each release, see: https://github.com/ray-project/ray/tree/releases/1.3.0/benchmarks
* Ray Client is now usable by default with any Ray cluster started by the Ray Cluster Launcher.

Ray Cluster Launcher

💫Enhancements:
* Observability improvements (14816, 14608)
* Worker nodes no longer killed on autoscaler failure (14424)
* Better validation for min_workers and max_workers (13779)
* Auto detect memory resource for AWS and K8s (14567)
* On autoscaler failure, propagate error message to drivers (14219)
* Avoid launching GPU nodes when the workload only has CPU tasks (13776)
* Autoscaler/GCS compatibility (13970, 14046, 14050)
* Testing (14488, 14713)
* Migration of configs to multi-node-type format (13814, 14239)
* Better config validation (14244, 13779)
* Node-type max workers defaults to infinity (14201)
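
As a rough illustration of the `min_workers`/`max_workers` validation and the unbounded per-node-type default mentioned above, consider the sketch below. The function name `resolve_worker_bounds`, the use of `None` for "not set", and the exact error conditions are assumptions, not the cluster launcher's code.

```python
import math


def resolve_worker_bounds(min_workers=0, max_workers=None):
    """Validate worker bounds for one node type and fill in the default maximum."""
    # Assumption: an unset maximum means "no limit" for this node type.
    if max_workers is None:
        max_workers = math.inf
    if min_workers < 0:
        raise ValueError("min_workers must be non-negative")
    if max_workers < min_workers:
        raise ValueError("max_workers must be >= min_workers")
    return min_workers, max_workers


if __name__ == "__main__":
    print(resolve_worker_bounds(min_workers=2))
    print(resolve_worker_bounds(min_workers=1, max_workers=4))
```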

🔨 Fixes:
* AWS configuration (14868, 13558, 14083, 13808)
* GCP configuration (14364, 14417)
* Azure configuration (14787, 14750, 14721)
* Kubernetes (14712, 13920, 13720, 14773, 13756, 14567, 13705, 14024, 14499, 14593, 14655)
* Other (14112, 14579, 14002, 13836, 14261, 14286, 14424, 13727, 13966, 14293, 14293, 14718, 14380, 14234, 14484)

Ray Client

💫Enhancements:
* Version checks for Python and client protocol (13722, 13846, 13886, 13926, 14295)
* Validate server port number (14815)
* Enable Ray client server by default (13350, 13429, 13442)
* Disconnect ray upon client deactivation (13919)
* Convert Ray objects to Ray client objects (13639)
* Testing (14617, 14813, 13016, 13961, 14163, 14248, 14630, 14756, 14786)
* Documentation (14422, 14265)

🔨 Fixes:

* Hook runtime context (13750)
* Fix mutual recursion (14122)
* Set gRPC max message size (14063)
* Monitor stream errors (13386)
* Fix dependencies (14654)
* Fix `ray.get` ctrl-c (14425)
* Report error deserialization errors (13749)
* Named actor refcounting fix (14753)
* RayTaskError serialization (14698)
* Multithreading fixes (14701)

Ray Core

🎉 New Features:
* We are now testing and publishing Ray's scalability limits with each release. Check out https://github.com/ray-project/ray/tree/releases/1.3.0/benchmarks.
* [alpha] Ray-native Python-based collective communication primitives for Ray clusters with distributed CPUs or GPUs.

🔨 Fixes:
* Ray is now using C++14.
* Fixed high CPU breaking raylets with heartbeat missing errors (13963, 14301)
* Fixed high CPU issues from raylet during object transfer (13724)
* Improvement in placement group APIs including better Java support (13821, 13858, 13582, 15049, 13821)

Ray Data Processing

🎉 New Features:
* Object spilling is turned on by default. Check out the [documentation](https://docs.ray.io/en/master/memory-management.html#object-spilling).
* Dask-on-Ray and Spark-on-Ray are fully ready to use. Please [try them out](https://docs.ray.io/en/master/raydp.html) and give us feedback!
* Dask-on-Ray is now compatible with Dask 2021.4.0.
* Dask-on-Ray now works natively with [`dask.persist()`](https://docs.dask.org/en/latest/api.html#dask.persist).

🔨 Fixes:
* Various improvements in object spilling and memory management layer to support large scale data processing (13649, 14149, 13853, 13729, 14222, 13781, 13737, 14288, 14578, 15027)
* The `lru_evict` flag is now deprecated. The recommended alternative is to use object spilling.

🏗 Architecture refactoring:
* Various architectural improvements in object spilling and memory management. For more details, check out the [whitepaper](https://docs.google.com/document/d/1lAy0Owi-vPz2jEqBSaHNQcy2IBSDEHyXNOQZlGuj93c/edit#heading=h.61xcnifjkb6v).
* Locality-aware scheduling is turned on by default.
* Moved from centralized GCS-based object directory protocol to decentralized owner-to-owner protocol, yielding better cluster scalability.

RLlib

🎉 New Features:
* R2D2 implementation for torch and tf. (13933)
* PlacementGroup support (all RLlib algos now return PlacementGroupFactory from Trainer.default_resource_request). (14289)
* Multi-GPU support for tf-DQN/PG/A2C. (13393)

💫Enhancements:
* Documentation: Update documentation for Curiosity's support of continuous actions (13784); CQL documentation (14531)
* Attention-wrapper works with images and supports prev-n-actions/rewards options. (14569)
* `rllib rollout` runs in parallel by default via Trainer’s evaluation worker set. (14208)
* Add env rendering (customizable) and video recording options (for non-local mode; >0 workers; +evaluation-workers) and episode media logging. (14767, 14796)
* Allow SAC to use custom models as Q- or policy nets and deprecate "state-preprocessor" for image spaces. (13522)
* Example Scripts: Add coin game env + matrix social dilemma env + tests and examples (shoutout to Maxime Riché!). (14208); Attention net (14864); Serve + RLlib. (14416); Env seed (14471); Trajectory view API (enhancements and tf2 support). (13786); Tune trial + checkpoint selection. (14209)
* DDPG: Add support for simplex action space. (14011)
* Others: `on_learn_on_batch` callback allows custom metrics. (13584); Add `TorchPolicy.export_model()`. (13989)

🔨 Fixes:
* Trajectory View API bugs (13646, 14765, 14037, 14036, 14031, 13555)
* Test cases (14620, 14450, 14384, 13835, 14357, 14243)
* Others (13013, 14569, 13733, 13556, 13988, 14737, 14838, 15272, 13681, 13764, 13519, 14038, 14033, 14034, 14308, 14243)

🏗 Architecture refactoring:
* Remove all non-trajectory view API code. (14860)
* Obsolete UsageTrackingDict in favor of SampleBatch. (13065)

Tune

🎉 New Features:
* We added a new searcher `HEBOSearcher` (14504, 14246, 13863, 14427)
* Tune is now natively compatible with the Ray Client (13778, 14115, 14280)
* Tune now uses Ray’s Placement Groups underneath the hood. This will enable much faster autoscaling and training (for distributed trials) (13906, 15011, 14313)

💫Enhancements:
* Checkpointing improvements (13376, 13767)
* Optuna Search Algorithm improvements (14731, 14387)
* tune.with_parameters now works with Class API (14532)

🔨Fixes:
* BOHB & Hyperband fixes (14487, 14171)
* Nested metrics improvements (14189, 14375, 14379)
* Fix non-deterministic category sampling (13710)
* Type hints (13684)
* Documentation (14468, 13880, 13740)
* Various issues and bug fixes (14176, 13939, 14392, 13812, 14781, 14150, 14850, 14118, 14388, 14152, 13825, 13936)

SGD
* Add fault tolerance during worker startup (14724)

Serve

🎉 New Features:
* Added metadata to default logger in backend replicas (14251)
* Added more metrics for ServeHandle stats (13640)
* Deprecated system-level batching in favor of serve.batch (14610, 14648)
* Beta support for Serve with Ray client (14163)
* Use placement groups to bypass autoscaler throttling (13844)
* Deprecate client-based API in favor of process-wide singleton (14696)
* Add initial support for FastAPI ingress (14754)

🔨 Fixes:
* Fix ServeHandle serialization (13695)

🏗 Architecture refactoring:
* Refactor BackendState to support backend versioning and add more unit testing (13870, 14658, 14740, 14748)
* Optimize long polling to be per-key (14335)

Dashboard

🎉 New Features:
* Dashboard now supports being served behind a reverse proxy. (14012)
* Disk and network metrics are added to prometheus. (14144)

💫Enhancements:
* Better CPU & memory information on K8s. (14593, 14499)
* Progress towards a new scalable dashboard. (13790, 11667, 13763, 14333)

Thanks

Many thanks to all those who contributed to this release:
geraint0923, iycheng, yurirocha15, brian-yu, harryge00, ijrsvt, wumuzi520, suquark, simon-mo, clarkzinzow, RaphaelCS, FarzanT, ob, ashione, ffbin, robertnishihara, SongGuyang, zhe-thoughts, rkooo567, Ezra-H, acxz, clay4444, QuantumMecha, jirkafajfr, wuisawesome, Qstar, guykhazma, devin-petersohn, jeroenboeye, ConeyLiu, dependabot[bot], fyrestone, micahtyong, javi-redondo, Manuscrit, mxz96102, EscapeReality846089495, WangTaoTheTonic, stanislav-chekmenev, architkulkarni, Yard1, tchordia, zhisbug, Bam4d, niole, yiranwang52, thomasjpfan, DmitriGekhtman, gabrieleoliaro, jparkerholder, kfstorm, andrew-rosenfeld-ts, erikerlandson, Crissman, raulchen, sumanthratna, Catch-Bull, chaokunyang, krfricke, raoul-khour-ts, sven1977, kathryn-zhou, AmeerHajAli, jovany-wang, amogkam, antoine-galataud, tgaddair, randxie, ChaceAshcraft, ericl, cassidylaidlaw, TanjaBayer, lixin-wei, lena-kashtelyan, cathrinS, qicosmos, richardliaw, rmsander, jCrompton, mjschock, pdames, barakmich, michaelzhiluo, stephanie-wang, edoakes

1.2.0

Not secure

Highlights

* Ray client is now in beta! Check out more details here: https://docs.ray.io/en/master/ray-client.html
* XGBoost-Ray is now in beta! Check out more details about this project at https://github.com/ray-project/xgboost_ray.
* Check out the Serve migration guide: https://docs.google.com/document/d/1CG4y5WTTc4G_MRQGyjnb_eZ7GK3G9dUX6TNLKLnKRAc/edit
* Ray’s C++ support is now in beta: https://docs.ray.io/en/master/#getting-started-with-ray
* An alpha version of object spilling is now available: https://docs.ray.io/en/master/memory-management.html#object-spilling

Ray Autoscaler

🎉 New Features:

* A new autoscaler output format in monitor.log (12772, 13561)
* Piping autoscaler events to driver logs (13434)

💫Enhancements
* Full support of the `ray.autoscaler.sdk.request_resources()` API (https://docs.ray.io/en/master/cluster/autoscaling.html?highlight=request_resources#ray.autoscaler.sdk.request_resources).
* Make placement groups bypass max launch limit (13089)
* [K8s] Retry getting home directory in command runner. (12925)
* [docker] Pull if image is not present (13136)
* [Autoscaler] Ensure ubuntu is owner of docker host mount folder (13579)

🔨 Fixes:
* Many autoscaler bug fixes (12952, 12689, 13058, 13671, 13637, 13588, 13505, 13154, 13151, 13138, 13008, 12980, 12918, 12829, 12714, 12661, 13567, 13663, 13623, 13437, 13498, 13472, 13392, 12514, 13325, 13161, 13129, 12987, 13410, 12942, 12868, 12866, 12865, 12098, 12609)

RLlib
🎉 New Features:
* Fast Attention Nets (using the trajectory view API) (12753).
* Attention Nets: Full PyTorch support (12029).
* Attention Nets: Support auto-wrapping around default or custom models by specifying `use_attention=True` in the model’s config. This now works analogously to `use_lstm=True`. (11698)
* New Offline RL Algorithm: CQL (based on SAC) (13118).
* MAML: Discrete actions support (added CartPole mass test case).
* Support Atari framestacking via the trajectory view API (13315).
* Support for D4RL environments/benchmarks (13550).
* Preliminary work on JAX support (13077, 13091).

💫 Enhancements:
* Rollout lengths: Allow unit to be configured as “agent_steps” in multi-agent settings (default: “env_steps”) (12420).
* TFModelV2: Soft-deprecate register_variables and unify var names wrt TorchModelV2 (13339, 13363).

📖 Documentation:
* Added documentation on Model building API (13260, 13261).
* Added documentation for the trajectory view API. (12718)
* Added documentation for SlateQ (13266).
* Readme.md documentation for almost all algorithms in rllib/agents (12943, 13035).
* Type annotations for the “rllib/execution” folder (12760, 13036).

🔨 Fixes:
* MARWIL and BC: Add grad-clipping config option to stabilize learning (13455).
* A3C: Solve PyTorch- and TF-eager async race condition between calling model and its value function (13467).
* Various issues- and bug fixes (12619, 12682, 12704, 12706, 12708, 12765, 12786, 12787, 12793, 12832, 12844, 12846, 12915, 12941, 13039, 13040, 13064, 13083, 13121, 13126, 13237, 13238, 13308, 13332, 13397, 13459, 13553).

🏗 Architecture refactoring:
* Env directory has been cleaned up and is now divided into a core part (rllib/env) with all basic env classes, and rllib/env/wrappers containing third-party wrapper classes (Atari, Unity3D, etc.) (13082).

Tune

🎉 New Features:

* Ray Tune has updated and improved its integration with MLflow. See [this blog post for details](https://medium.com/distributed-computing-with-ray/ray-mlflow-taking-distributed-machine-learning-applications-to-production-103f5505cb88) (#12840, 13301, 13533)

💫 Enhancements

* Ray Tune now uses ray.cloudpickle underneath the hood, allowing you to checkpoint large models (>4GB) (12958).
* Using the 'reuse_actors' flag can now speed up training for general Trainable API usage. (13549)
* Ray Tune will now automatically buffer results from trainables, allowing you to use an arbitrary reporting frequency on your training functions. (13236)
* Ray Tune now has a variety of experiment stoppers (12750)
* Ray Tune now supports an integer loguniform search space distribution (12994)
* Ray Tune now has initial support for the Ray placement group API. (13370)
* The Weights & Biases integration (`WandbLogger`) now also accepts wandb.data_types.Video (13169)
* The Hyperopt integration (`HyperoptSearch`) can now directly accept category variables instead of indices (12715)
* Ray Tune now supports experiment checkpointing when using grid search (13357)

🔨Fixes and Updates

* The Optuna integration was updated to support the 2.4.0 API while maintaining backwards compatibility (13631)
* All search algorithms now support `points_to_evaluate` (12790, 12916)
* PBT Transformers example was updated and improved (13174, 13131)
* The scikit-optimize integration was improved (12970)
* Various bug fixes (13423, 12785, 13171, 12877, 13255, 13355)

SGD

🔨Fixes and Updates

* Fix Docstring for `as_trainable` (13173)
* Fix process group timeout units (12477)
* Disable Elastic Training by default when using with Tune (12927)

Serve

🎉 New Features:
* Ray Serve backends now accept a Starlette request object instead of a Flask request object (12852). This is a breaking change, so please read the migration guide.
* Ray Serve backends now have the option of returning a Starlette Response object (12811, 13328). This allows for more customizable responses, including responses with custom status codes.
* [Experimental] The new Ray Serve MLflow plugin makes it easy to deploy your MLflow models on Ray Serve. It comes with a Python API and a command-line interface.
* Using “ImportedBackend” you can now specify a backend based on a class that is installed in the Python environment that the workers will run in, even if the Python environment of the driver script (the one making the Serve API calls) doesn’t have it installed (12923).

💫 Enhancements:
* Dependency management using conda no longer requires the driver script to be running in an activated conda environment (13269).
* Ray ObjectRef can now be used as argument to `serve_handle.remote(...)`. (12592)
* Backends are now shut down gracefully. You can set the graceful timeout in BackendConfig. (13028)

📖 Documentation:
* A tutorial page has been added for integrating Ray Serve with your existing FastAPI web server or with your existing AIOHTTP web server (13127).
* Documentation has been added for Ray Serve metrics (13096).

1.1.0

Not secure

Ray Core
🎉 New Features:
- Progress towards supporting a Ray client
- Descendant tasks are cancelled when the calling task is cancelled
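
The descendant-cancellation behavior above can be pictured as a walk down a task tree. The following is a toy model only: the `Task` class and `cancelled_names` helper are hypothetical and do not reflect how Ray tracks task lineage internally.

```python
class Task:
    """Toy task node that remembers the tasks it spawned (hypothetical)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        self.cancelled = False
        if parent is not None:
            parent.children.append(self)

    def cancel(self):
        # Cancelling a task also cancels everything it spawned, recursively.
        if self.cancelled:
            return
        self.cancelled = True
        for child in self.children:
            child.cancel()


def cancelled_names(root):
    """Collect the names of all cancelled tasks in the tree below root."""
    names = [root.name] if root.cancelled else []
    for child in root.children:
        names.extend(cancelled_names(child))
    return sorted(names)


if __name__ == "__main__":
    root = Task("root")
    fetch = Task("fetch", parent=root)
    Task("parse", parent=fetch)
    Task("report", parent=root)
    fetch.cancel()
    print(cancelled_names(root))
```

Cancelling `fetch` marks `fetch` and `parse`, while `root` and the sibling `report` keep running.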
🔨 Fixes:
- Improved object broadcast robustness
- Improved placement group support
🏗 Architecture refactoring:
- Progress towards the new scheduler backend

RLlib
🎉 New Features:
- SUMO simulator integration (rllib/examples/simulators/sumo/). Huge thanks to Lara Codeca! (11710)
- SlateQ Algorithm added for PyTorch. Huge thanks to Henry Chen! (11450)
- MAML extension for all Models, except recurrent ones. (11337)
- Curiosity Exploration Module for tf1.x/2.x/eager. (11945)
- Minimal JAXModelV2 example. (12502)
🔨 Fixes:
- Fix RNN learning for tf2.x/eager. (11720)
- LSTM prev-action/prev-reward settable separately and prev-actions are now one-hot’d. (12397)
- PyTorch LR schedule not working. (12396)
- Various PyTorch GPU bug fixes. (11609)
- SAC loss not using prio. replay weights in critic’s loss term. (12394)
- Fix epsilon-greedy Exploration for nested action spaces. (11453)
🏗 Architecture refactoring:
- Trajectory View API on by default (faster PG-type algos by ~20% (e.g. PPO on Atari)). (11717, 11826, 11747, and 11827)

Tune
🎉 New Features:
- Loggers can now be passed as objects to tune.run. The new ExperimentLogger abstraction was introduced for all loggers, making it much easier to configure logging behavior. (11984, 11746, 11748, 11749)
- The tune verbosity was refactored into four levels: 0: Silent, 1: Only experiment-level logs, 2: General trial-level logs, 3: Detailed trial-level logs (default) (11767, 12132, 12571)
- Docker and Kubernetes autoscaling environments are now detected automatically, and the correct checkpoint/log syncing tools are used (12108)
- Trainables can now easily leverage TensorFlow DistributedStrategy! (11876)

💫 Enhancements
- Introduced a new serialization debugging utility (12142)
- Added a new lightweight PyTorch Lightning example (11497, 11585)
- The BOHB search algorithm can be seeded with a random state (12160)
- The default anonymous metrics can be used automatically if a `mode` is set in tune.run (12159).
- Added HDFS as Cloud Sync Client (11524)
- Added xgboost_ray integration (12572)
- Tune search spaces can now be passed to search algorithms on initialization, not only via tune.run (11503)
- Refactored and added examples (11931)
- Callable accepted for register_env (12618)
- Tune search algorithms can handle/ignore infinite and NaN numbers (11835)
- Improved scalability for experiment checkpointing (12064)
- Nevergrad now supports points_to_evaluate (12207)
- Placement group support for distributed training (11934)

🔨 Fixes:
- Fixed with_parameters behavior to avoid serializing large data in scope (12522)
- TBX logger supports None (12262)
- Better error when `metric` or `mode` unset in search algorithms (11646)
- Better warnings/exceptions for fail_fast='raise' (11842)
- Removed some bottlenecks in trialrunner (12476)
- Fix file descriptor leak by syncer and Tensorboard (12590, 12425)
- Fixed validation for search metrics (11583)
- Fixed hyperopt randint limits (11946)

Serve
🎉 New Features:
- You can start backends in different conda environments! See more in the [dependency management doc](https://docs.ray.io/en/master/serve/advanced.html#dependency-management). (11743)
- You can add an optional `reconfigure` method to your Servable to allow [reconfiguring](https://docs.ray.io/en/master/serve/advanced.html#reconfiguring-backends-experimental) backend replicas at runtime. (11709)
🔨Fixes:
- Set serve.start(http_host=None) to disable HTTP servers. If you are only using ServeHandle, this option lowers resource usage. (11627)
- Flask requests will no longer create reference cycles. This means peak memory usage should be lower for high traffic scenarios. (12560)
🏗 Architecture refactoring:
- Progress towards a goal state driven Serve controller. (12369,11792,12211,12275,11533,11822,11579,12281)
- Progress towards faster and more efficient ServeHandles. (11905, 12019, 12093)

Ray Cluster Launcher (Autoscaler)
🎉 New Features:
- A new Kubernetes operator: https://docs.ray.io/en/master/cluster/k8s-operator.html
💫 Enhancements
- Containers do not run with root user as the default (11407)
- SHM-Size is auto-populated when using the containers (11953)
🔨 Fixes:
- Many autoscaler bug fixes (11677, 12222, 11458, 11896, 12123, 11820, 12513, 11714, 12512, 11758, 11615, 12106, 11961, 11674, 12028, 12020, 12316, 11802, 12131, 11543, 11517, 11777, 11810, 11751, 12465, 11422)

SGD
🎉 New Features:
- Easily customize your torch.DistributedDataParallel configurations by passing in a `ddp_args` field into `TrainingOperator.register` (11771).
🔨 Fixes:
- `TorchTrainer` now properly scales up to more workers if more resources become available (12562)
📖 Documentation:
- The new callback API for using Ray SGD with Tune is now documented (11479)
- PyTorch Lightning + Ray SGD integration is now documented (12440)

Dashboard
🔨 Fixes:
- Fixed bug that prevented viewing the logs for cluster workers
- Fixed bug that caused "Logical View" page to crash when opening a list of actors for a given class.
🏗 Architecture refactoring:
- Dashboard runs on a new backend architecture that is more scalable and well-tested. The dashboard should work on ~100 node clusters now, and we're working on lifting scalability constraints to support even larger clusters.

Thanks
Many thanks to all those who contributed to this release:
bartbroere, SongGuyang, gramhagen, richardliaw, ConeyLiu, weepingwillowben, zhongchun, ericl, dHannasch, timurlenk07, kaushikb11, krfricke, desktable, bcahlit, rkooo567, amogkam, micahtyong, edoakes, stephanie-wang, clay4444, ffbin, mfitton, barakmich, pcmoritz, AmeerHajAli, DmitriGekhtman, iamhatesz, raulchen, ingambe, allenyin55, sven1977, huyz-git, yutaizhou, suquark, ashione, simon-mo, raoul-khour-ts, Leemoonsoo, maximsmol, alanwguo, kishansagathiya, wuisawesome, acxz, gabrieleoliaro, clarkzinzow, jparkerholder, kingsleykuan, InnovativeInventor, ijrsvt, lasagnaphil, lcodeca, jiajiexiao, heng2j, wumuzi520, mvindiola1, aaronhmiller, robertnishihara, WangTaoTheTonic, chaokunyang, nikitavemuri, kfstorm, roireshef, fyrestone, viotemp1, yncxcw, karstenddwx, hartikainen, sumanthratna, architkulkarni, michaelzhiluo, UWFrankGu, oliverhu, danuo, lixin-wei

1.0.1.post1

Not secure

Patch release containing the following changes:
- https://github.com/ray-project/ray/commit/bcc92f59fdcd837ccc5a560fe37bdf0619075505 Fix dashboard crashing on multi-node clusters.
- https://github.com/ray-project/ray/pull/11600 Add the cluster_name to docker file mounts directory prefix.

1.0.1

Not secure

Highlights

* If you're migrating from Ray < 1.0.0, be sure to check out the [1.0 Migration Guide](https://github.com/ray-project/ray/discussions/11482).
* Autoscaler is now **docker by default**.
* RLlib features multiple new environments.
* Tune supports population based bandits, checkpointing in Docker, and multiple usability improvements.
* SGD supports PyTorch Lightning
* All of Ray's components and libraries have improved performance, scalability, and stability.

Core
* [1.0 Migration Guide](https://github.com/ray-project/ray/discussions/11482).
* Many bug fixes and optimizations in GCS.
* Polishing of the Placement Group API.
* Improved Java language support

RLlib
* Added documentation for Curiosity exploration module (11066).
* Added RecSim environment wrapper (11205).
* Added Kaggle’s football environment (multi-agent) wrapper (11249).
* Multiple bug fixes: GPU related fixes for SAC (11298), MARWIL, all example scripts run on GPU (11105), lifted limitation on 2^31 timesteps (11301), fixed eval workers for ES and ARS (11308), fixed broken no-eager-no-workers mode (10745).
* Support custom MultiAction distributions (11311).
* No environment is created on driver (local worker) if not necessary (11307).
* Added simple SampleCollector class for Trajectory View API (11056).
* Code cleanup: Docstrings and type annotations for Exploration classes (11251), DQN (10710), MB-MPO algorithm, SAC algorithm (10825).

Serve
* API: Serve will error when `serve_client` is serialized. (11181)
* Performance: `serve_client.get_handle("endpoint")` will now get a handle to the nearest node, increasing scalability in distributed mode. (11477)
* Doc: Added FAQ page and updated architecture page (10754, 11258)
* Testing: New distributed tests and benchmarks are added (11386)
* Testing: Serve now runs on Windows (10682)

SGD
* PyTorch Lightning integration is now supported (11042)
* Support continuing training with `num_steps` (11142)
* Callback API for SGD+Tune (11316)

Tune
* New Algorithm: Population-based Bandits (11466)
* `tune.with_parameters()`, a wrapper function to pass arbitrary objects through the object store to trainables (11504)
* Strict metric checking - by default, Tune will now error if a result dict does not include the optimization metric as a key. You can disable this with TUNE_DISABLE_STRICT_METRIC_CHECKING (10972)
* Syncing checkpoints between multiple Docker containers on a cluster is now supported with the `DockerSyncer` (11035)
* Added type hints (10806)
* Trials are now dynamically created (instead of created up front) (10802)
* Use `tune.is_session_enabled()` in the Function API to toggle between Tune and non-tune code (10840)
* Support hierarchical search spaces for hyperopt (11431)
* Tune function API now also supports `yield` and `return` statements (10857)
* Tune now supports callbacks with `tune.run(callbacks=...)` (11001)
* By default, the experiment directory will be dated (11104)
* Tune now supports `reuse_actors` for function API, which can largely accelerate tuning jobs.
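
Strict metric checking (10972) amounts to a guard on every reported result. A minimal sketch of that guard follows. The function name `validate_result` is hypothetical, and the assumption that the value `"1"` is what disables the check via `TUNE_DISABLE_STRICT_METRIC_CHECKING` should be confirmed against the Tune documentation.

```python
import os


def validate_result(result, metric, environ=None):
    """Return result[metric], raising if the metric is missing and checks are strict."""
    environ = os.environ if environ is None else environ
    # Assumption: setting the variable to "1" turns the check off.
    strict = environ.get("TUNE_DISABLE_STRICT_METRIC_CHECKING", "0") != "1"
    if strict and metric not in result:
        raise ValueError(
            "Result is missing the optimization metric {!r}; "
            "available keys: {}".format(metric, sorted(result))
        )
    return result.get(metric)


if __name__ == "__main__":
    print(validate_result({"loss": 0.5, "step": 3}, "loss", environ={}))
```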

Thanks

We thank all the contributors for their contribution to this release!

acxz, Gekho457, allenyin55, AnesBenmerzoug, michaelzhiluo, SongGuyang, maximsmol, WangTaoTheTonic, Basasuya, sumanthratna, juliusfrost, maxco2, Xuxue1, jparkerholder, AmeerHajAli, raulchen, justinkterry, herve-alanaai, richardliaw, raoul-khour-ts, C-K-Loan, mattearllongshot, robertnishihara, internetcoffeephone, Servon-Lee, clay4444, fangyeqing, krfricke, ffbin, akotlar, rkooo567, chaokunyang, PidgeyBE, kfstorm, barakmich, amogkam, edoakes, ashione, jseppanen, ttumiel, desktable, pcmoritz, ingambe, ConeyLiu, wuisawesome, fyrestone, oliverhu, ericl, weepingwillowben, rkube, alanwguo, architkulkarni, lasagnaphil, rohitrawat, ThomasLecat, stephanie-wang, suquark, ijrsvt, VishDev12, Leemoonsoo, scottwedge, sven1977, yiranwang52, carlos-aguayo, mvindiola1, zhongchun, mfitton, simon-mo
