radical.pilot

Latest version: v1.60.0


1.41.0

- fix RTD
- replace MongoDB with ZMQ messaging
- adapt resource config for `csc.mahti` to the new structure
- add description about input staging data
- add method to track startup file with service URL (special case - SOMA)
- add package `mpich` into CI and docs dependencies
- add resource_description class
- check agent sandbox existence
- clean RPC handling
- clean raptor RPC
- deprecated `python.system_packages`
- enable testing of all notebooks
- enable tests for all devel-branches
- fix heartbeat management
- fix LM config initialization
- fix RM LSF for Lassen (+ add platform config)
- fix Session close options
- fix TMGR Staging Input
- fix `pilot_state` in bootstrapping
- fix `task_pre_exec` configurable parameter for Popen
- fix bootstrapping for sub-agents
- keep pilot RPCs local
- raptor worker: one profile per rank
- let raptor use registry
- shield against missing MPI
- sub-schema for `schemas`
- switch to registry configs instead of config files
- update tests
- update handling of the service startup process
- upload session when testing notebooks
- use hb msg class type
- version RP devel/nodb temporarily
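Several entries above concern heartbeat management and heartbeat message handling. A minimal sketch of timeout-based heartbeat tracking; all names here are illustrative, not radical.pilot's actual API:

```python
import time

class HeartbeatMonitor:
    '''Track last-seen timestamps per peer and report timeouts.'''

    def __init__(self, timeout):
        self._timeout = timeout      # seconds without a beat -> peer is dead
        self._last    = {}           # peer uid -> last beat timestamp

    def beat(self, uid, now=None):
        '''Record a heartbeat message from peer `uid`.'''
        self._last[uid] = now if now is not None else time.monotonic()

    def dead_peers(self, now=None):
        '''Return peers whose last beat is older than the timeout.'''
        now = now if now is not None else time.monotonic()
        return [uid for uid, t in self._last.items()
                    if now - t > self._timeout]
```

The `now` parameter is only there to make the logic testable without sleeping; a real monitor would rely on `time.monotonic()` throughout.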


--------------------------------------------------------------------------------

1.37.0

- fix `default_remote_workdir` for `csc.mahti` platform
- add README to description for pypi
- link config tutorial
- add raptor to API docs
- add MPI flavor `MPI_FLAVOR_PALS`
- add cpu-binding for LM MPIEXEC with the `MPI_FLAVOR_PALS` flavor
- clean up Polaris config
- fix raptor master hb_freq and hb_timeout
- fix test for MPIRUN LM
- fix tests for MPIEXEC LM
- add csc.mahti resource config
- add slurm inspection test
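The `MPI_FLAVOR_PALS` entries above add CPU binding for the MPIEXEC launch method. PALS-style `mpiexec` commonly expresses per-rank core pinning as `--cpu-bind list:...`, with one comma-separated core list per rank, rank lists separated by colons. A hypothetical helper that builds such a flag (the function name and input layout are assumptions, not RP code):

```python
def pals_cpu_bind(rank_cores):
    '''Build a PALS-style `--cpu-bind list:` option from per-rank core ids.

    rank_cores: list of lists, e.g. [[0, 1], [2, 3]] pins rank 0 to
    cores 0-1 and rank 1 to cores 2-3.
    '''
    ranks = [','.join(str(c) for c in cores) for cores in rank_cores]
    return '--cpu-bind list:%s' % ':'.join(ranks)
```

For example, `pals_cpu_bind([[0, 1], [2, 3]])` yields `--cpu-bind list:0,1:2,3`.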


--------------------------------------------------------------------------------

1.36.0

- added pre-defined `pre_exec` for Summit (preserve `LD_LIBRARY_PATH` from LM)
- fixed GPU discovery from SLURM env variables
- increase raptor's heartbeat time
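The GPU-discovery fix above reads GPU counts from Slurm's job environment. Slurm exposes, among others, `SLURM_GPUS_PER_NODE`, whose value may be a bare count (`4`) or carry a type prefix (`gpu:4`). A hedged sketch of parsing that variable; the fallback behavior is an assumption, not RP's actual discovery code:

```python
import os

def slurm_gpus_per_node(env=None):
    '''Parse a GPU count from SLURM_GPUS_PER_NODE ("4" or "gpu:4" forms).'''
    env = env if env is not None else os.environ
    val = env.get('SLURM_GPUS_PER_NODE')
    if not val:
        return 0                              # not a Slurm GPU job
    # the value may carry a type prefix, e.g. "gpu:4" or "a100:2"
    return int(val.rsplit(':', 1)[-1])
```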


--------------------------------------------------------------------------------

1.35.0

- Improve links to resource definitions.
- Improve typing in Session.get_pilot_managers
- Provide a target for Sphinx `:py:mod:` role.
- Un-hide "Utilities and helpers" section in API reference.
- Use a universal and unique identifier for registered callbacks.
- added option `--exact` for Rivanna (SRun LM)
- fixes tests for PRs from forks (2969)
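The callback entry above replaces ad-hoc callback keys with a universal, unique identifier, so two registrations of the same function no longer collide. A minimal sketch using the `uuid` module; the registry names are illustrative, not RP's API:

```python
import uuid

class CallbackRegistry:
    '''Register callbacks under unique ids so they can be removed safely.'''

    def __init__(self):
        self._cbs = {}

    def register(self, cb):
        cb_id = str(uuid.uuid4())      # universally unique key per registration
        self._cbs[cb_id] = cb
        return cb_id

    def unregister(self, cb_id):
        self._cbs.pop(cb_id, None)     # idempotent removal

    def fire(self, *args, **kwargs):
        for cb in list(self._cbs.values()):
            cb(*args, **kwargs)
```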


--------------------------------------------------------------------------------

1.34.0

- major documentation overhaul
- Fixes ticket 1577
- Fixes ticket 2553
- added tests for PilotManager methods (`cancel_pilots`, `kill_pilots`)
- fixed configuration for Perlmutter
- fixed env dumping for RP Agent
- move timeout into `kill_pilots` method to delay forced termination
- re-introduce a `use_mpi` flag
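The `kill_pilots` entry above moves the timeout into the method itself, so a graceful cancel is attempted before termination is forced. The pattern, sketched with hypothetical names rather than the PilotManager API:

```python
import time

def cancel_then_kill(pilots, timeout=5.0, poll=0.1):
    '''Request cancellation, wait up to `timeout`, then force-kill stragglers.'''
    for p in pilots:
        p.cancel()                        # graceful termination request
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if all(p.is_done() for p in pilots):
            return                        # everything terminated gracefully
        time.sleep(poll)
    for p in pilots:
        if not p.is_done():
            p.kill()                      # forced termination after timeout
```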


--------------------------------------------------------------------------------

1.33.0

- add a resource definition for rivanna at UVa.
- add documentation for missing properties
- add an exception for RAPTOR workers regarding GPU sharing
- add an exception in case GPU sharing is used in SRun or MPIRun LMs
- add configuration discovery for `gpus_per_node` (Slurm)
- add `PMI_ID` env variable (related to Hydra)
- add rank env variable for MPIExec LM
- add resource config for FrontierOLCF
- add service task description verification
- add interactive config to UVA
- add raptor tasks to the API doc
- add rank documentation
- allow access to full node memory by default
- changed type for `task['resources']`, let RADICAL-Analytics handle it
- changed type of `gpus_per_rank` attribute in `TaskDescription` (from `int` to `float`)
- enforce correct task mode for raptor master/workers
- ensure result_cb for executable tasks
- ensure `session._get_task_sandbox` for raptor tasks
- ensure that `wait_workers` raises RuntimeError during stop
- ensure worker termination on raptor shutdown
- fix CUDA env variable(s) setup for `pre_exec` (in POPEN executor)
- fix `gpu_map` in Scheduler and its usage
- fix ranks calculation
- fix slots estimation process
- fix tasks binding (e.g., bind task to a certain number of cores)
- fix the process of requesting a correct number of cores/gpus (in case of blocked cores/gpus)
- fix task sandbox path
- fix wait_workers
- use Google-style docstrings
- use parameter `new_session_per_task` within resource description to control input parameter `start_new_session` in `subprocess.Popen`
- keep virtualenv as fallback if venv is missing
- let SRun LM get info about GPUs from configured slots
- make slot dumps dependent on debug level
- master rpc handles stop request
- move from custom virtualenv version to `venv` module
- MPI worker sync
- Reading resources from created task description
- reconcile different worker submission paths
- recover `bootstrap_0_stop` event
- recover task description dump for raptor
- removed codecov from test requirements (codecov is handled by GitHub Actions)
- removed `gpus_per_node` - let SAGA handle GPUs
- removed obsolete configs (FUNCS leftover)
- re-order worker initialization steps, time out on registration
- support sandboxes for raptor tasks
- sync JSRun LM options according to defined slots
- update JSRun LM according to GPU sharing
- update slots estimation and `core/gpu_map` creation
- worker state update cb
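Two of the entries above change `gpus_per_rank` to a float and rework `gpu_map` creation, which enables GPU sharing among ranks. A sketch of how fractional values can map ranks to shared GPUs; this mapping scheme is an assumption for illustration, not RP's scheduler code:

```python
def build_gpu_map(n_ranks, gpus_per_rank):
    '''Assign each rank its GPU ids; `gpus_per_rank` may be fractional.

    gpus_per_rank=0.5 means two ranks share one GPU;
    gpus_per_rank=2.0 gives each rank two exclusive GPUs.
    '''
    if gpus_per_rank >= 1:
        n = int(gpus_per_rank)
        # exclusive GPUs: consecutive blocks of `n` ids per rank
        return [list(range(r * n, (r + 1) * n)) for r in range(n_ranks)]
    share = int(round(1 / gpus_per_rank))    # ranks sharing each GPU
    return [[r // share] for r in range(n_ranks)]
```

For example, four ranks at `gpus_per_rank=0.5` pack pairwise onto two GPUs, while two ranks at `gpus_per_rank=2.0` each get two exclusive GPUs.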


Past

Use past releases to reproduce earlier experiments.
