Aws-parallelcluster

Latest version: v3.9.1

Page 3 of 15

3.4.0

-----

**ENHANCEMENTS**
- Add support for launching nodes across multiple availability zones to increase capacity availability.
- Add support for specifying multiple subnets for each queue to increase capacity availability.
- Add new configuration parameter in `Iam/ResourcePrefix` to specify a prefix for path and name of IAM resources created by ParallelCluster.
- Add new configuration section `DeploySettings/LambdaFunctionsVpcConfig` for specifying the VPC config used by ParallelCluster Lambda Functions.
- Add possibility to specify a custom script to be executed in the head node during the update of the cluster. The script can be specified with `OnNodeUpdated` parameter when using Slurm as scheduler.

**CHANGES**
- Remove creation of EFS mount targets for existing file systems.
- Mount EFS file systems using amazon-efs-utils. EFS file systems can be mounted using in-transit encryption and an IAM-authorized user.
- Install stunnel 5.67 on CentOS7 and Ubuntu to support EFS in-transit encryption.
- Upgrade EFA installer to `1.20.0`
  - Efa-driver: `efa-2.1`
  - Efa-config: `efa-config-1.11-1`
  - Efa-profile: `efa-profile-1.5-1`
  - Libfabric-aws: `libfabric-aws-1.16.1`
  - Rdma-core: `rdma-core-43.0-2`
  - Open MPI: `openmpi40-aws-4.1.4-3`
- Upgrade Slurm to version 22.05.7.

3.3.1

-----

**CHANGES**
- Allow the use of official product AMIs even after the two-year EC2 deprecation period.
- Increase memory size of ParallelCluster API Lambda to 2048 in order to reduce cold start penalty and avoid timeouts.

**BUG FIXES**
- Prevent managed FSx for Lustre file systems from being replaced during a cluster update by no longer supporting changes to the compute fleet subnet ID.
- Apply the `DeletionPolicy` defined on shared storages also during the cluster update operations.

3.3.0

-----

**ENHANCEMENTS**
- Add possibility to specify multiple EC2 instance types for the same compute resource.
- Add support for adding and removing shared storages at cluster update by updating `SharedStorage` configuration.
- Add new configuration parameter `DeletionPolicy` for EFS and FSx for Lustre shared storage to support storage retention.
- Add new configuration section `Scheduling/SlurmSettings/Database` to enable accounting functionality in Slurm.
- Add support for On-Demand Capacity Reservations and Capacity Reservations Resource Groups.
- Add new configuration parameter in `Imds/ImdsSettings` to specify the IMDS version to support in a cluster or build image infrastructure.
- Add support for `Networking/PlacementGroup` in the `SlurmQueues/ComputeResources` section.
- Add support for instances with multiple network interfaces that allow only one ENI per device.
- Add support for hpc6id instance type as compute nodes.
- Improve validation of networking for external EFS file systems by checking the CIDR block in the attached security group.
- Add validator to check if configured instance types support placement groups.
- Configure NFS threads to be `min(256, max(8, num_cores * 4))` to ensure better stability and performance.
- Move NFS installation at build time to reduce configuration time.
- Enable server-side encryption for the EcrImageBuilder SNS topic created when deploying ParallelCluster API and used to notify on docker image build events.
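
The NFS thread formula listed above, `min(256, max(8, num_cores * 4))`, can be read as "four threads per core, never fewer than 8 and never more than 256". A minimal sketch of that arithmetic follows; the function name `nfs_thread_count` is hypothetical and is not taken from the ParallelCluster code base.

```python
def nfs_thread_count(num_cores: int) -> int:
    """Apply the formula from the changelog: min(256, max(8, num_cores * 4)).

    The function name is an illustrative assumption; only the formula
    itself comes from the release notes.
    """
    return min(256, max(8, num_cores * 4))


if __name__ == "__main__":
    # Small instances hit the floor of 8, large ones hit the cap of 256.
    for cores in (1, 2, 16, 64, 96):
        print(f"{cores} cores -> {nfs_thread_count(cores)} NFS threads")
```

With this formula, instances with up to 2 cores get the floor of 8 threads, and instances with 64 or more cores get the cap of 256.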

**CHANGES**
- Change behaviour of `SlurmQueues/Networking/PlacementGroup/Enabled`: now it creates a different managed placement group for each compute resource instead of a single managed placement group for all compute resources.
- Add support for `PlacementGroup/Name` as the preferred naming method.
- Move head node tags from Launch Template to instance definition to avoid head node replacement on tags updates.
- Disable Multithreading through script executed by cloud-init and not through CpuOptions set into Launch Template.
- Upgrade Python to version 3.9 and NodeJS to version 16 in API infrastructure, API Docker container and cluster Lambda resources.
- Remove support for Python 3.6 in aws-parallelcluster-batch-cli.
- Upgrade Slurm to version 22.05.5.
- Upgrade NVIDIA driver to version 470.141.03.
- Upgrade NVIDIA Fabric Manager to version 470.141.03.
- Upgrade NVIDIA CUDA Toolkit to version 11.7.1.
- Upgrade Python used in ParallelCluster virtualenvs from 3.7.13 to 3.9.15.
- Upgrade EFA installer to version 1.18.0.
- Upgrade NICE DCV to version 2022.1-13300.
- Allow for suppressing the `SingleSubnetValidator` for `Queues`.
- Remove usage of prolog/epilog Slurm configuration when `UseEc2Hostnames` is set to `true`.

**BUG FIXES**
- Fix validation of `filters` parameter in `ListClusterLogStreams` command to fail when incorrect filters are passed.
- Fix validation of parameter `SharedStorage/EfsSettings`: now validation fails when `FileSystemId` is specified along with other `SharedStorage/EfsSettings` parameters, whereas it was previously ignoring them.
- Fix cluster update when changing the order of SharedStorage together with other changes in the configuration.
- Fix `UpdateParallelClusterLambdaRole` in the ParallelCluster API to upload logs to CloudWatch.
- Fix Cinc not using the local CA certificates bundle when installing packages before any cookbooks are executed.
- Fix a hang in upgrading Ubuntu via `pcluster build-image` when `Build:UpdateOsPackages:Enabled:true` is set.
- Fix parsing of YAML cluster configuration by failing on duplicate keys.
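
The `SharedStorage/EfsSettings` fix above describes a mutual-exclusion rule: once `FileSystemId` is given, any other setting in the same section is rejected instead of being silently ignored. The sketch below illustrates that rule on a plain dictionary. The function name `validate_efs_settings`, the error wording, and the use of `Encrypted` as a sample sibling key are assumptions made for illustration, not the actual ParallelCluster validator.

```python
from typing import Dict, List


def validate_efs_settings(efs_settings: Dict[str, object]) -> List[str]:
    """Return validation errors for an EfsSettings-like mapping.

    Hypothetical helper: it only models the rule stated in the changelog,
    namely that FileSystemId may not be combined with other settings.
    """
    if "FileSystemId" not in efs_settings:
        return []
    extra = sorted(key for key in efs_settings if key != "FileSystemId")
    if extra:
        return ["FileSystemId cannot be combined with: " + ", ".join(extra)]
    return []


if __name__ == "__main__":
    # Referencing an existing file system on its own is accepted.
    print(validate_efs_settings({"FileSystemId": "fs-0123456789abcdef0"}))
    # Adding another setting next to FileSystemId now produces an error.
    print(validate_efs_settings({"FileSystemId": "fs-0123456789abcdef0", "Encrypted": True}))
```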

3.2.1

-----

**ENHANCEMENTS**
- Improve the logic to associate the host routing tables to the different network cards to better support EC2 instances with several NICs.

**CHANGES**
- Upgrade NVIDIA driver to version 470.141.03.
- Upgrade NVIDIA Fabric Manager to version 470.141.03.
- Disable cron job tasks man-db and mlocate, which may have a negative impact on node performance.
- Upgrade Intel MPI Library to 2021.6.0.602.
- Upgrade Python from 3.7.10 to 3.7.13 in response to this [security risk](https://nvd.nist.gov/vuln/detail/CVE-2021-3737).

**BUG FIXES**
- Avoid failing on DescribeCluster when cluster configuration is not available.

3.2.0

Not secure

------

**ENHANCEMENTS**
- Add support for memory-based job scheduling in Slurm
  - Configure compute nodes real memory in the Slurm cluster configuration.
  - Add new configuration parameter `Scheduling/SlurmSettings/EnableMemoryBasedScheduling` to enable memory-based scheduling in Slurm.
  - Add new configuration parameter `Scheduling/SlurmQueues/ComputeResources/SchedulableMemory` to override default value of the memory seen by the scheduler on compute nodes.
- Improve flexibility on cluster configuration updates to avoid the stop and start of the entire cluster whenever possible.
  - Add new configuration parameter `Scheduling/SlurmSettings/QueueUpdateStrategy` to set the preferred strategy to adopt for compute nodes needing a configuration update and replacement.
- Improve failover mechanism over available compute resources when hitting insufficient capacity issues with EC2 instances. Disable compute nodes for a configurable amount of time (default 10 min) when a node launch fails due to insufficient capacity.
- Add support to mount existing FSx for ONTAP and FSx for OpenZFS file systems.
- Add support to mount multiple instances of existing EFS, FSx for Lustre, FSx for ONTAP, and FSx for OpenZFS file systems.
- Add support for FSx for Lustre Persistent_2 deployment type when creating a new file system.
- Prompt user to enable EFA for supported instance types when using `pcluster configure` wizard.
- Add support for rebooting compute nodes via Slurm.
- Improve handling of Slurm power states to also account for manual powering down of nodes.
- Add NVIDIA GDRCopy 2.3 into the product AMIs to enable low-latency GPU memory copy.
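
The insufficient-capacity failover described above amounts to a cooldown: after a failed launch, the affected capacity is skipped for a configurable period (10 minutes by default) so that other compute resources are tried first. The following is a rough sketch of that idea, not ParallelCluster's implementation. The class `CapacityCooldown`, its method names, and the choice to track cooldowns per compute resource name are all assumptions for illustration.

```python
import time
from typing import Callable, Dict, Iterable, Optional


class CapacityCooldown:
    """Track compute resources that recently hit insufficient capacity.

    Hypothetical helper: names and granularity are illustrative only.
    """

    def __init__(self, cooldown_seconds: float = 600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._cooldown = cooldown_seconds  # 600 s mirrors the 10 min default
        self._clock = clock                # injectable so the logic can be tested
        self._disabled_until: Dict[str, float] = {}

    def record_insufficient_capacity(self, compute_resource: str) -> None:
        """Disable a compute resource until the cooldown has elapsed."""
        self._disabled_until[compute_resource] = self._clock() + self._cooldown

    def is_available(self, compute_resource: str) -> bool:
        """Return True when the resource is not in its cooldown window."""
        until = self._disabled_until.get(compute_resource)
        if until is None:
            return True
        if self._clock() >= until:
            del self._disabled_until[compute_resource]
            return True
        return False

    def pick(self, candidates: Iterable[str]) -> Optional[str]:
        """Return the first candidate that is currently available, if any."""
        for candidate in candidates:
            if self.is_available(candidate):
                return candidate
        return None


if __name__ == "__main__":
    fake_now = [0.0]
    cooldown = CapacityCooldown(clock=lambda: fake_now[0])
    cooldown.record_insufficient_capacity("cr-large")
    print(cooldown.pick(["cr-large", "cr-small"]))  # falls over to cr-small
    fake_now[0] = 600.0                             # ten minutes later
    print(cooldown.pick(["cr-large", "cr-small"]))  # cr-large is eligible again
```

Injecting the clock keeps the sketch deterministic; a real scheduler integration would use wall-clock or monotonic time directly.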

**CHANGES**
- Upgrade EFA installer to version 1.17.2
  - EFA driver: `efa-1.16.0-1`
  - EFA configuration: `efa-config-1.10-1`
  - EFA profile: `efa-profile-1.5-1`
  - Libfabric: `libfabric-aws-1.16.0~amzn2.0-1`
  - RDMA core: `rdma-core-41.0-2`
  - Open MPI: `openmpi40-aws-4.1.4-2`
- Upgrade NICE DCV to version 2022.0-12760.
- Upgrade NVIDIA driver to version 470.129.06.
- Upgrade NVIDIA Fabric Manager to version 470.129.06.
- Change default EBS volume types from gp2 to gp3 for both the root and additional volumes.
- Changes to FSx for Lustre file systems created by ParallelCluster:
  - Change the default deployment type to `Scratch_2`.
  - Change the Lustre server version to `2.12`.
- Do not require `PlacementGroup/Enabled` to be set to `true` when passing an existing `PlacementGroup/Id`.
- Add `parallelcluster:cluster-name` tag to all the resources created by ParallelCluster.
- Do not allow setting `PlacementGroup/Id` when `PlacementGroup/Enabled` is explicitly set to `false`.
- Add `lambda:ListTags` and `lambda:UntagResource` to `ParallelClusterUserRole` used by ParallelCluster API stack for cluster update.
- Restrict IPv6 access to IMDS to root and cluster admin users only, when configuration parameter `HeadNode/Imds/Secured` is `true` (the default).
- With a custom AMI, use the AMI root volume size instead of the ParallelCluster default of 35 GiB. The value can be changed in the cluster configuration file.
- Automatically disable the compute fleet when the configuration parameter `Scheduling/SlurmQueues/ComputeResources/SpotPrice` is lower than the minimum required Spot request fulfillment price.
- Show `requested_value` and `current_value` values in the change set when adding or removing a section during an update.
- Disable `aws-ubuntu-eni-helper` service in DLAMI to avoid conflicts with `configure_nw_interface.sh` when configuring instances with multiple network cards.
- Remove support for Python 3.6.
- Set MTU to 9001 for all the network interfaces when configuring instances with multiple network cards.
- Remove the trailing dot when configuring the compute node FQDN.
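
The two `PlacementGroup` changes in the list above combine into one rule: an existing `PlacementGroup/Id` is accepted when `Enabled` is `true` or left unset, and rejected only when `Enabled` is explicitly `false`. A minimal sketch of that check follows, assuming the settings are available as an optional boolean and an optional string. The function names and the error message are hypothetical, and `placement_group_in_use` reflects an inference from the release notes rather than documented behaviour.

```python
from typing import Optional


def validate_placement_group(enabled: Optional[bool],
                             group_id: Optional[str]) -> Optional[str]:
    """Return an error message, or None when the combination is accepted.

    `enabled` is None when the Enabled key is omitted from the configuration.
    """
    if group_id is not None and enabled is False:
        return ("PlacementGroup/Id cannot be set when "
                "PlacementGroup/Enabled is explicitly false")
    return None


def placement_group_in_use(enabled: Optional[bool],
                           group_id: Optional[str]) -> bool:
    """Assumption: an Id alone is enough to request a placement group."""
    return group_id is not None or enabled is True


if __name__ == "__main__":
    # The Id value below is a made-up placeholder.
    print(validate_placement_group(None, "my-existing-group"))   # accepted
    print(validate_placement_group(False, "my-existing-group"))  # rejected
```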

**BUG FIXES**
- Fix Slurm issue that prevents idle nodes termination.
- Fix the default behavior to skip the ParallelCluster validation and test steps when building a custom AMI.
- Fix file handle leak in `computemgtd`.
- Fix race condition that was sporadically causing launched instances to be immediately terminated because they were not yet available in the EC2 DescribeInstances response.
- Fix support for `DisableSimultaneousMultithreading` parameter on instance types with Arm processors.
- Fix ParallelCluster API stack update failure when upgrading from a previous version. Add resource pattern used for the `ListImagePipelineImages` action in the `EcrImageDeletionLambdaRole`.
- Fix ParallelCluster API adding missing permissions needed to import/export from S3 when creating an FSx for Lustre storage.

3.1.5

Not secure

------

**CHANGES**
- Upgrade EFA installer to `1.18.0`
  - Efa-driver: `efa-1.16.0-1`
  - Efa-config: `efa-config-1.11-1`
  - Efa-profile: `efa-profile-1.5-1`
  - Libfabric-aws: `libfabric-aws-1.16.0~amzn4.0-1`
  - Rdma-core: `rdma-core-41.0-2`
  - Open MPI: `openmpi40-aws-4.1.4-2`
- Add `lambda:ListTags` and `lambda:UntagResource` to `ParallelClusterUserRole` used by ParallelCluster API stack for cluster update.
- Upgrade Intel MPI Library to 2021.6.0.602.
- Upgrade NVIDIA driver to version 470.141.03.
- Upgrade NVIDIA Fabric Manager to version 470.141.03.

**BUG FIXES**
- Fix Slurm issue that prevents idle nodes termination.
