TorchServe

Latest version: v0.11.0



Blogs
+ [High performance Llama 2 deployments with AWS Inferentia2 using TorchServe](https://pytorch.org/blog/high-performance-llama/)
+ [ML Model Server Resource Saving - Transition From High-Cost GPUs to Intel CPUs and oneAPI powered Software with performance](https://pytorch.org/blog/ml-model-server-resource-saving/)
+ [Run multiple generative AI models on GPU using Amazon SageMaker multi-model endpoints with TorchServe and save up to 75% in inference costs](https://aws.amazon.com/blogs/machine-learning/run-multiple-generative-ai-models-on-gpu-using-amazon-sagemaker-multi-model-endpoints-with-torchserve-and-save-up-to-75-in-inference-costs/)

New Features
+ Support PyTorch 2.1.0 and Python 3.11 2621 2691 2697 agunapal
+ Supported continuous batching for LLM inference 2628 mreso lxning
+ Supported dynamic loading of 3rd-party packages on SageMaker Multi-Model Endpoints 2535 lxning
+ Added a DALI handler for preprocessing and updated the Nvidia DALI example 2485 jagadeeshi2i
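Continuous batching lets new generation requests join an in-flight batch between decode steps instead of waiting for the current batch to drain. A stdlib-only toy sketch of the scheduling idea (the queue and step logic here are illustrative, not TorchServe's implementation):

```python
from collections import deque

class ContinuousBatcher:
    """Toy scheduler: requests join the active batch between decode steps."""

    def __init__(self, max_batch_size):
        self.max_batch_size = max_batch_size
        self.waiting = deque()   # requests not yet admitted to the batch
        self.active = {}         # request_id -> tokens still to generate

    def submit(self, request_id, num_tokens):
        self.waiting.append((request_id, num_tokens))

    def step(self):
        """One decode step: admit waiting requests into free slots,
        'generate' one token per active request, retire finished ones."""
        while self.waiting and len(self.active) < self.max_batch_size:
            rid, n = self.waiting.popleft()
            self.active[rid] = n
        finished = []
        for rid in list(self.active):
            self.active[rid] -= 1        # one token generated this step
            if self.active[rid] == 0:
                finished.append(rid)
                del self.active[rid]     # slot freed mid-flight
        return finished

batcher = ContinuousBatcher(max_batch_size=2)
batcher.submit("a", 1)
batcher.submit("b", 3)
batcher.submit("c", 2)
done = []
while batcher.active or batcher.waiting:
    done.extend(batcher.step())
print(done)  # ['a', 'b', 'c'] -- "c" joined as soon as "a" finished
```

Note that request "c" is admitted the moment "a" completes, rather than waiting for "b" to finish the whole batch — that slot reuse is the throughput win for LLM serving.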

New Examples
1. Deploy Llama2 on Inferentia2 2458 namannandan
2. [Using TorchServe on SageMaker Inf2.24xlarge with Llama2-13B](https://github.com/aws/amazon-sagemaker-examples-community/blob/main/torchserve/inf2/llama2/llama-2-13b.ipynb) lxning
3. PyTorch tensor parallel on Llama2 example 2623 2689 HamidShojanazeri
4. Enabled Better Transformer (i.e. Flash Attention 2) on Llama2 2700 HamidShojanazeri lxning
5. Llama2 Chatbot on Mac 2618 agunapal
6. ASR speech recognition example 2047 husenzhang
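Once deployed, each of these examples is exercised through TorchServe's inference REST API. A minimal client sketch (host and model name are placeholders; the request is only constructed here, not sent):

```python
import json
import urllib.request

def build_predict_request(host, model_name, payload):
    """Build a POST request for TorchServe's /predictions/{model} endpoint."""
    url = f"http://{host}/predictions/{model_name}"
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# 8080 is TorchServe's default inference port; the model name is a placeholder.
req = build_predict_request("localhost:8080", "llama2-13b", {"prompt": "Hello"})
print(req.full_url)  # http://localhost:8080/predictions/llama2-13b
# With a server running, the call would be: urllib.request.urlopen(req)
```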

Improvements
+ Fixed typo in BaseHandler 2547 a-ys
+ Created merge_queue workflow for CI 2548 msaroufim
+ Fixed typo in artifact terminology unification 2551 park12sj
+ Added env hints in model_service_worker 2540 ZachOBrien
+ Refactored conda build scripts to publish all binaries 2561 agunapal
+ Fixed response return type in KServe 2566 jagadeeshi2i
+ Added torchserve-kfs nightly build 2574 jagadeeshi2i
+ Added regression tests for all CPU binaries 2562 agunapal
+ Updated CICD runners 2586 2597 2636 2627 2677 2710 2696 agunapal msaroufim
+ Upgraded newman version to 5.3.2 2598 2603 agunapal
+ Updated opt benchmark config for inf2 2617 namannandan
+ Added ModelRequestEncoderTest 2580 abergmeier
+ Added manually dispatch workflow 2686 msaroufim
+ Updated test wheels with PyTorch 2.1.0 2684 agunapal
+ Allowed parallel level = 1 to run in torchrun mode 2608 lxning
+ Fixed metric unit assignment backward compatibility 2693 namannandan

Documentation
+ Updated MPS readme 2543 sekyondaMeta
+ Updated large model inference readme 2542 sekyondaMeta
+ Fixed bash snippets in examples/image_classifier/mnist/Docker.md 2345 dmitsf
+ Fixed typo in kubernetes/autoscale.md 2393 CandiedCode
+ Fixed path in examples/image_classifier/resnet_18/README.md 2568 udaij12
+ Model Loading Guidance 2592 agunapal
+ Updated Metrics readme 2560 sekyondaMeta
+ Display nightly workflow status badge in README 2619 2666 agunapal msaroufim
+ Updated torch.compile information in examples/pt2/README.md 2706 agunapal
+ [Deploy model using TorchServe on SageMaker tutorial](https://docs.aws.amazon.com/sagemaker/latest/dg/deploy-models-frameworks-torchserve.html) lxning

Platform Support
Ubuntu 16.04, Ubuntu 18.04, Ubuntu 20.04, MacOS 10.14+, Windows 10 Pro, Windows Server 2019, Windows Subsystem for Linux (Windows Server 2019, WSLv1, Ubuntu 18.04). TorchServe now requires Python 3.8 and above, and JDK 17.
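With the runtime floor now at Python 3.8, a startup guard can fail fast on unsupported interpreters. A generic sketch (not TorchServe's own check):

```python
import sys

MIN_PYTHON = (3, 8)  # TorchServe's minimum supported interpreter

def check_python(version_info=sys.version_info):
    """Return True when the interpreter meets the minimum version."""
    return tuple(version_info[:2]) >= MIN_PYTHON

if not check_python():
    sys.exit(f"TorchServe requires Python {'.'.join(map(str, MIN_PYTHON))}+")

print(check_python((3, 7, 0)))   # False: below the floor
print(check_python((3, 11, 4)))  # True
```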


0.10.0

New Features
+ **Kubernetes HPA support** - Added [support](https://github.com/pytorch/serve/issues/714) for Kubernetes HPA.
+ **Faster transformer example** - Added [example](https://github.com/pytorch/serve/tree/master/examples/FasterTransformer_HuggingFace_Bert) for Faster transformer for optimized transformer model inference.
+ **(experimental) torchprep support** - Added experimental [CLI tool](https://github.com/pytorch/serve/tree/master/experimental/torchprep) to prepare Pytorch models for efficient inference.
+ **Custom metrics example** - Added [example](https://github.com/pytorch/serve/tree/master/examples/custom_metrics) for custom metrics with mtail metrics exporter and Prometheus.
+ **Reactjs example for Image Classifier** - Added [example](https://github.com/pytorch/serve/issues/1109) for Reactjs Image Classifier.
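The custom-metrics example pairs mtail with Prometheus, which scrapes metrics in a plain-text exposition format. A stdlib sketch of rendering one counter in that format (the metric name shown is hypothetical):

```python
def render_counter(name, help_text, value, labels=None):
    """Render a single counter in Prometheus text exposition format."""
    label_str = ""
    if labels:
        pairs = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        label_str = "{" + pairs + "}"
    return (
        f"# HELP {name} {help_text}\n"
        f"# TYPE {name} counter\n"
        f"{name}{label_str} {value}\n"
    )

text = render_counter(
    "ts_inference_requests_total",       # hypothetical metric name
    "Total inference requests served.",
    42,
    labels={"model": "mnist"},
)
print(text)
```

A Prometheus server pointed at an endpoint serving this text would scrape the counter with its `model` label intact.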

Improvements
+ **Batching inference exception support** - Optimized [batching](https://github.com/pytorch/serve/pull/1272/files) to fix a concurrent modification exception that was occurring with batch inference.
+ **k8s cluster creation support upgrade** - Updated Kubernetes cluster creation scripts for v1.17 [support](https://github.com/pytorch/serve/issues/580).
+ **Nvidia devices visibility support** - Added [support](https://github.com/pytorch/serve/blob/master/docs/configuration.md#nvidia-control-visibility) for NVIDIA devices visibility.
+ **Large image support** - Added [support](https://github.com/pytorch/serve/pull/1271) for PIL.Image.MAX_IMAGE_PIXELS.
+ **Custom HTTP status support** - Added [support](https://github.com/pytorch/serve/blob/master/docs/custom_service.md#returning-custom-error-codes) to return custom http status from a model handler.
+ **TS_CONFIG_FILE env var support** - Added [support](https://github.com/pytorch/serve/issues/1257) for setting `TS_CONFIG_FILE` as env var.
+ **Frontend build optimization** - Optimized [frontend](https://github.com/pytorch/serve/issues/1306) to reduce build times by 3.7x.
+ **Warmup in benchmark** - Added [support](https://github.com/pytorch/serve/issues/1183) for warmup in benchmark scripts.
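With `TS_CONFIG_FILE` now honored as an environment variable, config-file resolution can be sketched as below (the precedence order here is illustrative; see TorchServe's configuration docs for the exact rules):

```python
import os

def resolve_config_path(cli_arg=None, environ=os.environ):
    """Pick a config file: TS_CONFIG_FILE env var first, then a CLI flag,
    then a default file in the working directory (illustrative precedence)."""
    return (
        environ.get("TS_CONFIG_FILE")
        or cli_arg
        or "config.properties"
    )

print(resolve_config_path(environ={}))                      # config.properties
print(resolve_config_path(cli_arg="/etc/ts.properties", environ={}))
print(resolve_config_path(environ={"TS_CONFIG_FILE": "/tmp/ts.properties"}))
```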

0.9.0

New Features
+ **Model configuration support** - Added [support](https://github.com/pytorch/serve/blob/master/docs/configuration.md#config-model) for [model performance tuning on SageMaker](https://github.com/lxning/torchserve_perf/blob/master/torchserve_perf.ipynb) via model configuration in config.properties.
+ **Serialize config snapshots to DynamoDB** - Added [support](https://github.com/pytorch/serve/blob/master/plugins/docs/ddb_endpoint.md) for serializing config snapshots to DDB.
+ **Prometheus metrics plugin support** - Added [support](https://github.com/pytorch/serve/issues/611) for Prometheus metrics plugin.
+ **Kubeflow Pipelines support** - Added support for Kubeflow Pipelines and Google Vertex AI managed pipelines; see examples [here](https://github.com/kubeflow/pipelines/tree/master/samples/contrib/pytorch-samples).
+ **KFServing docker support** - Added [production docker](https://github.com/pytorch/serve/issues/1090) for KFServing.
+ **Python 3.9 support** - TorchServe is now certified working with Python 3.9.
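The per-model tuning above lives in config.properties, a plain key=value file. A minimal parser sketch (keys shown are illustrative; TorchServe's real parser also handles nested JSON for per-model settings):

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config

sample = """
# inference endpoint
inference_address=http://127.0.0.1:8080
batch_size=4
"""
cfg = parse_properties(sample)
print(cfg["inference_address"])  # http://127.0.0.1:8080
```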

Improvements
+ **HF BERT models multiple GPU support** - Added multi-gpu [support](https://github.com/pytorch/serve/pull/1141/files) for HuggingFace BERT models.
+ **Error log for custom Python package installation** - Added [support](https://github.com/pytorch/serve/issues/1086) to log errors from custom Python package installation.
+ **Workflow documentation optimization** - [Optimized](https://github.com/pytorch/serve/pull/1154) workflow documentation.

Tooling improvements
+ **Mar file automation integration** - Integrated [mar file generation automation](https://github.com/pytorch/serve/pull/1140) into pytest and postman test.
+ **Benchmark automation for AWS neuron support** - Added [support](https://github.com/pytorch/serve/pull/1099) for AWS neuron benchmark automation.
+ **Staging binary build support** - Added [support](https://github.com/pytorch/serve/pull/1129) for staging binary build.
