BentoML Changelog



Bug Fix
  This is a minor release containing one bug fix for issue 1318, where the docker build process for the BentoML API model server was broken due to an error in the init shell script. The issue was fixed in 1319 and is included in this release.
  Our integration tests did not catch this issue because the development and CI/test environments bundle the "dirty" local BentoML installation in the generated Dockerfile, whereas the production release of BentoML uses the version installed from PyPI. The issue in 1318 was an edge case that can be triggered only when using the released version of BentoML and the published docker image. As part of our QA process, we are investigating ways to run all our integration tests against a preview release before making a final release, which should help prevent this type of bug from reaching final releases in the future.


New Features & Improvements
  * Improved Model Management APIs 1126 1241 by yubozhao
  Python APIs for model management:
  ```python
  from bentoml.yatai.client import get_yatai_client

  # Save and register the bento service locally, then push it to a remote yatai service
  yc = get_yatai_client('')

  # Pull a bento service from a remote yatai server and register it locally
  yc = get_yatai_client('')

  # Delete in local yatai
  yatai_client = get_yatai_client()

  # Delete in batch by labels
  yatai_client = get_yatai_client()
  yatai_client.prune(labels='cicd=failed, framework In (sklearn, xgboost)')

  # Get bento service metadata
  yatai_client.repository.get('bento_name:version', yatai_url='')

  # List bento services by label
  yatai_client.repository.list(labels='label_key In (value1, value2), label_key2 Exists', yatai_url='')
  ```
  New CLI commands for model management:
  Push local bento service to remote yatai service:
  $ bentoml push bento_service_name:version --yatai-url
  Added `--yatai-url` option for the following CLI commands to interact with remote yatai service directly:
  bentoml get
  bentoml list
  bentoml delete
  bentoml retrieve
  bentoml run
  bentoml serve
  bentoml serve-gunicorn
  bentoml info
  bentoml containerize
  bentoml open-api-spec
  * Model Metadata API 1179 shoutout to jackyzha0 for designing and building this feature!
  Ability to save additional metadata for any artifact type, e.g.:
  ```python
  model_metadata = {
      'k1': 'v1',
      'job_id': 'ABC',
      'score': 0.84,
      'datasets': ['A', 'B'],
  }
  svc.pack("model", test_model, metadata=model_metadata)

  # the metadata is accessible after loading the saved bundle:
  loaded_service = bentoml.load(str(tmpdir))
  ```
  * Improved Tensorflow Support, by bojiang
  * Make the packed model behave the same as after the model was saved and loaded again 1231
  * Fix TfTensorOutput raising TypeError when micro-batch is enabled 1251
  * Optimize auto-casting of TfSavedModelArtifact & clearer feedback
  * Improve KerasModelArtifact to work with tf2 1295
  * Automated AWS EC2 deployment 1160 massive 3800+ line PR by mayurnewase
  * Create auto-scaling endpoint on AWS EC2 with just one command, see documentation here
  * Add MXNet Gluon support 1264 by liusy182
  * Enable input & output data capture in Sagemaker deployment 1189 by j-hartshorn
  * Faster docker image rebuild when only model artifacts are updated 1199
  * Support URL location prefix in yatai-service gRPC/Web server 1063 1184
  * Support relative path for showing Swagger UI page in the model server 1207
  * Add onnxruntime gpu as supported backend 1213
  * Add option to disable swagger UI 1244 by liusy182
  * Add label and artifact metadata display to yatai web ui 1249
  * Make bentoml module executable 1274
  python -m bentoml <subcommand>
  * Allow setting micro batching parameters from CLI 1282 by jsemric
  bentoml serve-gunicorn --enable-microbatch --mb-max-latency 3333 --mb-max-batch-size 3333 IrisClassifier:20201202154246_C8DC0A
  Bug fixes
  * Allow deleting bento that was previously deleted with the same name and version 1211
  * Construct docker API client from env 1233
  * Pin-down SqlAlchemy version 1238
  * Avoid potential TypeError in batching server 1252
  * Fix inference API docstring override by default 1302
  * Add examples of queries with requests for adapters 1202
  * Update import paths to reflect fastai2->fastai rename 1227
  * Add model artifact metadata information to the core concept page 1259
  * Update adapters.rst to include new input adapters 1269
  * Update quickstart guide 1262
  * Docs for gluon support 1271
  * Fix CURL commands for posting files in input adapters doc string 1307
  Internal, CI, and Tests
  * Fix installing bundled pip dependencies in Azure and Sagemaker deployments 1214 (affects bentoml developers only)
  * Add Integration test for Fasttext 1221
  * Add integration test for spaCy 1236
  * Add integration test for models using tf native API 1245
  * Add tests for run_api_server_docker_container microbatch 1247
  * Add integration test for LightGBM 1243
  * Update Yatai web ui node dependencies version 1256
  * Add integration test for bento management 1263
  * Add yatai server integration tests to Github CI 1265
  * Update e2e yatai service tests 1266
  * Include additional information for EC2 test 1270
  * Refactor CI for TensorFlow2 1277
  * Make tensorflow integration tests run faster 1278
  * Fix overridden protobuf version in CI 1286
  * Add integration test for tf1 1285
  * Refactor yatai service integration test 1290
  * Refactor Saved Bundle Loader 1291
  * Fix flaky yatai service integration tests 1298
  * Refine KerasModelArtifact & its integration test 1295
  * Improve API server integration tests 1299
  * Add integration tests for ragged_tensor 1303
  * We have started using the GitHub Projects feature to track roadmap items for BentoML
  * We are hiring senior engineers and a lead developer advocate to join our team, let us know if you or someone you know might be interested 👉
  * Apologies for the long wait between the 0.9 and 0.10 releases; we are getting back to our bi-weekly release schedule now! We need help documenting new features, writing release notes, and QA-ing new releases before they go out, so let us know if you'd be interested in helping out!
  Thank you everyone for contributing to this release! j-hartshorn withsmilo yubozhao bojiang changhw01 mayurnewase telescopic jackyzha0 pncnmnp kishore-ganesh rhbian liusy182 awalvie cathy-kim jsemric 🎉🎉🎉


Bug fixes
  * Fixed retrieving BentoService from S3/MinIO based storage 1174
  * Fixed an issue when using inference API function optional parameter `tasks` / `task` 1171




What's New
  * New input/output adapter design that lets users choose between batch and non-batch implementations
  * Speed up the API model server docker image build time
  * Changed the recommended import path of artifact classes, now artifact classes should be imported from `bentoml.frameworks.*`
  * Improved python pip package management
  * Huggingface/Transformers support!!
  * Managed packaged models with Labels API
  * Support GCS(Google Cloud Storage) as model storage backend in YataiService
  * Current Roadmap for feedback:
  New Input/Output adapter design
  A massive refactoring of BentoML's inference API and input/output adapter design, led by bojiang with help from akainth015.
  **BREAKING CHANGE:** API definition now requires declaring if it is a batch API or non-batch API:
  ```python
  from typing import List

  from bentoml import env, artifacts, api, BentoService
  from bentoml.adapters import JsonInput
  from bentoml.types import JsonSerializable  # type annotations are optional

  class MyPredictionService(BentoService):

      @api(input=JsonInput(), batch=True)
      def predict_batch(self, parsed_json_list: List[JsonSerializable]):
          results = self.artifacts.classifier([j['text'] for j in parsed_json_list])
          return results

      @api(input=JsonInput())  # default batch=False
      def predict_non_batch(self, parsed_json: JsonSerializable):
          results = self.artifacts.classifier([parsed_json['text']])
          return results[0]
  ```
  For APIs with `batch=True`, the user-defined API function is required to process a list of input items at a time and return a list of results of the same length. In contrast, `api` defaults to `batch=False`, which processes one input item at a time. Implementing a batch API allows your workload to benefit from BentoML's adaptive micro-batching mechanism when serving online traffic, and also speeds up offline batch inference jobs. We recommend using `batch=True` if performance & throughput is a concern. Non-batch APIs are usually easier to implement, good for quick POCs, simple use cases, and deploying on serverless platforms such as AWS Lambda, Azure Functions, and Google KNative.
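  The batch contract can be illustrated without BentoML itself. In this plain-Python sketch, `classify` is a hypothetical stand-in for `self.artifacts.classifier`:

  ```python
  def classify(texts):
      # hypothetical model: one label per input text
      return ["positive" if "good" in t else "negative" for t in texts]

  def predict_batch(parsed_json_list):
      # batch=True contract: a list of inputs in, a same-length list of results out
      return classify([j["text"] for j in parsed_json_list])

  def predict_non_batch(parsed_json):
      # batch=False contract: one input in, one result out
      return classify([parsed_json["text"]])[0]

  batch_results = predict_batch([{"text": "good day"}, {"text": "bad day"}])
  single_result = predict_non_batch({"text": "good day"})
  ```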
  Read more about this change and example usage here:
  **BREAKING CHANGE:** For `DataframeInput` and `TfTensorInput` users, it is now required to add `batch=True`
  DataframeInput and TfTensorInput are special input types that only support accepting a batch of input at one time.
  Input data validation while handling batch input
  When the API function receives a list of inputs, it is now possible to reject a subset of the input data and return an error code to the client if the input data is invalid or malformed. Users can do this via the `InferenceTask.discard` API; here's an example:
  ```python
  from typing import List

  from bentoml import env, artifacts, api, BentoService
  from bentoml.adapters import JsonInput
  from bentoml.types import JsonSerializable, InferenceTask  # type annotations are optional

  class MyPredictionService(BentoService):

      @api(input=JsonInput(), batch=True)
      def predict_batch(self, parsed_json_list: List[JsonSerializable], tasks: List[InferenceTask]):
          model_input = []
          for json, task in zip(parsed_json_list, tasks):
              if "text" in json:
                  model_input.append(json['text'])
              else:
                  task.discard(http_status=400, err_msg="input json must contain `text` field")
          results = self.artifacts.classifier(model_input)
          return results
  ```
  The number of discarded tasks plus the length of the returned results list should equal the length of the input list; this allows BentoML to match the results back to the tasks that have not been discarded.
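  The bookkeeping rule can be sketched in plain Python (a toy `discard` stand-in, not BentoML's implementation):

  ```python
  def run_batch(inputs):
      # indexes of inputs rejected up-front (task.discard(...) in BentoML)
      discarded = []
      model_input = []
      for i, item in enumerate(inputs):
          if "text" not in item:
              discarded.append(i)
          else:
              model_input.append(item["text"])
      # stand-in model: uppercase each accepted text
      results = [t.upper() for t in model_input]
      # discarded tasks + returned results must account for every input
      assert len(discarded) + len(results) == len(inputs)
      return discarded, results

  discarded, results = run_batch([{"text": "a"}, {}, {"text": "b"}])
  ```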
  Allow fine-grained control of the HTTP response, CLI inference job output, etc. E.g.:
  ```python
  import bentoml
  from bentoml.adapters import JsonInput
  from bentoml.types import JsonSerializable, InferenceTask, InferenceResult, InferenceError

  class MyService(bentoml.BentoService):

      @bentoml.api(input=JsonInput(), batch=False)
      def predict(self, parsed_json: JsonSerializable, task: InferenceTask) -> InferenceResult:
          if task.http_headers['Accept'] == "application/json":
              predictions = self.artifact.model.predict([parsed_json])
              return InferenceResult(
                  data=predictions[0],
                  http_status=200,
                  http_headers={"Content-Type": "application/json"},
              )
          else:
              return InferenceError(err_msg="application/json output only", http_status=400)
  ```
  Or when batch=True:
  ```python
  from typing import List

  import bentoml
  from bentoml.adapters import JsonInput
  from bentoml.types import JsonSerializable, InferenceTask, InferenceResult, InferenceError

  class MyService(bentoml.BentoService):

      @bentoml.api(input=JsonInput(), batch=True)
      def predict(self, parsed_json_list: List[JsonSerializable], tasks: List[InferenceTask]) -> List[InferenceResult]:
          rv = []
          predictions = self.artifact.model.predict(parsed_json_list)
          for task, prediction in zip(tasks, predictions):
              if task.http_headers['Accept'] == "application/json":
                  rv.append(InferenceResult(
                      data=prediction,
                      http_status=200,
                      http_headers={"Content-Type": "application/json"},
                  ))
              else:
                  rv.append(InferenceError(err_msg="application/json output only", http_status=400))
                  # or: task.discard(err_msg="application/json output only", http_status=400)
          return rv
  ```
  Other adapter changes:
  * Added 3 base adapters for implementing advanced adapters: FileInput, StringInput, MultiFileInput
  * Implementing new adapters that support micro-batching is a lot easier now:
  * Per inference task prediction log 1089
  * More adapters support launching batch inference job from BentoML CLI run command now, see API reference for detailed examples:
  Docker Build Improvements
  * Optimize docker image build time (1081) kudos to ZeyadYasser!!
  * Per python minor version base image to speed up image building 1101 1096, thanks gregd33!!
  * Add "latest" tag to all user-facing docker base images (1046)
  Improved pip package management
  Setting pip install options in BentoService `env` specification
  As suggested here, thanks danield137 for suggesting the `pip_extra_index_url` option! For example (index URL values elided):
  ```python
  @env(
      pip_index_url='',
      pip_trusted_host='',
      pip_extra_index_url='',
  )
  class IrisClassifier(BentoService):
      ...
  ```
  **BREAKING CHANGE:** Due to this change, we have removed the previous docker build args `PIP_INDEX_URL` and `PIP_TRUSTED_HOST`, as they may conflict with settings in the base image 1036
  * Support passing a conda environment.yml file to `env`, as suggested in 725
  * When a version is not specified in the `pip_packages` list, BentoML pins it to the version found in the current Python session. This now also applies to packages added from adapter and artifact classes
  * Support specifying package requirement ranges, e.g.:
  ```python
  @env(pip_packages=["abc==1.3", "foo>1.2,<=1.4"])
  ```
  It can be any pip version requirement specifier
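  For illustration, the specifier semantics can be checked with the `packaging` library (assumed installed; it implements the same PEP 440 rules pip uses):

  ```python
  from packaging.specifiers import SpecifierSet

  # the same range specifier as in the env() example above:
  # "1.2" is excluded by ">1.2", "1.5" is excluded by "<=1.4"
  spec = SpecifierSet(">1.2,<=1.4")
  matching = [v for v in ["1.2", "1.3", "1.4", "1.5"] if v in spec]
  ```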
  * Renamed `pip_dependencies` to `pip_packages` and `auto_pip_dependencies` to `infer_pip_packages`; the old APIs still work but will eventually be deprecated.
  GCS support in YataiService
  Adding Google Cloud Storage (GCS) support in YataiService as the storage backend. This is an alternative to AWS S3, MinIO, or a POSIX file system. 1017 - Thank you Korusuke PrabhanshuAttri for creating the GCS support!
  YataiService Labels API for model management
  Managed packaged models in YataiService with labels API implemented in 1064
  1. Add labels when saving a BentoService:
  ```python
  svc = MyBentoService()
  svc.save(labels={'my_key': 'my_value', 'test': 'passed'})
  ```
  2. Add label query for CLI commands
  * `bentoml get BENTO_NAME`, `bentoml list`, `bentoml deployment list`, `bentoml lambda list`, `bentoml sagemaker list`, `bentoml azure-functions list`
  * label query supports `=`, `!=`, `In`, `NotIn`, `Exists`, `DoesNotExist` operators
  -   e.g. `key1=value1, key2!=value2, env In (prod, staging), Key Exists, Another_key DoesNotExist`
  *Screenshots: simple key/value label selector; the `Exists`, `DoesNotExist`, and `In` operators; and multiple label queries.*
  3. Roadmap - add web UI for filtering and searching with labels API
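  The operator semantics above can be read as simple membership tests on a bento's label dict; a toy evaluator for a single clause (not BentoML's actual parser):

  ```python
  def match(labels, key, op, values=None):
      # labels: dict of label key -> value attached to a saved bento
      if op == "Exists":
          return key in labels
      if op == "DoesNotExist":
          return key not in labels
      if op == "In":
          return labels.get(key) in values
      if op == "NotIn":
          return labels.get(key) not in values
      raise ValueError(f"unknown operator: {op}")

  labels = {"env": "prod", "cicd": "passed"}
  in_prod = match(labels, "env", "In", ["prod", "staging"])
  has_owner = match(labels, "owner", "Exists")
  ```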
  New framework support: Huggingface/Transformers
  1090 1094 thanks vedashree29296 for contributing this!
  Usage & docs:
  Bug Fixes:
  * Fixed 1030 - bentoml serve fails when packaged on Windows and deployed on Linux 1044
  * Handle missing region during SageMaker deployment updates 1049
  Internal & Testing:
  * Re-organize artifacts related modules 1082, 1085
  * Refactoring & improvements around dependency management 1084, 1086
  * [TEST/CI] Add tests covering XgboostModelArtifact (1079)
  * [TEST/CI] Fix AWS moto related unit tests (1077)
  * Lock SQLAlchemy-utils version (1078)
  Contributors of 0.9.0 release
  Thank you all for contributing to this release!! danield137 ericmand ssakhavi aviaviavi dinakar29 umihui vedashree29296 joerg84 gregd33 mayurnewase narennadig akainth015 yubozhao bojiang


What's New
  Yatai service helm chart for Kubernetes deployment 945 by jackyzha0
  Helm chart offers a convenient way to deploy YataiService to a Kubernetes cluster
  1. Download BentoML source:
  ```shell
  $ git clone
  $ cd BentoML
  ```
  2. Install an ingress controller if your cluster doesn't already have one; the Yatai helm chart installs nginx-ingress by default:
  ```shell
  $ helm repo add ingress-nginx && helm dependencies build helm/YataiService
  ```
  3. Install the YataiService helm chart to the Kubernetes cluster:
  ```shell
  $ helm install -f helm/YataiService/values/postgres.yaml yatai-service YataiService
  ```
  4. To uninstall the YataiService from your cluster:
  ```shell
  $ helm uninstall yatai-service
  ```
  jackyzha0 added a great tutorial about YataiService helm chart deployment. You can find the guide at
  [Experimental] AnnotatedImageInput adapter for image plus additional JSON data 973 by ecrows
  The AnnotatedImageInput adapter is designed for the common use-cases of image input to include additional information such as object detection bounding boxes, segmentation masks, etc. for prediction. This new adapter significantly improves the developer experience over the previous workaround solution.
  **Warning:** Input adapters are currently under refactoring 1002; we may change the API for AnnotatedImageInput in future releases.
  ```python
  import bentoml
  from bentoml.adapters import AnnotatedImageInput
  from bentoml.artifact import TensorflowSavedModelArtifact

  CLASS_NAMES = ['cat', 'dog']

  @bentoml.artifacts([TensorflowSavedModelArtifact('classifier')])
  class PetClassification(bentoml.BentoService):

      @bentoml.api(input=AnnotatedImageInput)
      def predict(self, image, annotations):
          cropped_pets = some_pet_finder(image, annotations)
          results = self.artifacts.classifier.predict(cropped_pets)
          return [CLASS_NAMES[r] for r in results]
  ```
  Making a request using `curl`:
  ```shell
  $ curl -F image=@image.png -F annotations=@annotations.json http://localhost:5000/predict
  ```
  You can find the current API reference at
  * 992 Make the prediction and feedback loggers log to console by default - jackyzha0
  * 952 Add a tutorial for deploying BentoService to Azure SQL server to the documentation - yashika51
  Bug Fixes:
  * 987 & 991 Better AWS IAM role handling for SageMaker deployments - dinakar29
  * 995 Fix an edge case causing RecursionError when running the gunicorn server with `--enable-microbatch` on MacOS - bojiang
  * 1012 Fix ruamel.yaml missing issue when using containerized BentoService with Conda - parano
  Internal & Testing:
  * 983 Move CI tests to Github Actions
  Thank you, everyone, for contributing to this exciting release!
  bojiang jackyzha0 ecrows dinakar29 yashika51 akainth015


Bug fixes
  * API server showing a blank index page 977 975
  * Failure to package pip-installed dependencies in some edge cases 978 979


What's New
  Breaking Change: JsonInput migrating to batch API 860, 953
  We are officially changing JsonInput to use the batch-oriented syntax. As of this release (0.8.4), all input adapters in BentoML have migrated to this design. The main difference is that the input parameter of the user-defined API function is now a list of JSONSerializable objects (Dict, List, Integer, Float, Str) instead of a single JSONSerializable object, and the expected return value is an Iterable of exactly the same length. This makes it possible for API endpoints using the JsonInput adapter to take advantage of BentoML's adaptive micro-batching capability.
  Here is an example of how JsonInput (formerly JsonHandler) used to work:
  ```python
  def predict(self, parsed_json):
      results = self.artifacts.classifier([parsed_json['text']])
      return results[0]
  ```
  And here is an example with the new JsonInput class:
  ```python
  def predict(self, parsed_json_list):
      texts = [j['text'] for j in parsed_json_list]
      return self.artifacts.classifier(texts)
  ```
  The old non-batching JsonInput is still available to help with the transition: simply use `from bentoml.adapters import LegacyJsonInput as JsonInput` to replace the JsonInput or JsonHandler in code written before BentoML 0.8.4. `LegacyJsonInput` behaves exactly the same as JsonInput did in previous releases. We will keep supporting it until BentoML version 1.0.
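  Migrating a pre-0.8.4 single-item function to the new batch contract is mechanical; a plain-Python sketch (hypothetical `predict` bodies, no BentoML imports):

  ```python
  def predict_legacy(parsed_json):
      # old style: one JSONSerializable in, one result out
      return parsed_json["text"].upper()

  def predict_batch(parsed_json_list):
      # new style: list in, same-length iterable out
      return [predict_legacy(j) for j in parsed_json_list]

  out = predict_batch([{"text": "hello"}, {"text": "world"}])
  ```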
  Custom Web UI support in API Server (839)
  Custom web UI can be added to your API server now! Here is an example project:
  Add your web frontend project directory to your BentoService class with the `web_static_content` decorator, and BentoML will automatically bundle all the web UI files and host them when starting the API server:
  ```python
  @env(auto_pip_dependencies=True)
  @artifacts([SklearnModelArtifact('model')])
  @web_static_content('./static')
  class IrisClassifier(BentoService):

      @api(input=DataframeInput())
      def predict(self, df):
          return self.artifacts.model.predict(df)
  ```
  Artifact packing & loading workflow 911, 921, 949
  We have refactored the Artifact API, which brings more flexibility to how users package their trained models with BentoML's API.
  The most noticeable change: users can now separate the model training job from BentoService development - use the Artifact API to save a trained model from a training job, and load it later to create a BentoService class for model serving. E.g.:
  Step 1, model training:
  ```python
  from sklearn import svm
  from sklearn import datasets

  from bentoml.artifact import SklearnModelArtifact

  if __name__ == "__main__":
      # Load training data
      iris = datasets.load_iris()
      X, y = iris.data, iris.target

      # Model training
      clf = svm.SVC(gamma='scale')
      clf.fit(X, y)

      # Save just the trained model with SklearnModelArtifact to a specific directory
      btml_model_artifact = SklearnModelArtifact('model')
      btml_model_artifact.pack(clf)
      btml_model_artifact.save('/tmp/temp_bentoml_artifact')
  ```
  Step 2: Build BentoService class with the saved artifact:
  ```python
  from bentoml import env, artifacts, api, BentoService
  from bentoml.adapters import DataframeInput
  from bentoml.artifact import SklearnModelArtifact

  @env(auto_pip_dependencies=True)
  @artifacts([SklearnModelArtifact('model')])
  class IrisClassifier(BentoService):

      @api(input=DataframeInput())
      def predict(self, df):
          # Optional pre-processing, post-processing code goes here
          return self.artifacts.model.predict(df)

  if __name__ == "__main__":
      # Create an iris classifier service instance
      iris_classifier_service = IrisClassifier()

      # Load the previously saved artifact
      iris_classifier_service.artifacts.get('model').load('/tmp/temp_bentoml_artifact')

      saved_path = iris_classifier_service.save()
  ```
  This workflow makes developing and debugging BentoService code a lot easier: users no longer need to retrain their model every time they change something in the BentoService class definition and want to try it out.
  * Note that the old BentoService class method `pack` has been deprecated in this release 915
  Add `bentoml containerize` command (847,884,941)
  ```
  $ bentoml containerize --help
  Usage: bentoml containerize [OPTIONS] BENTO

    Containerizes given Bento into a ready-to-use Docker image.

  Options:
    -p, --push
    -t, --tag TEXT  Optional image tag. If not specified, Bento will
                    generate one from the name of the Bento.
  ```
  Support multiple images in the same request (828)
  A new input adapter class `MultiImageInput` has been added. It is designed for prediction services that require multiple image files as its input:
  ```python
  import bentoml
  from bentoml import BentoService
  from bentoml.adapters import MultiImageInput

  class MyService(BentoService):

      @bentoml.api(input=MultiImageInput(input_names=('imageX', 'imageY')))
      def predict(self, image_groups):
          for image_group in image_groups:
              image_array_x = image_group['imageX']
              image_array_y = image_group['imageY']
              ...
  ```
  Add FileInput adapter(734)
  A new input adapter class `FileInput` for handling arbitrary binary files as the input for your prediction service
  Added Ngrok support (917)
  Expose your local development model API server over a public URL endpoint, using Ngrok under the hood. To try it out, simply add the `--run-with-ngrok` flag to your `bentoml serve` CLI command, e.g.:
  bentoml serve IrisClassifier:latest --run-with-ngrok
  Add support for CoreML (939)
  Serving CoreML model on Mac OS is now supported! Users can also convert their models trained with other frameworks to the CoreML format, for better performance on Mac OS platforms. Here's an example with Pytorch model serving with CoreML and BentoML:
  ```python
  import torch
  from torch import nn

  class PytorchModel(nn.Module):
      def __init__(self):
          super().__init__()
          self.linear = nn.Linear(5, 1, bias=False)

      def forward(self, x):
          x = self.linear(x)
          return x
  ```
  ```python
  import numpy
  import pandas as pd
  import coremltools as ct
  from coremltools.models import MLModel  # pylint: disable=import-error

  import bentoml
  from bentoml.adapters import DataframeInput
  from bentoml.artifact import CoreMLModelArtifact

  @bentoml.artifacts([CoreMLModelArtifact('model')])
  class CoreMLClassifier(bentoml.BentoService):

      @bentoml.api(input=DataframeInput())
      def predict(self, df: pd.DataFrame) -> float:
          model: MLModel = self.artifacts.model
          input_data = df.to_numpy().astype(numpy.float32)
          output = model.predict({"input": input_data})
          return next(iter(output.values())).item()


  def convert_pytorch_to_coreml(pytorch_model: PytorchModel) -> ct.models.MLModel:
      """CoreML is not for training ML models but rather for converting pretrained models
      and running them on Apple devices. Therefore, in this example we convert the
      pretrained PytorchModel from the tests.integration.test_pytorch_model_artifact
      module into a CoreML module."""
      # test_df: a pandas DataFrame of sample model inputs, defined elsewhere
      traced_pytorch_model = torch.jit.trace(pytorch_model, torch.Tensor(test_df.values))
      model: MLModel = ct.convert(
          traced_pytorch_model, inputs=[ct.TensorType(name="input", shape=test_df.shape)]
      )
      return model


  if __name__ == '__main__':
      svc = CoreMLClassifier()
      pytorch_model = PytorchModel()
      model = convert_pytorch_to_coreml(pytorch_model)
      svc.pack('model', model)
  ```
  Breaking Change: Remove CLI --with-conda option 898
  Running inference jobs within an automatically generated conda environment seemed like a good idea at first, but we realized it introduces more problems than it solves. We are removing this option and encourage users to use docker for running inference jobs instead.
  * 966, 968 Faster `save` by improving python local module parsing code
  * 878, 879 Faster `import bentoml` with lazy module loader
  * 872 Add BentoService API name validation
  * 887 Set a smaller page limit for `bentoml list`
  * 916 Do not cache pip requirements in Dockerfile
  * 918 Improve error handling when micro batching service is unavailable
  * 925 Artifact refactoring: set_dependencies method
  * 932 Add warning for SavedBundle Python version mismatch
  * 904 JsonInput handle AWS Lambda event should ignore content type header
  * 951 Add openjdk to H2O artifact default conda dependencies
  * 958 Fix typo in cli default argument help message
  Bug fixes:
  * 864 Fix decode headers with latin1
  * 867 Fix DataFrameInput passing NaN values over HTTP JSON request
  * 869 Change the default mb_max_latency value to avoid flaky micro-batching initialization
  * 897 Fix yatai web client import
  * 907 Fix CORS option in AWS Lambda SAM config
  * 922 Fix lambda deployment when using AWS assumed-role ARN
  * 959 Fix `RecursionError: maximum recursion depth exceeded` when saving BentoService bundle
  * 969 Fix error in CLI command `bentoml --version`
  Internal & Testing
  * 870 Add docs for using BentoML's built-in benchmark client
  * 855, 871, 877 Add integration tests for dockerized BentoML API server workflow
  * 876, 937 Add integration test for Tensorflow SavedModel artifact
  * 951 H2O artifact integration test
  * 939 CoreML artifact integration test
  * 865 add makefile for BentoML developers
  * 868 API Server "/feedback" endpoint refactor
  * 908 BentoService base class refactoring and docstring improvements
  * 909 Refactor API Server startup
  * 910 Refactor API server performance tracing
  * 906 Fix yatai web ui startup script
  * 875 Increase micro batching server test coverage
  * 935 Fix list deployments error response
  Community Announcements:
  We have enabled the __Github Discussions__ feature 🎉
  This will be a new place for community members to connect, ask questions, and share anything related to model serving and BentoML.
  Thank you, everyone, for contributing to this amazing release loaded with new features and improvements! bojiang joshuacwnewton  guy4261 Sharathmk99 co42 jackyzha0 Korusuke akainth015 omrihar yubozhao


* Fix: 500 Error without message when micro-batch enabled 857
  * Fix: port conflict with --debug flag 858
  * Permission issue while building docker image for BentoService created under Windows OS 851


What's New?
  * Support Debian-slim docker images for containerizing model server, 822 by jackyzha0. Users can now choose between the default and the slim docker base image.
  * New `bentoml retrieve` command for downloading saved bundle from remote YataiService model registry, 810 by iancoffey
  bentoml retrieve ModelServe:20200610145522_D08399 --target_dir /tmp/modelserve
  * Added `--print-location` option to `bentoml get` command to print the saved path, 825 by jackyzha0
  $ bentoml get IrisClassifier:20200625114130_F3480B --print-location
  * Support Dataframe input JSON format orient parameter. DataframeInput now supports all pandas JSON orient options: records, columns, values, split, index. 809 815, by bojiang
  For example, with `orient="records"`:
  ```python
  @api(input=DataframeInput(orient="records"))
  def predict(self, df):
      ...
  ```
  The API endpoint will be expecting HTTP requests with a JSON payload in the following format:
  ```
  [{"col 1":"a","col 2":"b"},{"col 1":"c","col 2":"d"}]
  ```
  Or with `orient="index"`:
  ```
  '{"row 1":{"col 1":"a","col 2":"b"},"row 2":{"col 1":"c","col 2":"d"}}'
  ```
  See pandas's documentation on the orient option of to_json/from_json function for more detail:
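  The two payloads above are exactly what pandas produces for those orients; for example (pandas assumed installed):

  ```python
  import pandas as pd

  df = pd.DataFrame(
      [["a", "b"], ["c", "d"]],
      index=["row 1", "row 2"],
      columns=["col 1", "col 2"],
  )

  # records: one object per row, index dropped
  records_json = df.to_json(orient="records")
  # index: nested objects keyed by row label
  index_json = df.to_json(orient="index")
  ```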
  * Support Azure Functions deployment (beta). A new fully automated cloud deployment option that BentoML provides in addition to AWS SageMaker and AWS Lambda. See usage documentation here:
  * ModelServer API Swagger schema improvements including the ability to specify example HTTP request, 807 by Korusuke
  * Add prediction logging when deploying with AWS Lambda, 790 by jackyzha0
  * Artifact string name validation, 817 by AlexDut
  * Fixed micro batching parameter(max latency and max batch size) not applied, 818 by bojiang
  * Fixed issue with handling CSV file input by following RFC4180. 814 by bojiang
  * Fixed TfTensorOutput casting floats as ints (813), in 823 by bojiang
  * The BentoML team has created a new mailing list for future announcements and community-related discussions. Join now!
  * For those interested in contributing to BentoML, there is a new [contributing docs]( now, be sure to check it out.
  * We are starting a bi-weekly community meeting for community members to demo new features they are building, discuss the roadmap and gather feedback, etc. More details will be announced soon.


What's New?
  * Service API Input/Output adapter 783 784 789, by bojiang
  * A new API for defining service input and output data types and configs
  * The new `InputAdapter` is essentially the `API Handler` concept in BentoML prior to version 0.8.x release
  * The old `API Handler` syntax is being deprecated, it will continue to be supported until version 1.0
  * The main motivation for this change, is to enable us to build features such as new API output types(such as file/image as service output), add gRPC support, better OpenAPI support, and more performance optimizations in online serving down the line
  * Model server docker image build improvements 761
  * Reduced docker build time by using a pre-built BentoML model server docker image as the base image
  * Removed the dependency on `apt-get` and `conda` from the custom docker base image
  * Added alpine based docker image for model server deployment
  * Improved Image Input handling:
  * Add micro-batching support for ImageInput (former ImageHandler) 717, by bojiang
  * Add support for using a list of images as input from CLI prediction run 731, by bojiang
  * In the new Input Adapter API introduced in 0.8.0, the  `LegacyImageInput` is identical to the previous `ImageHandler`
  * The new `ImageInput` works only for single image input, unlike the old `ImageHandler`
  * For users using the old `ImageHandler`, we recommend migrating to the new `ImageInput` if it is only used to handle single image input
  * For users using `ImageHandler` for multiple image inputs, wait until `MultiImageInput` is added, which will be a separate input adapter type
  * Added CORS support for AWS Lambda serving 752, by omrihar
  * Added JsonArtifact for storing configuration and JsonSerializable data 746, by lemontheme
  Bug Fixes & Improvements:
  * Fixed Sagemaker deployment `ModuleNotFounderError` due to wrong gevent version 785 by flosincapite
  * Fixed SpacyModelArtifact not exposed in `bentoml.artifacts` 782, by docteurZ
  * Fixed errors when inheriting handler 767, by bojiang
  * Removed `future` statements for py2 support, 756, by jjmachan
  * Fixed bundled_pip_dependencies installation on AWS Lambda deployment 794
  * Removed `aws.region` config, use AWS CLI's own config instead 740
  * Fixed SageMaker deployment CLI: delete deployment with namespace specified 741
  * Removed `pandas` from BentoML dependencies list, it is only required when using DataframeInput 738
  Internal, CI, Testing:
  * Added docs watch script for Linux 781, by akainth015
  * Improved build bash scripts 774, by akainth015, flosincapite
  * Fixed YataiService end-to-end tests 773
  * Added PyTorch integration tests 762, by jjmachan
  * Added ONNX integration tests 726, by yubozhao
  * Added linter and formatting check to Travis CI
  * Codebase cleanup, reorganized deployment and repository module 768 769 771
  * The BentoML team is planning to start a bi-weekly community meeting to demo new features, discuss the roadmap and gather feedback. Join the BentoML slack channel for more details: [click to join BentoML slack](
  * There are a few issues with PyPI release `0.8.0` that made it not usable. The newer `0.8.1` release has those issues fixed. Please do not use version `0.8.0`.


What's New?
  * ONNX model support with onnxruntime backend. More example notebooks and tutorials are coming soon!
  * Added Python 3.8 support
  * BentoML API Server architecture overview
  * Deploying YataiService behind Nginx
  * [benchmark] moved benchmark notebooks to a separate repo:
  * [CI] Enabled Linting style check test on Travis CI, contributed by kautukkundan
  * [CI] Fixed all existing linting errors in bentoml and tests module, contributed by kautukkundan
  * [CI] Enabled Python 3.8 on Travis CI
  * There will be breaking changes in the coming 0.8.0 release, around ImageHandler, custom Handler and custom Artifacts. If you're using those features in production, please reach out.
  * Help us promote BentoML on [Twitter bentomlai]( and [Linkedin Page](!
  * Be sure to join the BentoML slack channel for roadmap discussions and development updates, [click to join BentoML slack](


What's New?
  * Support custom docker base image, contributed by withsmilo
  * Improved model saving & loading with YataiService backed by S3 storage, contributed by withsmilo; BentoML now works with custom S3-like services such as a MinIO deployment
  Improvements & Bug Fixes
  * Fixed a number of issues that are breaking Windows OS support, contributed by bojiang
  * [YataiService] Fixed an issue where the deployment namespace configured on the server side was ignored
  * [CI] Added Windows test environment in BentoML's CI test setup on Travis
  * Help us promote BentoML on [Twitter bentomlai]( and [Linkedin Page](!
  * Be sure to join the BentoML slack channel for roadmap discussions and development updates, [click to join BentoML slack](


What's New?
  * Added Spacy Support, contributed by spotter (641)
  * Support custom s3_endpoint_url in BentoML’s model registry component(YataiService) (656)
  * YataiService client can now connect via secure gRPC (650)
  Improvements & Bug Fixes
  * Micro-batching server performance optimization & troubleshoot back pressure (630)
  * [YataiService] Included the required PostgreSQL dependency in the YataiService docker image by default
  * [Documentation] New fastest example project
  * [Bug Fix] Fixed overwriting pip_dependencies specified through env (657 642)
  * [Benchmark] released newly updated benchmark notebook with latest changes in micro batching server
  * [Benchmark] notebook updates and count dropped requests (645)
  * [e2e test] Added e2e test using dockerized YataiService gRPC server


What's new:
  * Added FastAI2 support, contributed by HenryDashwood
  Bug fixes:
  * Fixed S3 bucket creation in the us-east-1 region
  * Fixed an issue with fastcore and ruamel-yaml
  Documentation updates:
  * Added Kubeflow deployment guide
  * Added Kubernetes deployment guide
  * Added Knative deployment guide


* Added support for [Fasttext]( models, contributed by GCHQResearcher83493
  * Fixed Windows compatibility when packaging models, contributed by codeslord
  * Added benchmark using Tensorflow-based Bert model
  * Fixed an issue with pip installing a BentoService saved bundle with the new release of pip `pip==20.1`
  * AWS ECS deployment guide
  * Heroku deployment guide:
  * Knative deployment guide:


  * Added `--timeout` option to SageMaker deployment creation command
  * Fixed an issue with the new grpcio PyPI release when deploying to AWS Lambda
  * Revamped the Core Concept walk-through documentation
  * Added notes on using micro-batching and deploying YataiService


Introducing 2 Major New Features
  * Adaptive micro-batching mode in API server
  * Web UI for model and deployment management
  Adaptive Micro Batching
  Adaptive micro-batching is a technique used in advanced serving systems, where incoming prediction requests are grouped into small batches for inference. With version 0.7.2, we've implemented micro-batching mode for the API server, and all existing BentoServices can benefit from it simply by enabling it via the `--enable-microbatch` flag or the `BENTOML_ENABLE_MICROBATCH` environment variable when running the API server docker image:
  $ bentoml serve-gunicorn IrisClassifier:latest --enable-microbatch
  $ docker run -p 5000:5000 -e BENTOML_ENABLE_MICROBATCH=True iris-classifier:latest
  Currently, the micro-batch mode is only effective for DataframeHandler, JsonHandler, and TensorflowTensorHandler. We are working on support for ImageHandler, along with a few new handler types coming in the next release.
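  The core grouping idea can be sketched in plain Python. The real adaptive implementation also tunes batch size and wait time based on observed latency; this simplified stdlib-only sketch shows only how queued requests get grouped so the model runs one inference per batch instead of one per request:

```python
from collections import deque

def micro_batch(requests, max_batch_size):
    """Group pending requests into batches of at most max_batch_size,
    so the model runs one inference per batch instead of per request."""
    queue = deque(requests)
    batches = []
    while queue:
        size = min(max_batch_size, len(queue))
        batches.append([queue.popleft() for _ in range(size)])
    return batches

print(micro_batch(list(range(7)), max_batch_size=3))  # [[0, 1, 2], [3, 4, 5], [6]]
```

  Batching pays off because most model runtimes (NumPy, TensorFlow, etc.) vectorize over the batch dimension, so one call on 3 rows is far cheaper than 3 separate calls.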
  Model Management Web UI
  BentoML has a standalone component YataiService that handles model storage and deployment via gRPC calls. By default, BentoML launches a local YataiService instance when being imported. This local YataiService instance saves BentoService files to `~/bentoml/repository/` directory and other metadata to `~/bentoml/storage.db`.
  In release 0.7.x, we introduced a new CLI command for running YataiService as a standalone service that can be shared by multiple bentoml clients. This makes it easy to share, use and discover models and serving deployments created by others in your team.
  To play with the YataiService gRPC & Web server, run the following command:
  $ bentoml yatai-service-start
  $ docker run -v ~/bentoml:/bentoml -p 3000:3000 -p 50051:50051 bentoml/yatai-service:0.7.2 --db-url=sqlite:///bentoml/storage.db --repo-base-url=/bentoml/repository
  For team settings, we recommend using a remote database instance and cloud storage such as s3 for storage. E.g.:
  $ docker run -p 3000:3000 -p 50051:50051 \
  bentoml/yatai-service:0.7.2 \
  --db-url postgresql://scott:tiger@localhost:5432/bentomldb \
  --repo-base-url s3://my-bentoml-repo/
  (Screenshots: yatai-service web UI repository page and repository detail page)
  Documentation Updates
  * Added a new section working through all the main concepts and best practices using BentoML, we recommend it as a must-read for new BentoML users
  * BentoML Core Concepts:
  Version 0.7.0 and 0.7.1 are not recommended due to an issue with including the Benchmark directory in its PyPI distribution. But other than that, they are identical to version 0.7.2.


New Features:
  * Automatically discover all pip dependencies via `env(auto_pip_dependencies=True)`
  * CLI command auto-completion support
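  Automatic pip dependency discovery presumably works by inspecting the imports used by your service's source code. A rough stdlib-only sketch of that idea (this is an illustration, not BentoML's actual implementation):

```python
import ast

def find_imports(source):
    """Collect top-level package names imported by a piece of Python source,
    the kind of scan behind env(auto_pip_dependencies=True)."""
    tree = ast.parse(source)
    pkgs = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            pkgs.update(alias.name.split('.')[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            pkgs.add(node.module.split('.')[0])
    return sorted(pkgs)

print(find_imports("import pandas as pd\nfrom sklearn.svm import SVC"))
# ['pandas', 'sklearn']
```

  The remaining work in a real implementation is mapping module names to PyPI distribution names and pinning installed versions, which this sketch omits.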
  Beta Features:
  Contact us via [Slack]( for early access and documentation related to these features.
  * Adaptive micro-batching in BentoML API server, including performance tracing and benchmark
  * Standalone YataiService gRPC server for model management and deployment
  Improvements & Bug fixes
  * Improved end-to-end tests, covering entire BentoML workflow
  * Fixed issues with using YataiService with PostgreSQL databases as storage
  * `bentoml delete` command now supports deleting multiple BentoService at once, see `bentoml delete --help`


  * [ISSUE-505] Make "application/json" the default Content-Type in DataframeHandler 507
  * CLI improvement - Add bento service as column for deployment list 514
  * SageMaker deployment - error reading Azure user role info 510 by HenryDashwood
  * BentoML CLI improvements 520, 519
  * Add handler configs to BentoServiceMetadata proto and bentoml.yml file  517
  * Add support for list by labels 521
  Bug fixes:
  * [ISSUE-512] Fix appending saved path to sys.path when loading BentoService 518
  * Lambda deployment - ensure requirement dir in PYTHONPATH 508
  * SageMaker deployment delete - fix error when endpoint already deleted 522
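  The list-by-labels support above (521) pairs with label queries like `cicd=failed, framework In (sklearn, xgboost)` shown elsewhere in these notes. A conceptual matcher for that selector style is sketched below; the grammar is inferred from the examples and is not BentoML's actual parser:

```python
import re

# one clause is either `key=value` or `key In (v1, v2, ...)`
CLAUSE = re.compile(r'(\w+)\s*(?:=\s*(\w+)|In\s*\(([^)]*)\))')

def matches(labels, selector):
    """Check a dict of labels against a simple selector string."""
    for key, eq_value, in_set in CLAUSE.findall(selector):
        if eq_value:
            if labels.get(key) != eq_value:
                return False
        else:
            allowed = {v.strip() for v in in_set.split(',')}
            if labels.get(key) not in allowed:
                return False
    return True

print(matches({'cicd': 'failed', 'framework': 'sklearn'},
              'cicd=failed, framework In (sklearn, xgboost)'))  # True
```

  In practice such filtering would run server-side as a database query rather than in-memory, but the clause semantics are the same.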


* Bugfix: the `bentoml serve-gunicorn` command was broken in 0.6.0, which also broke the API Server docker container. This is a minor release including a fix for this issue.


The biggest change in release 0.6.0 is revamped BentoML CLI, introducing new model/deployment management commands and new syntax for CLI inferencing.
  1. New commands for managing your model repository:
  > bentoml list
  BENTO_SERVICE                                   CREATED_AT        APIS                                                   ARTIFACTS
  IrisClassifier:20200123004254_CB6865            2020-01-23 08:43  predict::DataframeHandler                              model::SklearnModelArtifact
  IrisClassifier:20200122010013_E0292E            2020-01-22 09:00  predict::DataframeHandler                              clf::PickleArtifact
  > bentoml get IrisClassifier
  > bentoml get IrisClassifier:20200123004254_CB6865
  > bentoml get IrisClassifier:latest
  2. Added support for referencing saved BentoServices by `name:version` tag instead of {saved_path}. Here are some example commands:
  > bentoml serve {saved_path}
  > bentoml serve IrisClassifier:latest
  > bentoml serve IrisClassifier:20200123004254_CB6865
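  Resolving a `name:version` tag could conceptually look like the hypothetical helper below (illustrative only, not BentoML's actual resolver); a bare name falls back to the newest version:

```python
def parse_bento_tag(tag):
    """Split a 'name:version' tag; a bare name resolves to 'latest'."""
    name, _, version = tag.partition(':')
    return name, version or 'latest'

print(parse_bento_tag('IrisClassifier:20200123004254_CB6865'))
# ('IrisClassifier', '20200123004254_CB6865')
print(parse_bento_tag('IrisClassifier'))
# ('IrisClassifier', 'latest')
```

  The CLI would then look the (name, version) pair up in the local repository instead of requiring the full saved path.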


* Fixed an issue with the API server docker image build, where updating conda to a newly released version caused the build to fail
  * Documentation updates
  * Removed the option to configure API endpoint output format by setting the HTTP header


* SageMaker model serving deployment improvements:
  * Added num_of_gunicorn_workers_per_instance deployment option
  * Gunicorn worker count can now be set automatically based on the number of host CPUs
  * Improved testing for SageMaker model serving deployment
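  One common way to derive a worker count from host CPUs is gunicorn's documented `2 * cores + 1` heuristic. The sketch below assumes that formula, which may differ from BentoML's actual default:

```python
import os

def gunicorn_worker_count(configured=None):
    """Use the explicitly configured worker count if given; otherwise derive
    one from host CPUs with gunicorn's common (2 * cores + 1) heuristic."""
    if configured:
        return configured
    return 2 * (os.cpu_count() or 1) + 1

print(gunicorn_worker_count(4))  # 4
```

  The explicit `num_of_gunicorn_workers_per_instance` option corresponds to the `configured` path; the automatic path kicks in when no value is set.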


Minor bug fixes:
  * AWS Lambda deployment - fix default namespace validation error (452)
  * Tensorflow SavedModel artifact - use concrete_function instead of input auto-reshape (451)


* Minor bug fixes for AWS Lambda deployment creation and error handling


* Prometheus metrics improvements in API server
  * metrics now work in multi-process mode when running with gunicorn
  * added 3 default metrics to BentoAPIServer:
  * request latency
  * request count total labeled by status code
  * request gauge for monitoring concurrent prediction requests
  * New Tensorflow TensorHandler!
  * Receive tf.Tensor data structure within your API function for your Tensorflow and Keras model
  * TfSavedModel can now automatically transform input tensor shapes based on the tf.function input signature
  * Largely improved error handling in API Server
  * Proper HTTP error code and message
  * Prevent user error details from being leaked to the client side
  * Introduced Exception classes for users to use when customizing BentoML handlers
  * Deployment guide on Google Cloud Run
  * Deployment guide on AWS Fargate ECS
  * AWS Lambda deployment improvements
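  The error-handling scheme described above (proper HTTP codes, no leaked internals, exception classes for handler authors) can be sketched as follows; the class names here are illustrative, not BentoML's actual exception hierarchy:

```python
class ServiceException(Exception):
    """Base exception carrying an HTTP status code (hypothetical name)."""
    status_code = 500

class BadInput(ServiceException):
    status_code = 400

def handle(fn, payload):
    """Run an API function, mapping known exceptions to HTTP responses
    without leaking internal error details to the client."""
    try:
        return 200, fn(payload)
    except ServiceException as e:
        return e.status_code, str(e)

def predict(payload):
    if not payload:
        raise BadInput("empty request body")
    return {"ok": True}

print(handle(predict, ""))   # (400, 'empty request body')
print(handle(predict, "x"))  # (200, {'ok': True})
```

  Unexpected exceptions (not subclasses of the service exception) would still surface as a generic 500, keeping stack traces out of client responses.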


* New LightGBM support, contributed by 7lagrange
  * LightGBM example notebook:
  * Minor AWS Lambda deployment improvements
  * Improved error message when docker or sam-cli not available
  * Pinned aws-sam-cli version to 0.33.1


* New improved AWS Lambda support!
  * Support uploading large model files to s3 when deploying to AWS Lambda
  * Support trimming down the size of bundled python dependencies
  * Support setting memory size up to 3008MB for Lambda function
  * Support updating Lambda deployment to a newer version of saved BentoService bundle
  * Fixed an issue where installing a BentoService saved bundle as a PyPI package failed to parse requirements.txt into the install_requires field.


* Improved deployment support
  * Works seamlessly with the Clipper v0.4.1 release; updated deployment guide
  * New S3 based repository
  * BentoML users can now save to and load from BentoService bundles on S3 storage, and deploy those bundles directly
  * Deployment python APIs are now available in Beta
  * `from bentoml.yatai.python_api import create_deployment`


> bentoml get IrisClassifier:latest
  3. Separated deployment commands into sub-commands
  AWS Lambda model serving deployment:
  AWS Sagemaker model serving deployment:
  4. Breaking Change: Improved `bentoml run` command for inferencing from CLI
  Changing from:
  > bentoml {API_NAME} {saved_path} {run_args}
  > bentoml predict {saved_path} --input=my_test_data.csv
  to:
  > bentoml run {BENTO/saved_path} {API_NAME} {run_args}
  > bentoml run IrisClassifier:latest predict --input='[[1,2,3,4]]'
  Previously, users could directly use the API name as the command to load and run a model API from the CLI, e.g. `bentoml predict {saved_path} --input=my_test_data.csv`. The problem is that API names are dynamically loaded, which makes it hard for the bentoml command to provide useful `--help` docs. And the `default command` workaround with Click makes it very confusing when the user types a wrong command. So we decided to make this change.
  5. Breaking Change: `--quiet` and `--verbose` options position
  Previously, both `--quiet` and `--verbose` options had to follow immediately after the `bentoml` command; now they are added to the options list of all subcommands.
  If you are using these two options, you will need to change your CLI from:
  > bentoml --verbose serve ...
  to:
  > bentoml serve ... --verbose