ModelRuntime
Similar to KServe, Kai adds support for inference workloads via a ModelRuntime resource. This resource defines the container runtime for a particular model format. You can then leverage it within your Pipeline resources by specifying a model format and a URI to the model; Kai handles pulling and mounting the model and exposing the proper endpoint.
Below is an example of a ModelRuntime resource for the PyTorch model format using TorchServe.
apiVersion: core.kai.io/v1alpha1
kind: ModelRuntime
metadata:
  name: pytorch-runtime
spec:
  supportedModelFormats:
    - pytorch
  containers:
    - name: kai-container
      image: "pytorch/torchserve-kfs:0.7.0"
      args: ["torchserve", "--start", "--model-store=/mnt/models/model-store", "--ts-config=/mnt/models/config/config.properties"]
      ports:
        - containerPort: 8085
Once this resource is applied to the cluster, Kai will use this runtime definition for any model that specifies the pytorch modelFormat.
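As a minimal sketch, assuming the manifest above is saved to a file named pytorch-runtime.yaml (the filename is illustrative, not prescribed by Kai), applying it looks like any other Kubernetes resource:

# Apply the ModelRuntime manifest shown above (hypothetical filename).
kubectl apply -f pytorch-runtime.yaml

# Confirm the resource was created. The plural resource name "modelruntimes"
# is an assumption based on Kubernetes naming conventions for CRDs.
kubectl get modelruntimes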
Below is an example of adding a model as part of an inference step within a pipeline.
apiVersion: core.kai.io/v1alpha1
kind: Pipeline
metadata:
  name: image-classifier
spec:
  steps:
    - spec:
        model:
          modelFormat: pytorch
          uri: gs://kfserving-examples/models/torchserve/image_classifier/v1
Kai will handle pulling and mounting the model into the container, as well as exposing the service within the pipeline as specified by the ModelRuntime.
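As a rough sketch of what calling the exposed endpoint could look like: the pytorch/torchserve-kfs image speaks the KServe v1 inference protocol, so prediction requests go to a path of the form /v1/models/<model-name>:predict on the container port defined in the ModelRuntime (8085 above). The service hostname and model name below are illustrative assumptions, not names Kai is guaranteed to generate:

# Hypothetical service and model names; substitute whatever Kai exposes in your cluster.
# The request body follows the KServe v1 protocol used by torchserve-kfs.
curl -X POST http://image-classifier:8085/v1/models/image-classifier:predict \
  -H "Content-Type: application/json" \
  -d '{"instances": [{"data": "<base64-encoded image>"}]}'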