
ModelRuntime

Similar to KServe, Kai adds support for inference workloads via a ModelRuntime resource. This resource defines the container runtime for a particular model format. You can then leverage it within your Pipeline resources by simply specifying a model format and a URI to the model; Kai handles pulling and mounting the model and exposing the proper endpoint.

Below is an example of a ModelRuntime resource for the PyTorch model format using TorchServe.

apiVersion: core.kai.io/v1alpha1
kind: ModelRuntime
metadata:
  name: pytorch-runtime
spec:
  supportedModelFormats:
  - pytorch
  containers:
  - name: kai-container
    image: "pytorch/torchserve-kfs:0.7.0"
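    # The model volume is mounted at /mnt/models; Kai handles pulling the model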
    args: ["torchserve", "--start", "--model-store=/mnt/models/model-store", "--ts-config=/mnt/models/config/config.properties"]
    ports:
    - containerPort: 8085
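
The args above point TorchServe at a config file expected inside the mounted model volume. As a minimal sketch, a matching config.properties might look like the following (the values are illustrative and aligned with the containerPort above; the real file ships alongside the model artifacts):

# Illustrative TorchServe settings; values must agree with the runtime definition
inference_address=http://0.0.0.0:8085
management_address=http://0.0.0.0:8085
model_store=/mnt/models/model-store
load_models=all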

By applying this resource to the cluster, Kai will leverage this runtime definition for any models that specify the pytorch modelFormat.
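
For example, assuming the manifest above is saved as pytorch-runtime.yaml (the filename is hypothetical), it can be applied and verified with kubectl:

# Apply the runtime definition to the cluster
kubectl apply -f pytorch-runtime.yaml

# List registered runtimes (assumes the CRD's plural name is "modelruntimes")
kubectl get modelruntimes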

Below is an example of adding a model as part of an inference step within a pipeline.

apiVersion: core.kai.io/v1alpha1
kind: Pipeline
metadata:
  name: image-classifier
spec:
  steps:
  - spec:
      model:
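        # Matches the supportedModelFormats entry of the ModelRuntime above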
        modelFormat: pytorch
        uri: gs://kfserving-examples/models/torchserve/image_classifier/v1

Kai will handle pulling and mounting the model into the container, as well as exposing the service to be leveraged within the pipeline, as specified by the ModelRuntime.
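
As a rough sketch, since the torchserve-kfs image implements KServe's v1 inference protocol, a request against the step's exposed service might look like the following. The service hostname and model name here are hypothetical; the actual values depend on how Kai names and exposes the service:

# Hypothetical in-cluster hostname and model name; adjust for your deployment
curl -X POST \
  http://image-classifier.default.svc.cluster.local:8085/v1/models/mnist:predict \
  -H "Content-Type: application/json" \
  -d @input.json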