Get started with Charmed MLflow and Kubeflow

Component

Version

MLflow

2.22

Kubeflow

1.10

This tutorial gets you started with Charmed MLflow integrated with Charmed Kubeflow (CKF).

Requirements

This guide assumes you are deploying Kubeflow and MLflow on a public cloud Virtual Machine (VM) with the following specifications:

  • Runs Ubuntu 22.04 or later.

  • Has at least 4 cores, 32GB RAM and 200GB of disk space available.

Your machine should also have an SSH tunnel open to the VM with port forwarding and a SOCKS proxy. See How to setup SSH VM Access for more details.

Note

This tutorial assumes you are running all commands on the VM, through the open SSH tunnel. Also that you are using the web browser on your local machine to access the Kubeflow and MLflow dashboards.

Deploy MLflow

Follow the steps in this tutorial to deploy MLflow on your VM: Get started with Charmed MLflow. Before moving on with this tutorial, confirm that you have access to the MLflow User Interface (UI) on http://localhost:31380.

Deploy Kubeflow

To deploy Kubeflow along MLflow, run the following:

juju deploy kubeflow --trust  --channel=1.10/stable

Once the deployment is completed, you will see this message:

Deploy of bundle completed.

Note

The bundle components need some time to initialise and establish communication with each other. This process may take up to 20 minutes.

Check the status of the components with:

juju status

Use the watch option to continuously track their status:

juju status --watch 5s

CKF is ready when all the applications and units are in active status. During the configuration process, some of the components may momentarily change to a blocked or error state. This is an expected behaviour that should resolve as the bundle configures itself.

Set credentials for your Kubeflow deployment:

juju config dex-auth static-username=admin
juju config dex-auth static-password=admin

Deploy Resource dispatcher

The Resource dispatcher operator is an optional component which distributes Kubernetes objects related to MLflow credentials to all user namespaces in Kubeflow. This enables all Kubeflow users to access the MLflow model registry from their namespaces. Deploy it as follows:

juju deploy resource-dispatcher --channel 2.0/stable --trust

See Resource Dispatcher for more details.

Then, relate the Resource dispatcher to MLflow as follows:

juju integrate mlflow-server:secrets resource-dispatcher:secrets
juju integrate mlflow-server:pod-defaults resource-dispatcher:pod-defaults

To deploy sorted MLflow models using KServe, create the required relations as follows:

juju integrate mlflow-minio:object-storage kserve-controller:object-storage
juju integrate kserve-controller:service-accounts resource-dispatcher:service-accounts
juju integrate kserve-controller:secrets resource-dispatcher:secrets

Integrate MLflow with Kubeflow dashboard

You can integrate the MLflow server with the Kubeflow dashboard by running:

juju integrate mlflow-server:ingress istio-pilot:ingress
juju integrate mlflow-server:dashboard-links kubeflow-dashboard:links

Now you should see the MLflow tab in the left-hand sidebar of your Kubeflow dashboard at:

http://10.64.140.43.nip.io/

Note

The address of your Kubeflow dashboard may differ depending on your setup. You can always check its URL by running:

microk8s kubectl -n kubeflow get svc istio-ingressgateway-workload -o jsonpath='{.status.loadBalancer.ingress[0].ip}'

Integrate MLflow with Notebooks

In this section, you are going to create a Kubeflow Notebook server and connect it to MLflow.

1. Start by navigating to the MLflow dashboard at http://10.64.140.43.nip.io/. Use the username and password you configured in the Deploy Kubeflow section.

  1. Click on Start setup to setup the Kubeflow user for the first time and Select Finish to finish the process.

3. Now go back to the dashboard. From the left panel, choose Notebooks. Select +New Notebook.

At this point, name the Notebook and choose the desired image and resource limits. For example, you can use the following details:

  • Name: test-notebook.

  • Expand the Custom Notebook section and select the jupyter-tensorflow-full image.

Now, enable your Notebook server to access MLflow. Scroll down to Data Volumes -> Advanced options and from the Configurations dropdown, choose the following options:

  • Allow access to Kubeflow pipelines.

  • Allow access to MinIO.

  • Allow access to MLflow.

Click on Launch to launch the Notebook server.

Note

The notebook server may take a few minutes to initialise.

Once the Notebook server is ready, you’ll see it listed in the Notebooks table with a success status. At this point, select Connect to connect to it.

To ensure that MLflow is accessible, create a new notebook and add a cell with the following command:

!printenv | grep MLFLOW

Run the cell. This will print out MLFLOW_S3_ENDPOINT_URL and MLFLOW_TRACKING_URI variables, confirming MLflow is connected.

Run MLflow examples

To run MLflow examples on your newly created Notebook server, click on the source control icon in the leftmost navigation bar.

From the menu, choose Clone a Repository, and clone the following repository: https://github.com/canonical/charmed-kubeflow-uats.git.

This clones the charmed-kubeflow-uats repository onto the Notebook server. Enter the directory and navigate to the tests/notebooks sub-folder.

You will see the following folders:

  • mlflow-kserve: Demonstrates how to interact with MLflow and KServe from inside a notebook. This example trains a simple ML model, stores it in MLflow, deploys it with KServe from MLflow, and runs an inference service.

  • mlflow-minio: Demonstrates how to interact with MinIO from inside a notebook. This example shows how to use mounted MinIO secrets to access the MinIO object store.

  • mlflow: Demonstrates how to interact with MLflow from inside a notebook. This example uses a simple regression model that is stored in the MLflow registry.