Get started with Charmed MLflow and Kubeflow

Component

Version

MLflow

2.15

Kubeflow

1.9

This tutorial gets you started with Charmed MLflow integrated with Charmed Kubeflow (CKF).

Prerequisites

This guide assumes you are deploying Kubeflow and MLflow on a public cloud Virtual Machine (VM) with the following specifications:

  • Runs Ubuntu 20.04 (focal) or later.

  • Has at least 4 cores, 32GB RAM and 200GB of disk space available.

Also, your machine should meet the following requirements:

  • Has an SSH tunnel open to the VM with port forwarding and a SOCKS proxy. To see how to set this up, see How to setup SSH VM Access.

  • Runs Ubuntu 20.04 (focal) or later.

  • Has a web browser installed e.g. Chrome / Firefox / Edge.

In the remainder of this tutorial, unless otherwise stated, it is assumed you will be running all command line operations on the VM, through the open SSH tunnel. It’s also assumed you’ll be using the web browser on your local machine to access the Kubeflow and MLflow dashboards.

Deploy MLflow

Follow the steps in this tutorial to deploy MLflow on your VM: Get started with Charmed MLflow. Before moving on with this tutorial, confirm that you can now access the MLflow UI on http://localhost:31380.

Deploy Kubeflow bundle

Let’s deploy Charmed Kubeflow alongside MLflow. Run the following command to initiate the deployment:

juju deploy kubeflow --trust  --channel=1.9/stable

Set credentials for your Kubeflow deployment:

juju config dex-auth static-username=admin
juju config dex-auth static-password=admin

Deploy Resource Dispatcher

Next, deploy the resource dispatcher. The resource dispatcher is an optional component which distributes Kubernetes objects related to MLflow credentials to all user namespaces in Kubeflow. This means that all your Kubeflow users can access the MLflow model registry from their namespaces. To deploy the dispatcher, run the following command:

juju deploy resource-dispatcher --channel 2.0/stable --trust

This deploys the latest stable version of the dispatcher. See Resource Dispatcher on GitHub for more details. Now, relate the dispatcher to MLflow as follows:

juju integrate mlflow-server:secrets resource-dispatcher:secrets
juju integrate mlflow-server:pod-defaults resource-dispatcher:pod-defaults

To deploy sorted MLflow models using KServe, create the required relations as follows:

juju integrate mlflow-minio:object-storage kserve-controller:object-storage
juju integrate kserve-controller:service-accounts resource-dispatcher:service-accounts
juju integrate kserve-controller:secrets resource-dispatcher:secrets

Monitor The Deployment

Now, at this point, we’ve deployed MLflow and Kubeflow and we’ve related them via the resource dispatcher. But that doesn’t mean our system is ready yet: Juju will need to download charm data from CharmHub and the charms themselves will take some time to initialise.

So, how do you know when all the charms are ready, then? You can do this using the juju status command. First, let’s run a basic status command and review the output. Run the following command to print out the status of all the components of Juju:

juju status

Review the output for yourself. You should see some summary information, a list of Apps and associated information, and another list of Units and their associated information. Don’t worry too much about what this all means for now. If you’re interested in learning more about this command and its output, see the Juju Status command.

The main thing we’re interested in at this stage is the statuses of all the applications and units running through Juju. We want all the statuses to eventually become active, indicating that the bundle is ready. Run the following command to keep a watch on the components which are not active yet:

juju status --watch 5s

This will periodically run a juju status command.

Don’t be surprised if some of the components’ statuses change to blocked or error every now and then. This is expected behaviour, and these statuses should resolve by themselves as the bundle configures itself. However, if components remain stuck in the same error states, consult the troubleshooting steps below.

Note

It can take up to 15 minutes for all charms to be downloaded and initialised.

Integrate MLflow with Kubeflow Dashboard

You can integrate your charmed MLflow deployment with the Kubeflow dashboard by running following commands:

juju integrate mlflow-server:ingress istio-pilot:ingress
juju integrate mlflow-server:dashboard-links kubeflow-dashboard:links

Now you should see the MLflow tab in the left sidebar of your Kubeflow dashboard at:

http://10.64.140.43.nip.io/

Note

The address of your Kubeflow dashboard may differ depending on your setup. You can always check its URL by running:

microk8s kubectl -n kubeflow get svc istio-ingressgateway-workload -o jsonpath='{.status.loadBalancer.ingress[0].ip}'

Integrate MLflow with Notebook

In this section, you are going to create a Kubeflow notebook server and connect it to MLflow.

First, to be able to use MLflow credentials in your Kubeflow notebook, go to the MLflow dashboard at http://10.64.140.43.nip.io/ and use the username and password you configured in the previous Deploy Kubeflow bundle section. For example, admin and admin.

Click on Start setup to setup the Kubeflow user for the first time.

Select Finish to finish the process.

Now a Kubernetes namespace is created for your user.

Now go back to the dashboard. From the left panel, choose Ǹotebooks. Select +New Notebook.

At this point, name the notebook as you prefer, and choose the desired image and resource limits. For example, you can use the following details:

  1. Name: test-notebook.

  2. Expand the Custom Notebook section and for image, select kubeflownotebookswg/jupyter-tensorflow-full:v1.9.0.

Now, to allow your notebook server access to MLflow, you need to enable some configuration options. Scroll down to Data Volumes -> Advanced options and from the Configurations dropdown, choose the following options:

  1. Allow access to Kubeflow pipelines.

  2. Allow access to MinIO.

  3. Allow access to MLflow.

Note

Remember we related Kubeflow to MLflow earlier using the resource dispatcher? This is why we’re seeing the MinIO and MLflow options in the dropdown!

Great, that’s all the configuration for the notebook server done. Hit the Launch button to launch the notebook server. Be patient, the notebook server will take a little while to initialise.

When the notebook server is ready, you’ll see it listed in the Notebooks table with a success status. At this point, select Connect to connect to the notebook server.

When you connect to the notebook server, you’ll be taken to the notebook environment in a new tab. Because of our earlier configurations, this environment is now connected to MLflow in the background. This means the notebooks we create here can access MLflow. Cool!

To test this, create a new notebook and paste the following command into it, in a cell:

!printenv | grep MLFLOW

Run the cell. This will print out two environment variables MLFLOW_S3_ENDPOINT_URL and MLFLOW_TRACKING_URI, confirming MLflow is indeed connected.

Great, we’ve launched a notebook server that’s connected to MLflow! Now let’s upload some example notebooks to this server to see MLflow in practice.

Run MLflow examples

To run MLflow examples on your newly created notebook server, click on the source control icon in the leftmost navigation bar.

From the menu, choose the Clone a Repository option.

Now insert this repository address https://github.com/canonical/charmed-kubeflow-uats.git.

This clones a whole charmed-kubeflow-uats repository onto the notebook server. The cloned repository is a folder on the server, with the same name as the remote repository. Go inside the folder and after that, choose the tests/notebooks sub-folder.

There you find following folders:

  • mlflow-kserve: demonstrates how to talk to MLflow and KServe from inside a notebook. This example trains a simple ML model, stores it in MLflow, deploys it with KServe from MLflow and runs inference.

  • mlflow-minio: demonstrates how to talk to MinIO from inside a notebook. This example shows how you can use mounted MinIO secrets to talk to MinIO object store.

  • mlflow: demonstrates how to talk to MLflow from inside a notebook. The example uses a simple regression model which is stored in the MLflow registry.

Go ahead, try those notebooks out for yourself! You can run them cell by cell using the run button, or all at once using the double chevron >>.