It’s been a while since we last checked in on Kubeflow, the open source option for making ML stacks easier. Kubeflow 0.3 boasts a number of technical improvements, including easier deployment and customization of components and better multi-framework support.
Kubeflow 0.3 features
Declarative and extensible deployment
Kubeflow 0.3 introduces kfctl.sh, a new command-line deployment script that makes the deployment experience even simpler. In short, Kubeflow can now be configured and deployed like this:
${KUBEFLOW_REPO}/scripts/kfctl.sh init ${KFAPP} --platform gcp --project ${PROJECT}
cd ${KFAPP}
${KUBEFLOW_REPO}/scripts/kfctl.sh generate platform
${KUBEFLOW_REPO}/scripts/kfctl.sh apply platform
${KUBEFLOW_REPO}/scripts/kfctl.sh generate k8s
${KUBEFLOW_REPO}/scripts/kfctl.sh apply k8s
There’s more to this tool though; it also allows Kubernetes resources and non-K8s resources to be created and deployed consistently. The team has separated the deployment process into two steps:
- creating declarative configs describing your deployment and
- applying these configs.
This makes it easy to customize your deployment and check it into source control for repeatable deployments, as explained in the blog post announcing the new release.
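The generate-then-apply split can be sketched as a small wrapper script. This is a minimal illustration, not part of Kubeflow: the names DRY_RUN, run, and deploy_stage are hypothetical, the /path/to/kubeflow default is a placeholder, and the script assumes it is run from inside an app directory that is already a git repository. Only the kfctl.sh subcommands themselves come from the commands quoted above.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the kfctl.sh subcommands quoted in the article.
# DRY_RUN, run, and deploy_stage are hypothetical names, not part of Kubeflow.
set -euo pipefail

# Assumed location of a Kubeflow 0.3 checkout; override via the environment.
KFCTL="${KUBEFLOW_REPO:-/path/to/kubeflow}/scripts/kfctl.sh"
# Default to printing commands so the script is safe to try without a cluster.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Deploy one stage ("platform" or "k8s") in two steps:
# first generate the declarative configs, then apply them.
deploy_stage() {
  local stage="$1"
  run "$KFCTL" generate "$stage"
  # Record the generated configs so the deployment can be repeated later.
  # Assumes the current directory is a git repository.
  run git add -A
  run git commit -m "Generated ${stage} configs"
  run "$KFCTL" apply "$stage"
}

# Same stage order as the commands in the article.
deploy_stage platform
deploy_stage k8s
```

With the default DRY_RUN=1 the script only prints the commands it would run; setting DRY_RUN=0 executes them. Committing between the two steps is one possible way to get the repeatable, source-controlled deployments described above, since the generated configs can be edited and reviewed before anything is applied.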
More options for local development
- Minikube deployment is now handled by a shell script that you run with a single command.
- You can also use MicroK8s to easily run Kubeflow on your laptop.
Inference capabilities: Improvements
First of all, you can now run batch inference with GPUs (though not distributed) for TensorFlow using Apache Beam. Why Apache Beam? Because it makes it easier to write batch and streaming data processing jobs that run on a variety of execution engines.
The team has been working to make it easy to run TFServing in production, and that work has paid off: a liveness probe has been added, and fluentd is used to log requests and responses to enable model retraining.
They are also offering more options for online prediction using both CPUs and GPUs by taking advantage of the NVIDIA TensorRT Inference Server. The TensorRT Inference Server is a containerized, production-ready AI inference server which maximizes utilization of GPU servers by running multiple models concurrently on the GPU and supports all the top AI frameworks.
Hyperparameter tuning
Producing high-quality models can be a challenge, and the main culprit is the difficulty of finding optimal hyperparameters. The team is addressing this issue with StudyJob, a new K8s custom controller that allows a hyperparameter search to be defined using YAML.
Have a look at this manifest if you want to see how easy it is to tune your models using the hyperband algorithm.
For the entire list of updates, visit the 0.3 changelog.
Why switch to Kubeflow?
Kubeflow is intended to make ML easier for Kubernetes users. How? By letting the system take care of the details (within reason) and supporting the kind of tooling ML practitioners want and need.
Kubernetes already has:
- Easy, repeatable, portable deployments on a diverse infrastructure (laptop <-> ML rig <-> training cluster <-> production cluster)
- Deployment and management of loosely coupled microservices
- Scaling based on demand
Kubeflow aims to give users simple manifests so they can have an easy-to-use ML stack anywhere Kubernetes is already running. Plus, it should self-configure based on the cluster it deploys into.
The post Kubeflow 0.3 brings better multi-framework support appeared first on JAXenter.