"MLOps on Kubernetes"

Kubeflow (https://www.kubeflow.org/) is an open source Kubernetes-native platform for developing, orchestrating, deploying, and running scalable and portable machine learning workloads. It is a cloud native platform based on Google’s internal machine learning (ML) pipelines. The project is dedicated to making deployments of ML workflows on Kubernetes simple, portable, and scalable.

In this book we look at the evolution of machine learning in the enterprise, how infrastructure has changed, and how Kubeflow meets the needs of the modern enterprise.

“This book is the go-to resource for enterprise deployment of Kubeflow from on-premise to the cloud. It will take you through how to think about Kubeflow on an operational level and then through the ways a team needs to think about integrating with their infrastructure for resources such as GPUs and Identity management.”

Jeremy Lewi, Cofounder of Kubeflow,
Principal Software Engineer, Primer

“Patterson, Katzenellenbogen, and Harris have pulled together a terrific book that describes not just the components of setting up a production-ready Kubeflow deployment, but the tactical steps necessary to do so on-premises or on any of the hyperscale clouds. This is an essential book for understanding how to bring Kubeflow from experimentation to enterprise-ready.”

David Aronchick, Cofounder of Kubeflow

“A concise guide that covers planning, installing, and managing ML infrastructure across on-premises and cloud. This book provides a sorely needed step-by-step tutorial for using Kubeflow to support notebooks and autoscaled ML pipelines across hybrid cloud setups.”

Lak Lakshmanan, Director of Analytics
and AI Solutions, Google Cloud

“Kubeflow Operations is a great resource that dives deep into the operational aspects of running real-world Kubeflow and Kubernetes clusters. This book also includes best practices for managing Kubernetes security, multitenancy, traffic routing, service mesh, GPUs, autoscaling, and capacity planning.”

Chris Fregly, Developer Advocate, AI and
Machine Learning at AWS

“The Josh Patterson, Michael Katzenellenbogen, and Austin Harris book on Kubeflow should be a valuable roadmap for any data engineer or data scientist who is trying to build a modern data-driven system. TBs/sec data streams and online complex DL/ML-based decision models are becoming mainstream. With the availability of 400 Gb/s NDR InfiniBand networking and PFLOPS CPU/GPU processing power on a single chip, the role of the data scientist is often reduced to assembling available tools and monitoring the whole process rather than actively analyzing data and/or developing models. Data is driving both the feature generation and the model building. This is what this book is about.”

Alex Kozlov, Ph.D., Senior Data Scientist, NVIDIA

“Josh Patterson is a skilled practitioner who has helped many companies deploy and use Kubeflow successfully. He has also been deeply involved in the Kubeflow community for several years, giving him in-depth knowledge of the topic and a unique perspective not present in other Kubeflow guides. It is my pleasure to recommend this book.”

Hamel Husain, Staff Machine Learning Engineer, GitHub

“Kubeflow is a great way to consistently manage MLOps workflows across many clouds (including on-premises). Setting up and managing a hybrid Kubeflow is nontrivial and the authors do a great job at demystifying the whole process of explaining practical issues faced by MLOps engineers, starting from the guts of Kubeflow to deployment and operating in different clouds. This book fills a gap in the MLOps space very nicely and is highly recommended for both MLOps as well as the data scientist persona.”

Debo Dutta, VP Engineering, Nutanix, and Founding Member
and Independent Observer, MLCommons

“Kubeflow is quickly emerging as the open-source MLOps platform of choice in enterprise IT, and this book masterfully covers the ins and outs of Kubeflow operations. It should be required reading for all MLOps engineers.”

Mike Oglesby, MLOps Engineer, NetApp

“Kubeflow is a favored development platform to simplify building and deploying AI capabilities into modern applications that utilize Kubernetes to scale and evolve efficiently. The Kubeflow Operations Guide provides valuable insights for planning, implementing, and operating Kubeflow.”

Zeki Yasar, Principal Solutions Architect,
ePlus Technology, Inc.

“This book provides an exceptional deep dive into the operation of Kubeflow on-premise or via cloud providers. Kubeflow is a vital project in the machine learning engineering ecosystem and this publication provides a missing puzzle piece in the ecosystem: an excellent guide on how to set up and operate your machine learning engineering stack with Kubeflow or how to deploy machine learning models with KFServing effectively. I see this book as the go-to reference for machine learning or DevOps engineers wanting to understand a production Kubeflow setup. I wish the book would have been around when I set up my first clusters running Kubeflow; it would have saved me hours.”

Hannes Hapke, Senior Machine Learning Engineer
at SAP Concur

“This book helped me to fully get my head around all the different parts of the Kubeflow system and understand what role Kubeflow plays in helping build a more reliable and reproducible data science deployment pipeline. From security to Jupyter implementation and on to deployment, this book was the guide that helped me see how the pieces fit together.”

JD Long, RenaissanceRe
Co-author of "R Cookbook 2nd Edition"

“This book is a must-read guide for any DevOps team considering standardizing model deployments. Learn from the best and understand how machine learning works.”

Axel Damian Sirota, Machine Learning Research Engineer

Want Help With Operating Kubeflow in Production?

Patterson Consulting's Managed Kubeflow Platform can help you put a customized Kubeflow system into production. We do a full identity integration (e.g., Active Directory) and customize the platform to meet your security requirements. Then we use our best practices to rapidly onboard new users and accelerate their machine learning workflow production. Finally, you'll enjoy a managed platform without having to worry about scarce MLOps talent, downtime, or a patching schedule.