Volcano Framework

Volcano: Framework for HPC Workloads in Kubernetes

A framework dedicated to running AI/ML and Big Data workloads in Kubernetes

Mohammad Asif Siddiqui

Last updated on Nov 6, 2019 General, Announcements, Tech

Volcano is a system for running high-performance workloads on Kubernetes. It provides a suite of mechanisms that Kubernetes currently lacks and that many classes of high-performance workloads commonly require, including:

Machine learning/Deep learning,
BioInformatics/Genomics, and
Other “big data” applications.

These types of applications typically run on generalized domain frameworks such as TensorFlow, Spark, PyTorch, and MPI, which Volcano integrates with.

Some examples of the mechanisms and features that Volcano adds to Kubernetes are:

Scheduling extensions, e.g.:

Co-scheduling
Fair-share scheduling
Queue scheduling
Preemption and reclaims
Reservations and backfills
Topology-based scheduling

Job management extensions and improvements, e.g.:

Multi-pod jobs
Improved error handling
Indexed jobs
Others (in upstream)

Optimizations for throughput, round-trip latency, etc.

Volcano builds upon a decade and a half of experience running a wide variety of high performance workloads at scale using several systems and platforms, combined with best-of-breed ideas and practices from the open source community.
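
Co-scheduling, the first scheduling extension listed above, means that a multi-pod job starts only when all of its pods can be placed together. The sketch below is a toy, in-memory illustration of that all-or-nothing idea in Python. It is not Volcano's scheduler code: the `Job` class, its fields, the `gang_schedule` function, and the first-fit placement rule are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    min_pods: int      # pods that must start together (hypothetical field)
    cpu_per_pod: int   # CPU cores requested by each pod (hypothetical field)


def gang_schedule(jobs, free_cpu_per_node):
    """Admit each job only if all of its pods fit at once; otherwise place none."""
    free = list(free_cpu_per_node)
    placements = {}
    for job in jobs:
        trial = list(free)   # tentative copy, discarded if the gang does not fit
        assigned = []
        for _ in range(job.min_pods):
            # First-fit: pick the first node with enough spare CPU.
            node = next(
                (i for i, cpu in enumerate(trial) if cpu >= job.cpu_per_pod),
                None,
            )
            if node is None:
                break
            trial[node] -= job.cpu_per_pod
            assigned.append(node)
        if len(assigned) == job.min_pods:
            free = trial                     # commit the whole gang
            placements[job.name] = assigned
        else:
            placements[job.name] = None      # nothing is placed; the job waits
    return placements, free


if __name__ == "__main__":
    jobs = [Job("mpi", min_pods=3, cpu_per_pod=2),
            Job("spark", min_pods=2, cpu_per_pod=4)]
    print(gang_schedule(jobs, [4, 2]))
```

With two nodes offering 4 and 2 free cores, the three-pod `mpi` job fits as a whole and is admitted, while `spark` gets no pods at all instead of a partial placement that would hold resources without making progress.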

Twitter: https://twitter.com/volcano_sh

Slack: https://volcano-sh.slack.com

Website: https://volcano.sh

Documentation: https://volcano.sh/docs/

GitHub: https://github.com/volcano-sh/volcano
