Latest Podcasts

Scaling Llama-3 Training on 512 H100 GPUs
Learn how we tuned NCCL parameters and cluster topology to achieve linear scaling.

Why we switched to k0s for our control plane
A deep dive into why we chose k0s for its minimalist footprint and high-availability architecture.

Innovativ AI: A Customer Story
How a mid-sized research lab cut its training costs by 45% using Klabbble’s automated cluster scaling.