Why Tzafon Chose Google Cloud for Next-Gen Agentic AI

Tzafon is a research firm building scalable compute systems and advancing machine intelligence, with offices in San Francisco, Stockholm, and Tel Aviv. Our vision is to expand the lightcone of consciousness by advancing the frontiers of machine intelligence. Our team of engineers and scientists, featuring IOI and IMO medalists, PhDs, and alumni from leading tech companies, builds infrastructure and trains models for swarms of agents designed to automate work across real-world environments.

When we founded Tzafon, our goal was clear: empower customers to delegate entire workflows, not just isolated tasks, to swarms of autonomous AI agents. After rigorous evaluation, Google Cloud emerged as our ideal partner. Here's why:

1. AI-Optimized Infrastructure

Our multi-agent models demand intense compute, routinely consuming petaflop-hours before lunch. Google Cloud provides elastic access to thousands of GPUs within minutes through its workload scheduler, which aligns with our rapid research cycles.

2. Kubernetes Tailored for Machine Learning

We run complex workflows, from data ingestion and reward-model training to reinforcement-learning deployments, entirely on GKE Autopilot. GKE manages GPU resources and network placement for us, reducing operational complexity far below the other solutions we tested.
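To illustrate, a training step in this kind of setup can be expressed as a standard Kubernetes Job: on GKE Autopilot, GPU nodes are provisioned automatically from the pod's node selectors and resource requests. This is a minimal sketch, not our production config; the job name, image path, and GPU type below are placeholders.

```yaml
# Hypothetical GKE Autopilot Job requesting one GPU. Autopilot reads the
# node selector and resource requests and provisions a matching node.
apiVersion: batch/v1
kind: Job
metadata:
  name: reward-model-train        # placeholder name
spec:
  template:
    spec:
      nodeSelector:
        cloud.google.com/gke-accelerator: nvidia-l4   # GPU type (placeholder)
      containers:
      - name: trainer
        image: us-docker.pkg.dev/example-project/train/reward-model:latest  # placeholder image
        resources:
          limits:
            nvidia.com/gpu: "1"
            cpu: "8"
            memory: 32Gi
      restartPolicy: Never
```

Because Autopilot owns node provisioning, the spec above is the whole operational surface: there is no node pool to size or GPU driver to install by hand.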

3. Exabyte-Scale Data Management

Our agents continuously learn from petabytes of interaction traces and telemetry. BigQuery's serverless architecture absorbs this influx in real time, without interruption. Additionally, BigQuery Omni queries datasets in AWS and Azure in place, without egress fees or schema headaches.

4. Startup-Speed Partnership

Google Cloud’s Startup Success team provided exceptional support: credits, strategic introductions, and a direct line to product teams. When we needed specialized GKE node pools backed by ultra-fast local SSD storage, Google delivered a custom solution in days.

Looking Forward

Over the next year, we'll train large-scale models on Google Cloud and use them to serve all of our users, running entirely on GCP.