Scaling AI at Inference: The Road to Agent-Driven ROI

Roman Chernin joins Patrick Moorhead and Daniel Newman to discuss how AI infrastructure is shifting from training to inference, why Nebius built Token Factory to optimize system-level performance, and how agent-driven ROI will define AI success in 2026 and beyond.

AI has moved beyond model training; inference is the new frontier.

This Six Five Webcast features Patrick Moorhead and Daniel Newman, joined by Roman Chernin, Co-founder & Chief Business Officer at Nebius, to explore how AI infrastructure is evolving from massive training clusters to production-grade inference systems built for agents, open-source models, and real ROI.

Nebius positions itself as an AI-specialized cloud, purpose-built to optimize inference workloads at scale. As AI shifts from research labs to product companies and enterprise agents, performance, cost efficiency, and system-level orchestration have become the defining battleground.

Key Takeaways:

🔹 The shift from training to inference: Why budgets, architectures, and customer priorities are changing.

🔹 The Nebius Token Factory: How full-stack optimization across hardware, software, and orchestration improves unit economics.

🔹 Open-source in the enterprise: Why flexibility, tunability, and cost control matter as much as frontier intelligence.

🔹 Agent-driven ROI: Why 2026 will demand measurable business outcomes, not just model benchmarks.

🔹 Performance beyond GPUs: How CPUs, workload orchestration, caching, quantization, and stack optimization combine to define success.

Nebius combines next-generation silicon access with a purpose-built cloud stack and white-glove technical support to help customers ship AI products that are fast, affordable, and compliant at scale.

The next phase of AI won’t be defined by a model; it will be defined by who can run inference most efficiently.

To learn more about how Nebius is scaling AI for real-world inference and agent-driven ROI, read about it here and explore the full solution: HERE

Watch the full webcast at sixfivemedia.com or subscribe to our YouTube channel so you never miss an episode.

Disclaimer: Six Five Media is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded, and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors, and we ask that you do not treat us as such.

