SambaNova and Intel have extended their collaboration with a heterogeneous hardware solution that “combines GPUs for prefill, Intel Xeon 6 processors as host and ‘action’ CPUs, and SambaNova RDUs (reconfigurable dataflow units) for decode,” according to a press release.
The companies claim that assigning each stage of inference to the hardware best suited to it yields higher-quality, faster AI responses that can support agentic workloads at scale.
Rodrigo Liang, CEO and co‑founder of SambaNova Systems, said: “Agentic AI is moving into production – and the winning pattern we’re seeing is GPUs to start the job, Intel Xeon 6 to run it, and SambaNova RDUs to finish it fast. Together with Intel, we’re giving customers a blueprint they can deploy in existing air‑cooled data centres, with x86 coverage for the coding agents and tools they already use today.”
According to Kevork Kechichian, executive vice president and general manager of the Data Centre Group at Intel, future workloads will “require a heterogeneous mix of computing. The data centre software ecosystem is built on x86, and it runs on Xeon.”
Banghua Zhu, co-founder and CTO at AI infrastructure startup RadixArk, said: “Production inference is moving toward heterogeneous hardware – no single chip type is optimal for every stage of an agentic workflow.”
The architecture has been engineered jointly by the two companies, built around Intel Xeon 6 processors and SambaNova RDUs. The SN50 RDU, SambaNova’s fifth-generation AI inference processor, was designed to “transform the tokenomics of inference”, delivering “high-throughput, low-latency decode for large language models,” the company states. The Xeon 6 chip supplies memory bandwidth, on-die accelerators, and PCIe lane density.
In SambaNova’s own testing, Intel Xeon 6 processors performed up to 50% faster than Arm-based server CPUs and delivered up to 70% higher performance in vector database operations, the company says.
Xeon 6 acts as the host CPU and the system control plane, managing agentic task coordination, tool and API execution, system-level behaviour, and workload distribution. It also gathers and executes code, and checks whether proposed actions can be trusted.
Current GPU-only architectures need specialised data centres with liquid cooling and custom power infrastructure, because the installations generate large amounts of heat and draw large amounts of power. In contrast, the SambaNova and Intel solution will run in ‘standard’ data centres, without a need for infrastructure upgrades.
