SoftBank’s Infrinia AI Cloud OS for GPU cloud services

Japanese multinational investment holding company SoftBank has launched Infrinia AI Cloud OS, a software suite purpose-built for AI data centers. Developed by the Infrinia team, the suite enables data center operators to provide Kubernetes-as-a-service (KaaS) in multi-tenant settings and to offer inference-as-a-service (Inf-aaS), letting customers access large language models (LLMs) through simple APIs that can be added directly to an operator’s existing GPU cloud offering.
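SoftBank has not published the Infrinia API itself, but "simple APIs" for LLM access typically follow an OpenAI-style chat-completion schema. As a rough illustration only, a request payload to a hypothetical operator endpoint might be built like this (the endpoint, model name, and field names are assumptions, not SoftBank's documented interface):

```python
import json

# Hypothetical endpoint -- illustrative only, not SoftBank's published API.
INFERENCE_URL = "https://gpu-cloud.example.com/v1/chat/completions"

def build_inference_request(prompt: str, model: str = "example-llm") -> dict:
    """Build an OpenAI-style chat-completion payload for a managed
    inference endpoint. The schema here is an assumption."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }

payload = build_inference_request("Summarise today's GPU utilisation report.")
body = json.dumps(payload)  # this JSON body would be POSTed to INFERENCE_URL
```

The point of such an interface is that customers consume inference as a managed service, without provisioning or managing the underlying GPU servers.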

Infrinia Cloud OS meets growing global demands

The software suite is expected to reduce total cost of ownership (TCO) and simplify day-to-day complexity, especially compared to in-house developed options and custom-built assemblies. Ultimately, Infrinia Cloud OS promises to accelerate the deployment of GPU cloud services while supporting every stage of the AI lifecycle, from training models to real-time use.

Initially, SoftBank plans to integrate Infrinia Cloud OS into its existing GPU cloud offerings, with a global rollout to overseas data centers and cloud platforms to follow.

Demand for GPU-powered AI is growing rapidly across industries, from science and robotics to generative AI. As user needs grow more complex, so does the pressure on GPU cloud service providers.

Some users want fully managed systems that abstract away bare-metal GPU servers, while others need affordable AI without managing GPUs directly. Still others are looking for a more advanced setup in which AI model training is centralized and inference runs at the edge.

Infrinia AI Cloud OS was designed to meet these challenges, maximize GPU performance, and simplify the management and deployment of GPU cloud services.

Capabilities of Infrinia Cloud OS

With KaaS capabilities, SoftBank’s latest software suite is able to automate every layer of the underlying infrastructure, from low-level server setup to storage, networking and Kubernetes itself.

It can also reconfigure hardware connections and memory as needed, allowing GPU clusters to be quickly created, modified, or removed to suit different AI workloads. Automated node allocation based on GPU proximity and NVIDIA NVLink domain topology helps reduce latency and improve GPU-to-GPU bandwidth for large distributed workloads. Infrinia’s Inf-aaS component lets users implement inference tasks easily, enabling faster and more scalable AI model inference through managed services.
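SoftBank does not disclose its allocation algorithm, but the idea of topology-aware placement can be sketched simply: prefer nodes that share an NVLink domain so GPU-to-GPU traffic stays on the fast interconnect. The node schema and greedy strategy below are illustrative assumptions, not Infrinia's actual implementation:

```python
from collections import defaultdict

def pick_nodes(nodes: list[dict], gpus_needed: int) -> list[str]:
    """Greedy topology-aware node selection: satisfy the GPU request
    from a single NVLink domain when possible, so inter-GPU traffic
    avoids slower cross-domain links. Node schema
    ({'name', 'nvlink_domain', 'free_gpus'}) is an assumption."""
    by_domain = defaultdict(list)
    for node in nodes:
        by_domain[node["nvlink_domain"]].append(node)

    # Try each domain in turn; take the first that can fill the request.
    for members in by_domain.values():
        if sum(n["free_gpus"] for n in members) >= gpus_needed:
            chosen, remaining = [], gpus_needed
            for n in sorted(members, key=lambda n: -n["free_gpus"]):
                if remaining <= 0:
                    break
                chosen.append(n["name"])
                remaining -= n["free_gpus"]
            return chosen
    raise RuntimeError("no single NVLink domain can satisfy the request")

nodes = [
    {"name": "node-a", "nvlink_domain": "d0", "free_gpus": 4},
    {"name": "node-b", "nvlink_domain": "d0", "free_gpus": 4},
    {"name": "node-c", "nvlink_domain": "d1", "free_gpus": 8},
]
print(pick_nodes(nodes, 8))  # ['node-a', 'node-b'] -- both in domain d0
```

A production scheduler would also weigh NUMA locality, network fabric distance, and fragmentation, but the sketch captures why domain-aware placement reduces latency for distributed workloads.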

By simplifying operational complexities and reducing total cost of ownership, Infrinia AI Cloud OS is positioned to accelerate the adoption of GPU-based AI infrastructure in various sectors around the world.

(Image source: “SoftBank.” by MIKI Yoshihito. (#mikiyoshihito) is licensed under CC BY 2.0. )

Want to learn more about Cloud Computing from industry leaders? Check out the Cyber Security & Cloud Expo, taking place in Amsterdam, California, and London. The comprehensive event is part of TechEx and is co-located with other leading technology events.

CloudTech News is powered by TechForge Media. Explore other upcoming business technology events and webinars here.
