Recently, Rafay Systems and Netris announced a strategic partnership to accelerate the adoption of GPU cloud infrastructure and rapidly scale AI capabilities.
GPU platforms available through partnership
Rafay, a provider of Platform-as-a-Service capabilities for self-service compute consumption, and Netris, a networking automation provider, will collaborate to accelerate the consumption and monetization of GPU-based infrastructure with self-service workflows for model training, fine-tuning, and inference use cases.
This collaboration will allow GPU cloud providers to transform raw GPU hardware into fully operational, enterprise-ready cloud platforms with self-service capabilities in weeks. This will reduce the time-to-market and eliminate the complexity of building AI infrastructure.
“Many first-generation GPU clouds still require human-driven backend processes to meet customer needs,” said Haseeb Budhani, CEO and co-founder of Rafay Systems. “Our partnership with Netris removes the traditional barriers of manual provisioning across the software and hardware layers of the AI infrastructure stack, allowing enterprises to instantly access and utilize GPU resources without the error-prone, human-in-the-loop processes that result in a less-than-ideal experience for consumers of AI infrastructure.”
By combining Rafay’s enterprise-grade workflow management, virtualization, and substrate management with Netris’ advanced network lifecycle automation, abstraction, and multi-tenancy, the organizations will create secure, isolated computing environments where multiple enterprises can share GPU cloud infrastructure without compromising performance, security, or data privacy.
“Big Three cloud providers invested a decade of top talent’s time to develop the proprietary software for infrastructure automation and multi-tenancy that enables their cloud offerings. Netris has built the same for everyone else covering the networking piece, while Rafay did the same for the compute piece,” said Alex Saroyan, CEO and co-founder of Netris. “Rafay and Netris together become a perfect turnkey solution for new and upcoming generations of GPU-based AI cloud providers.”
Benefits of the partnership
The benefits that GPU cloud providers can attain through this partnership include:
- SKU automation and management: Programmatically define SKUs consisting of GPUs, CPUs, AI applications, or a combination thereof.
- Self-service portals for developers and data scientists: Cloud providers can provide self-service portals for developers and data scientists to consume compute and AI applications on demand.
- Enterprise-grade user management: Cloud providers can support enterprise single sign-on (SSO) and role-based access control (RBAC) for secure consumption and deep audit trails that can be exported to enterprise SIEMs.
- Enterprise administration: Cloud providers can sell blocks of compute to enterprises and give them governance over their allocated compute blocks through persona-specific configuration management portals and dashboards.
- Kubernetes cluster lifecycle management & platform management: Easily manage Kubernetes clusters in data centers or public cloud environments. Customers can also deliver secure, multi-tenant environments to meet enterprise security requirements through features like virtual clusters, network segmentation, RBAC, secure remote access, policy enforcement, quota enforcement, and immutable auditing.
- Virtual machine provisioning and lifecycle management: Cloud providers can manage several virtual machines in scenarios where customers prefer to use off-the-shelf AI applications that require virtual machines as the substrate.
- Usage and chargeback data: Gain turnkey access to chargeback data, which can be easily integrated into billing systems for post-paid use cases.
- Underlay (network-level) automation: This feature supports customers who require many GPUs on demand by programmatically configuring the underlying networking layer to ensure hardware-level multi-tenancy and high performance.
- Network lifecycle management and automation: Streamline operations and management of the underlying east-west and north-south switch fabrics, gaining guaranteed stability and scalability. This allows cloud providers to cut time to market by eliminating guesswork and in-house development, leveraging built-in automation of NVIDIA networking guidelines and rigorously tested best practices.
- Cloud networking functions: Offer the essential cloud networking functions that end users expect from a competitive cloud offering, including Internet gateways, NAT gateways, Elastic Load Balancers, and Direct Connect, among others.
- Support: Gain 24/7/365 support from both Rafay and Netris to handle issues and ongoing questions. Each organization has 6+ years of experience supporting live cloud provider customers based on NVIDIA and other networking technologies.