New VMware Private AI Foundation with NVIDIA Enables Organizations to Prepare Their Business for Generative AI
VMware Inc. (NYSE: VMW) and NVIDIA (NASDAQ: NVDA) announced the expansion of their strategic partnership to ready the hundreds of thousands of enterprises that run on VMware's cloud infrastructure for the era of generative artificial intelligence (AI).
Through the VMware Private AI Foundation with NVIDIA, organizations will be able to customize models and run generative AI applications, including intelligent chatbots, assistants, search and summarization. The platform will be a fully integrated solution with NVIDIA generative AI and accelerated computing software based on VMware Cloud Foundation and optimized for AI.
“Generative AI and multi-cloud are the perfect combination,” said Raghu Raghuram, CEO of VMware. “Customer data is everywhere: in data centers, at the edge and in clouds. Together with NVIDIA, companies will be able to confidently run generative AI workloads adjacent to their data while addressing their concerns about corporate data privacy, security and control.”
“Organizations around the world are racing to integrate generative AI into their businesses,” said Jensen Huang, founder and CEO of NVIDIA. “Our expanded collaboration with VMware will provide hundreds of thousands of customers in financial services, healthcare, manufacturing and more with the full-stack software and computing they need to unlock the potential of generative AI using custom applications built with their own data.”
Full-stack computing to improve generative AI
To realize business benefits faster, companies are looking to speed the development, testing, and deployment of generative AI applications. McKinsey estimates that generative AI could add up to $4.4 trillion annually to the global economy.
VMware Private AI Foundation with NVIDIA will enable enterprises to capture this value by customizing large language models, producing more secure and private models for internal use, offering generative AI as a service to their users, and running inference workloads at scale with greater security.
The solution is expected to include integrated AI tools that let organizations cost-effectively run proven, trained models on their private data. Built on VMware Cloud Foundation and NVIDIA AI Enterprise software, the platform's expected benefits include:
- Privacy: Will enable customers to easily run AI services adjacent to anywhere they have data, with an architecture that preserves data privacy and enables secure access;
- Choice: Enterprises will have a broad choice of where to build and run their models, from NVIDIA NeMo™ to Llama 2 and beyond, including leading OEM hardware configurations and, in the future, public cloud and service provider offerings;
- Performance: Running on NVIDIA accelerated infrastructure will deliver performance equal to, and in some use cases better than, bare metal, as demonstrated by recent industry benchmarks;
- Data center scaling: GPU scaling optimizations in virtualized environments will enable AI workloads to scale up to 16 vGPUs/GPUs in a single virtual machine and across multiple nodes, accelerating the fine-tuning and deployment of generative AI models;
- Lower cost: Will maximize the use of all computing resources across GPUs, DPUs, and CPUs to reduce overall costs and create a pooled resource environment that can be shared efficiently across teams;
- Accelerated Storage: VMware vSAN's Express Storage Architecture will provide performance-optimized NVMe storage and support for GPUDirect® storage over RDMA, enabling direct transfer of I/O from storage to GPUs without CPU involvement;
- Accelerated Networking: Deep integration between vSphere and NVIDIA NVSwitch™ technology will enable multi-GPU models to run without bottlenecks between GPUs;
- Rapid Deployment and Time to Value: The vSphere Deep Learning virtual machine (VM) images and image repository will enable rapid prototyping capabilities, providing a ready and stable solution image that includes pre-installed performance-optimized frameworks and libraries.
The platform will feature NVIDIA NeMo, a complete cloud-native framework included in NVIDIA AI Enterprise, the operating system of the NVIDIA AI platform, which allows companies to build, customize, and deploy generative AI models virtually anywhere. NeMo combines customization frameworks, guardrail toolkits, data curation tools, and pre-trained models to offer organizations an easy, cost-effective, and fast way to adopt generative AI.
To deploy generative AI in production, NeMo uses NVIDIA TensorRT-LLM (TRT-LLM), which accelerates and optimizes inference performance of the latest LLMs on NVIDIA GPUs. With NeMo, VMware Private AI Foundation with NVIDIA will enable companies to use their own data to build and run custom generative AI models on VMware's hybrid cloud infrastructure.
At VMware Explore 2023 in Las Vegas, NVIDIA and VMware highlighted how enterprise developers can use the new NVIDIA AI Workbench to pull community models such as Llama 2, available on Hugging Face, customize them remotely, and deploy production-grade generative AI across VMware environments.
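The first step of that workflow, pulling a community model such as Llama 2 from Hugging Face for local customization, can be sketched with the open-source `transformers` library. This is a minimal illustration, not part of the announced products: the checkpoint name `meta-llama/Llama-2-7b-chat-hf` is the real (license-gated) Hugging Face repository, and the `build_prompt` helper simply follows Llama 2's published chat template.

```python
# Sketch: pulling Llama 2 from Hugging Face and running local inference.
# Checkpoint access is gated; this assumes you have accepted Meta's license
# and authenticated with `huggingface-cli login`.

def build_prompt(system: str, user: str) -> str:
    """Wrap a system and a user message in Llama 2's chat prompt format."""
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

def generate(prompt: str, model_id: str = "meta-llama/Llama-2-7b-chat-hf") -> str:
    """Load the model and generate a completion (a GPU is needed in practice)."""
    # Heavy imports kept local so the helper above stays dependency-free.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

prompt = build_prompt(
    system="You are a concise internal IT assistant.",
    user="Summarize our VPN onboarding steps.",
)
```

In the scenario the companies describe, the same checkpoint would then be fine-tuned on private data and served from VMware-managed infrastructure rather than from a developer workstation.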
Broad ecosystem support for VMware Private AI Foundation with NVIDIA
VMware Private AI Foundation with NVIDIA will be supported by Dell Technologies, Hewlett Packard Enterprise and Lenovo, which will be among the first to offer systems that power enterprise LLM customization and inference workloads with NVIDIA L40S GPUs, NVIDIA BlueField®-3 DPUs and NVIDIA ConnectX®-7 SmartNICs.
The VMware Private AI Foundation with NVIDIA builds on the companies' decade-long partnership. The co-engineering work optimized VMware's cloud infrastructure to run NVIDIA AI Enterprise with performance comparable to bare metal. Mutual customers further benefit from the management and flexibility of resources and infrastructure provided by VMware Cloud Foundation.
Availability
VMware intends to launch VMware Private AI Foundation with NVIDIA in early 2024.