Multinational technology company NVIDIA recently announced that its NeMo microservices are now generally available to help enterprise IT teams build AI teammates that use data flywheels to scale employee productivity. The microservices provide an end-to-end platform for building AI agents whose data flywheels draw on human and AI feedback and are informed by inference and business data.
“Our view is that AI teammates will be helping over a billion knowledge workers across businesses, industries, geographies, and languages get more work done,” said Joey Conway, Senior Director of Generative AI Software for Enterprise at NVIDIA. “Our partner ecosystem has already started with these AI teammates in production.”
Using a data flywheel allows organizations to onboard AI agents as digital teammates that tap into user interactions and AI-generated data created during inference to continuously improve model performance, thus turning usage into insight and insight into action.
The key microservices now generally available include:
- The NeMo Customizer: Accelerates LLM fine-tuning and delivers up to 1.8x higher training throughput using popular post-training techniques including supervised fine-tuning and low-rank adaptation.
- The NeMo Evaluator: Simplifies the evaluation of AI models and workflows on custom and industry benchmarks with five application programming interface calls.
- The NeMo Guardrails: Improves compliance protection up to 1.4x with just half a second of additional latency, helping IT teams implement robust safety and security measures that align with organizational policies and guidelines.
These microservices can be used alongside NeMo Retriever and NeMo Curator to simplify building, optimizing, and scaling AI agents through custom enterprise data flywheels.
Additionally, organizations can build data flywheels with NeMo Retriever microservices using NVIDIA AI Data Platform offerings from NVIDIA-Certified Storage partners such as DDN, Dell Technologies, Hewlett Packard Enterprise, Hitachi Vantara, IBM, NetApp, Nutanix, Pure Storage, VAST Data, and WEKA.
NeMo microservices helping partners grow
Partners like AT&T and Cisco have seen significant improvements with NeMo microservices, including a 40 percent accuracy boost for AI agents at AT&T. Cisco’s Outshift team, partnering with Galileo, also used the microservices to power a coding assistant that makes 40 percent fewer tool selection errors and achieves up to 10 times faster response times.
“NVIDIA partners and industry pioneers are using NeMo microservices to build responsive AI agent platforms so that digital teammates can help get more done,” NVIDIA said. “Working with Arize and Quantiphi, AT&T has built an advanced AI-powered agent using NVIDIA NeMo, designed to process a knowledge base of nearly 10,000 documents, refreshed weekly. The scalable, high-performance AI agent is fine-tuned for three key business priorities: speed, cost efficiency, and accuracy, which are all increasingly critical as adoption scales.”
Additionally, Nasdaq is boosting its Nasdaq GenAI Platform with NeMo Retriever microservices and NVIDIA NIM microservices. NeMo Retriever enhanced the platform’s search capabilities, improving accuracy and response times by up to 30 percent and reducing costs.
“Our view is that every agent or teammate will need a data flywheel,” Conway said. “It helps them add capabilities and skills by learning from business knowledge and customer interactions. NeMo microservices is the most efficient and easiest way to bring this data flywheel to the AI teammates and agents, and scale the AI workforce productivity.”
Model and partner ecosystem support for NeMo microservices
These microservices support a broad range of open models, such as Llama, Microsoft Phi small language models, Google Gemma, Mistral, and Llama Nemotron Ultra. Meta is currently using NVIDIA NeMo microservices through new connectors for Meta Llamastack.
“With Llamastack integration, agent builders can implement data flywheels powered by NeMo microservices,” said Raghotham Murthy, Software Engineer, GenAI at Meta. “This allows them to continuously optimize models to improve accuracy, boost efficiency, and reduce TCO.”
Other software providers, including Cloudera, Datadog, Dataiku, DataRobot, DataStax, SuperAnnotate, and Weights & Biases, have integrated NeMo microservices into their platforms, and developers can use NeMo microservices in popular AI frameworks such as CrewAI, Haystack by deepset, LangChain, LlamaIndex, and Llamastack.
NVIDIA has been committed to bringing agentic AI to more enterprises and regulated industries. Read more about the organization’s partnership with Google Cloud to develop secure, on-prem AI with Gemini models.