Red Hat has expanded its collaboration with Amazon Web Services (AWS) to support enterprise-grade generative AI (genAI) on AWS using Red Hat AI and AWS AI hardware. The partnership aims to give organisations the flexibility to run high-performance AI inference at scale, independent of the underlying hardware.
The growth of genAI is prompting organisations to reassess their IT infrastructure.
The collaboration combines Red Hat’s platform capabilities with AWS cloud infrastructure and AI chipsets, including AWS Inferentia2 and AWS Trainium3.
The Red Hat AI Inference Server, powered by vLLM, will run on these chips, providing a common inference layer that supports any genAI model.
According to Red Hat, this setup can deliver 30–40% better price performance than GPU-based Amazon EC2 instances.
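To illustrate what a common inference layer means in practice: vLLM serves models through an OpenAI-compatible HTTP API, so an application can target one endpoint regardless of the accelerator behind it. The sketch below builds such a request using only the Python standard library; the endpoint URL and model name are hypothetical placeholders, not details from the announcement.

```python
import json
from urllib.request import Request, urlopen

# Hypothetical endpoint of a vLLM-based inference server deployment.
ENDPOINT = "http://inference.example.com:8000/v1/chat/completions"

def build_chat_request(model: str, prompt: str, max_tokens: int = 64) -> Request:
    """Build an OpenAI-compatible chat completion request, as served by vLLM."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("example-org/example-model", "Hello")
# urlopen(req) would return the completion when pointed at a live server.
```

Because the API surface stays the same, swapping GPU-backed instances for Inferentia- or Trainium-backed ones is a deployment change rather than an application change.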
In collaboration with AWS, Red Hat has developed an AWS Neuron operator for Red Hat OpenShift, OpenShift AI, and Red Hat OpenShift Service on AWS, providing a supported path for running AI workloads on AWS accelerators.
Red Hat has also released the amazon.ai Certified Ansible Collection for Red Hat Ansible Automation Platform, which helps orchestrate AI services on AWS.
The companies are also contributing upstream to optimise an AWS AI chip plugin for vLLM. As the top commercial contributor to vLLM, Red Hat aims to accelerate both AI inference and training.
vLLM also forms the basis of llm-d, an open source project for scalable AI inference, which is now included in Red Hat OpenShift AI 3.
Featured image credit: Edited by Fintech News Switzerland, based on image by drobotdean via Freepik